<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Decoding Digital Anomalies</title>
  
  <subtitle>Sometimes the feature is the bug in the digital rabbit hole, and vice versa</subtitle>
  <link href="https://neo01.com/atom.xml" rel="self"/>
  
  <link href="https://neo01.com/"/>
  <updated>2026-04-01T16:29:39.547Z</updated>
  <id>https://neo01.com/</id>
  
  <author>
    <name>Neo Alienson</name>
    
  </author>
  
  <generator uri="https://hexo.io/">Hexo</generator>
  
  <entry>
    <title>你的文档计时炸弹：为什么 2029 年会迫使您改变（为什么您会感谢我们）</title>
    <link href="https://neo01.com/zh-CN/2026/04/Document-as-Code-Why-Markdown-Git-Confluence-Word/"/>
    <id>https://neo01.com/zh-CN/2026/04/Document-as-Code-Why-Markdown-Git-Confluence-Word/</id>
    <published>2026-03-31T16:00:00.000Z</published>
    <updated>2026-04-01T16:29:39.547Z</updated>
    
    <content type="html"><![CDATA[<p>想象一下：凌晨三点。您的生产系统发生严重事故。值班工程师抓起笔记本电脑，打开操作手册……然后遇到登录画面。没有网络。没有 VPN。没有访问权限。</p><p>与此同时，操作程序的「最终」版本躺在某人桌面上一份 2023 年的 Word 文件里。Confluence 页面？过期了。Wiki？没人知道密码。</p><p>这不是假设性的噩梦。对于数千个仍然将文档视为<strong>目的地</strong>而非<strong>交付物</strong>的团队来说，这是家常便饭。</p><p>令人不安的事实是：<strong>您的文档策略是单一故障点。</strong> 而在 2029 年，当 Atlassian 终止 Confluence 本地版（Data Center）时，这个故障点将无可回避。</p><p>但有一条出路。它叫做<strong>文档即代码</strong>（Document as Code，DaC）。不，这不仅仅是「用 Markdown 写作」。这是团队思考知识方式的根本转变。</p><hr /><h2 id="1-问题：文档坟墓">1 问题：文档坟墓</h2><p>让我们直面那个显而易见却没人愿意提的问题。</p><h3 id="Word-文档墓地">Word 文档墓地</h3><pre class="language-none"><code class="language-none">📁 共享硬盘&#x2F;
  📁 Operations&#x2F;
    📄 Runbook_FINAL.docx
    📄 Runbook_FINAL_v2.docx
    📄 Runbook_FINAL_v2_UPDATED.docx
    📄 Runbook_FINAL_v3_ACTUAL_FINAL.docx
    📄 Runbook_FINAL_v3_ACTUAL_FINAL_REALLY.docx</code></pre><p><strong>现实情况：</strong></p><ul><li>没人知道哪个版本是权威的</li><li>变更需要「追踪修订」和电子邮件往来</li><li>搜索？祝你好运</li><li>访问控制？要么每个人都有，要么完全没有</li></ul><h3 id="Confluence-陷阱">Confluence 陷阱</h3><p>Confluence 承诺提供有组织、可搜索的知识。但它带来的是：</p><table><thead><tr><th>问题</th><th>影响</th></tr></thead><tbody><tr><td><strong>厂商锁定</strong></td><td>您的知识存放在专有格式中</td></tr><tr><td><strong>需要登录</strong></td><td>没有网络和凭证就无法阅读文档</td></tr><tr><td><strong>搜索……只能碰运气</strong></td><td>找到正确的页面感觉像考古</td></tr><tr><td><strong>2029 年终止</strong></td><td>本地版结束支持。要么 SaaS，要么什么都没有。</td></tr></tbody></table><div class="admonition warning"><p class="admonition-title"><span class="mdi mdi-alert-outline admonition-icon"></span>⚠️ 2029 年最后期限</p><div class="admonition-content"><p>Atlassian 已宣布<strong>Confluence Data Center 将于 2029 年 3 月 28 日终止</strong>。之后：</p><ul><li>授权过期，环境变为<strong>只读</strong></li><li>没有安全补丁或错误修正</li><li>没有技术支持</li><li><strong>您可以查看数据但无法编辑或新增内容</strong></li><li>强烈不建议让只读环境继续连接互联网运行（不再有安全更新）</li></ul><p><strong>时间表：</strong></p><ul><li><strong>2026 年 3 月 30 日</strong>：新客户无法再购买 Data Center 订阅</li><li><strong>2028 年 3 月 30 日</strong>：现有客户无法再购买新订阅或扩容</li><li><strong>2029 年 3 月 28 日</strong>：所有 Data 
Center 订阅过期</li></ul><p>对于有合规性、数据主权要求或运行气隙环境的企业来说，这不是升级——这是最后通牒。延长维护或许可以作为例外争取，但必须直接与 Atlassian 协商。</p></div></div><h3 id="Wiki-的狂野西部">Wiki 的狂野西部</h3><p>Wiki 始于「每个人都可以编辑！」，最终变成了「没人对它负责。」</p><pre class="language-none"><code class="language-none">🌐 内部 Wiki
  ├── 📄 入门指南（最后更新：2021 年）
  ├── 📄 架构概览（图片损坏）
  ├── 📄 值班程序（密码：？？？）
  └── 📄 [404 页面未找到]</code></pre><p><strong>模式：</strong> 这三种方法都有同一个致命缺陷——<strong>文档与工作分离</strong>。</p><hr /><h2 id="2-什么是文档即代码？">2 什么是文档即代码？</h2><p><strong>文档即代码</strong>将文档视为软件：</p><table><thead><tr><th>软件开发</th><th>文档即代码</th></tr></thead><tbody><tr><td>代码在 Git 中</td><td>文档在 Git 中</td></tr><tr><td>变更的拉取请求</td><td>编辑的拉取请求</td></tr><tr><td>代码审查</td><td>内容审查</td></tr><tr><td>CI/CD 管线</td><td>构建与部署管线</td></tr><tr><td>版本标签</td><td>发布版本</td></tr><tr><td>回滚能力</td><td>完整历史记录，即时还原</td></tr></tbody></table><p>但 DaC 与「仅使用 Markdown」的不同之处在于：</p><h3 id="不仅仅是-Markdown。是-Git。">不仅仅是 Markdown。是 Git。</h3><pre class="language-none"><code class="language-none">❌ 「我们使用 Markdown」→ 共享硬盘上的文件
✅ 「我们使用文档即代码」→ 基于 Git 的工作流程，具有版本控制</code></pre><h3 id="什么是-Git？（给非技术读者）">什么是 Git？（给非技术读者）</h3><p><strong>Git</strong> 是一个随时间追踪文件变更的工具。把它想象成<strong>文档的时光机</strong>。</p><p>每次您提交（保存）变更时，Git 都会拍摄快照。您以后可以回到任何快照：昨天的版本、上周的，甚至一年前的。没有什么会丢失。</p><h3 id="为什么-Git-被建立">为什么 Git 被建立</h3><pre class="language-none"><code class="language-none">问题（Git 之前）：
  👤 人员 A：「我正在编辑文件！」
  👤 人员 B：「我也是！」
  → 两人都保存 → 一个人的变更丢失 😱
解决方案（使用 Git）：
  👤 人员 A：「我在自己的副本上编辑」
  👤 人员 B：「我也在自己的副本上编辑」
  → 两人都完成 → Git 安全地合并变更 ✅</code></pre><h3 id="现实类比">现实类比</h3><p><strong>Google 文档版本历史</strong> 的工作原理类似——每次保存都会记录谁变更了什么。Git 也这样做，但有三个关键差异：</p><ol><li><strong>离线运作</strong> — 不需要互联网</li><li><strong>电脑上的完整副本</strong> — 每个版本，永远</li><li><strong>没有厂商锁定</strong> — 您的数据保持属于您</li></ol><h3 id="这对您意味着什么">这对您意味着什么</h3><ul><li><strong>没有互联网？</strong> 没问题。一切都是本地的。</li><li><strong>服务器宕机？</strong> 您有完整的备份。</li><li><strong>厂商消失？</strong> 您的数据是您的。</li><li><strong>犯了错误？</strong> 即时还原到任何时间点。</li></ul><h3 id="Git-很难学吗？">Git 很难学吗？</h3><p>对开发者来说，Git 
是日常工作——他们早已熟悉。</p><p>对非技术用户来说，您不需要学习 Git 命令。现代工具（GitHub Web UI、AI 助理）会替您处理复杂的部分。您只需编辑——Git 在后台运作。</p><h3 id="安全超能力：Git-就像文档的区块链">安全超能力：Git 就像文档的区块链</h3><p>这是强大的部分：<strong>Git 使用密码学使历史记录防篡改</strong>。</p><p>每次变更都会获得一个独特的指纹（称为「哈希」）。这个指纹是根据以下内容计算的：</p><ul><li>您变更的内容</li><li>上次变更的指纹</li><li>谁进行了变更以及何时</li></ul><p>这创建了一个<strong>指纹链</strong>——就像区块链一样。如果有人试图篡改历史记录（比如删除谁批准变更的证据），此后所有变更的指纹都会改变，与其他人手中的副本不再匹配。篡改行为<strong>会在下一次同步或比对时暴露</strong>。</p><p>Confluence 和 Word 无法做到这一点。它们的日志可以由管理员修改。Git 的历史记录<strong>无法被静默更改</strong>。</p><h3 id="提交签署：变更的数字签章">提交签署：变更的数字签章</h3><p>Git 有一个更强大的安全功能：<strong>提交签署</strong>。</p><p>每个人都持有一把<strong>个人签名密钥</strong>（GPG 密钥、SSH 密钥或 X.509 证书，就像数字身份证）。当您提交变更时，Git 会用您的私钥签署它。签名证明：<em>「这个变更来自我，我批准它。」</em></p><p><strong>现实类比：</strong></p><pre class="language-none"><code class="language-none">传统 Git 提交：
  👤 「John 批准了此变更」
  → 您只能相信系统记录无误
签署的 Git 提交：
  👤 「John 批准了此变更」✍️ [数字签署]
  → 密码学证明 John 批准了它
  → John 的公钥验证签名
  → 没有 John 的私钥就无法伪造</code></pre><p><strong>为什么签署很重要：</strong></p><ul><li><strong>防止冒充</strong> — 没人能假装是您</li><li><strong>证据效力</strong> — 签署的提交可作为「谁批准了变更」的有力佐证（法律效力取决于所在司法管辖区）</li><li><strong>供应链安全</strong> — 确切知道谁批准了每次变更</li><li><strong>合规性</strong> — 某些受监管产业需要</li></ul><p><strong>实际情况：</strong></p><pre class="language-none"><code class="language-none">✅ 已验证提交 abc123 由 John Doe (john@company.com)
⚠️ 未验证提交 def456 由 unknown@example.com</code></pre><p>GitHub 和 GitLab 在签署的提交上显示绿色「Verified」徽章。如果有人试图伪造您的提交，签名将不匹配——立即暴露。</p><h3 id="Git-的差异">Git 的差异</h3><p><strong>纯 Markdown 文件（没有 Git）：</strong></p><ul><li>版本历史记录：仅文件时间戳</li><li>协作：覆盖冲突</li><li>离线访问：是（本地文件）</li><li>审核追踪：手动记录</li><li>回滚：「有人有旧版本吗？」</li><li>分布式：否（集中式文件服务器）</li></ul><p><strong>使用 Git 的文档即代码：</strong></p><ul><li>版本历史记录：每次变更都被追踪，谁/何时/为什么</li><li>协作：分支、合并、解决冲突</li><li>离线访问：是（完整的 repo 克隆）</li><li>审核追踪：不可变的提交历史记录（密码学保护）</li><li>回滚：<code>git revert</code> — 即时恢复</li><li>分布式：是（每个克隆都是完整的备份）</li></ul><h3 id="关键洞察">关键洞察</h3><p>Git 
本质上是<strong>去中心化的</strong>。每个开发者都有文档存储库的完整副本。这意味着：</p><ul><li>没有单一故障点</li><li>离线运作（对沙盒/气隙环境至关重要）</li><li>无需登录即可阅读</li><li>没有厂商能挟持您的知识</li></ul><p>现在我们了解了基础，让我们探讨为什么这种架构在系统故障时很重要。</p><hr /><h2 id="3-操作手册测试：凌晨三点会发生什么？">3 操作手册测试：凌晨三点会发生什么？</h2><p>让我们用文档即代码重播我们的开场情境。</p><p><strong>凌晨三点。生产事故。无法访问互联网（沙盒环境）。</strong></p><h3 id="使用传统文档：">使用传统文档：</h3><pre class="language-none"><code class="language-none">工程师：「让我检查操作手册……」  ↓开启浏览器 → Confluence 登录 → 没有网络  ↓致电队友 → 「Wiki 密码是什么？」  ↓队友：「我想在 LastPass 里……」  ↓LastPass → 没有网络 → 无法同步  ↓[事故升级，同时寻找凭证]</code></pre><p><strong>解决时间：</strong> 45 分钟（包括 38 分钟寻找文档）</p><h3 id="使用文档即代码：">使用文档即代码：</h3><pre class="language-none"><code class="language-none">工程师：「让我检查操作手册……」  ↓开启终端机 → &#96;cd runbooks&#96; → 已经本地克隆  ↓&#96;grep &quot;database failover&quot; *.md&#96; → 即时搜索  ↓遵循程序 → 系统恢复  ↓提交事故笔记 → &#96;git commit -m &quot;Incident #2026-0329&quot;&#96;</code></pre><p><strong>解决时间：</strong> 7 分钟（全部用于修复问题）</p><div class="admonition question"><p class="admonition-title"><span class="mdi mdi-comment-question-outline admonition-icon"></span>🤔 为什么离线很重要？</p><div class="admonition-content"><p>您可能会想：<em>「我们一直有互联网。这不会发生在我们身上。」</em></p><p>考虑这些情境：</p><ul><li><strong>安全事故</strong> → 调查期间限制网络访问</li><li><strong>云端停机</strong> → 您的文档在云端……停机了</li><li><strong>气隙环境</strong> → 政府、金融、医疗保健沙盒</li><li><strong>旅行</strong> → 飞行模式、饭店 WiFi 不佳、国际漫游</li><li><strong>灾难恢复</strong> → 当一切都坏了，包括互联网</li></ul><p><strong>原则：</strong> 关键文档应该在您最需要的时候运作——而不是在条件理想时。</p></div></div><p>看到了 DaC 在压力下的表现，让我们检查使其值得采用的实际优势。</p><hr /><h2 id="4-为什么文档即代码获胜">4 为什么文档即代码获胜</h2><h3 id="1-导出为任何格式">1. 
导出为任何格式</h3><p>您的利益相关者想要 Word？PDF？Confluence？没问题。</p><p><strong>自动化运作方式：</strong></p><pre class="language-none"><code class="language-none">您将变更保存到 Git
        ↓
自动化检测变更
        ↓
构建 PDF 版本
        ↓
构建 Word 版本
        ↓
更新网站
        ↓
（可选）同步到 Confluence
        ↓
完成 — 所有格式自动更新</code></pre><p><strong>这取代了什么：</strong></p><table><thead><tr><th>手动流程</th><th>自动化流程</th></tr></thead><tbody><tr><td>开启文档 → 导出为 PDF → 保存</td><td>一次保存触发所有操作</td></tr><tr><td>开启文档 → 导出为 Word → 电子邮件</td><td>文件自动生成和保存</td></tr><tr><td>登录 Confluence → 复制/粘贴 → 发布</td><td>同步在后台发生</td></tr><tr><td>每次变更都要重复一遍</td><td>一致执行，不会遗忘</td></tr></tbody></table><p><strong>实际自动化配置如下：</strong></p><pre class="language-yaml" data-language="yaml"><code class="language-yaml"><span class="token comment"># CI/CD 管线构建多种格式</span>
<span class="token key atrule">on</span><span class="token punctuation">:</span>
  <span class="token key atrule">push</span><span class="token punctuation">:</span>
    <span class="token key atrule">branches</span><span class="token punctuation">:</span> <span class="token punctuation">[</span>main<span class="token punctuation">]</span>
<span class="token key atrule">jobs</span><span class="token punctuation">:</span>
  <span class="token key atrule">build-docs</span><span class="token punctuation">:</span>
    <span class="token key atrule">runs-on</span><span class="token punctuation">:</span> ubuntu<span class="token punctuation">-</span>latest
    <span class="token key atrule">steps</span><span class="token punctuation">:</span>
      <span class="token punctuation">-</span> <span class="token key atrule">uses</span><span class="token punctuation">:</span> actions/checkout@v4
      <span class="token comment"># 转换为 PDF（pandoc/core 镜像不含 PDF 引擎，因此改用自带 LaTeX 的 pandoc/latex）</span>
      <span class="token punctuation">-</span> <span class="token key atrule">name</span><span class="token punctuation">:</span> Build PDF
        <span class="token key atrule">uses</span><span class="token punctuation">:</span> docker<span class="token punctuation">:</span>//pandoc/latex
        <span class="token key atrule">with</span><span class="token punctuation">:</span>
          <span class="token key atrule">args</span><span class="token punctuation">:</span> runbook.md <span class="token punctuation">-</span>o runbook.pdf
      <span class="token comment"># 转换为 Word（给坚持的利益相关者）</span>
      <span class="token punctuation">-</span> <span class="token key atrule">name</span><span class="token punctuation">:</span> Build Word
        <span class="token key atrule">uses</span><span class="token punctuation">:</span> docker<span class="token punctuation">:</span>//pandoc/core
        <span class="token key atrule">with</span><span class="token punctuation">:</span>
          <span class="token key atrule">args</span><span class="token punctuation">:</span> runbook.md <span class="token punctuation">-</span>o runbook.docx
      <span class="token comment"># 同步到 Confluence（给还没准备好放弃的团队；镜像名为占位符，请换成您实际使用的发布工具）</span>
      <span class="token punctuation">-</span> <span class="token key atrule">name</span><span class="token punctuation">:</span> Sync to Confluence
        <span class="token key atrule">uses</span><span class="token punctuation">:</span> docker<span class="token punctuation">:</span>//confluence<span class="token punctuation">-</span>publisher
        <span class="token key atrule">with</span><span class="token punctuation">:</span>
          <span class="token key atrule">args</span><span class="token punctuation">:</span> <span class="token punctuation">-</span><span class="token punctuation">-</span>source ./docs <span class="token punctuation">-</span><span class="token punctuation">-</span>space OPS</code></pre><p><strong>什么是 CI/CD？</strong> 这是每当您的文档变更时执行的自动化。把它想象成机器人助理：您将变更保存到 Git，它会自动构建 PDF、Word 文档和网站——无需手动步骤。</p><p><strong>魔法：</strong> 
写一次（Markdown），发布到各处（PDF、Word、HTML、Confluence）。</p><table><thead><tr><th>格式</th><th>使用案例</th></tr></thead><tbody><tr><td><strong>Markdown（源代码）</strong></td><td>作者、版本控制、差异比较</td></tr><tr><td><strong>PDF</strong></td><td>正式报告、合规提交</td></tr><tr><td><strong>Word</strong></td><td>需要追踪修订的利益相关者</td></tr><tr><td><strong>HTML</strong></td><td>内部文档网站</td></tr><tr><td><strong>Confluence</strong></td><td>仍在迁移的团队（临时桥梁）</td></tr></tbody></table><hr /><h3 id="2-可扫描的安全性">2. 可扫描的安全性</h3><p>传统文档是<strong>安全盲点</strong>：</p><pre class="language-none"><code class="language-none">🔍 安全团队：「我们可以扫描 Word 文档是否有机密吗？」👤 IT 管理员：「它们在文件服务器上。我们需要……」🔍 安全团队：「那 Confluence 呢？」👤 IT 管理员：「那是 SaaS。您需要 API 访问和……」🔍 安全团队：*叹息*</code></pre><p>文档即代码是<strong>安全透明</strong>的：</p><p><strong>范例：扫描意外机密</strong></p><pre class="language-bash" data-language="bash"><code class="language-bash"><span class="token comment"># 扫描所有文档是否泄漏的密码或 API 密钥</span>$ gitleaks detect <span class="token parameter variable">--source</span> ./docs --report-path secrets.json<span class="token comment"># 搜索常见的敏感模式</span>$ <span class="token function">grep</span> <span class="token parameter variable">-r</span> <span class="token string">"password\|api_key\|secret"</span> ./docs/*.md<span class="token comment"># 在保存变更前自动执行检查</span>$ pre-commit run --all-files</code></pre><p><strong>这些命令的作用：</strong> 第一个命令执行安全性扫描器，寻找密码、API 密钥和令牌。第二个搜索常见的敏感字词。第三个在每次有人尝试保存变更时自动执行——在保存之前封锁任何可疑内容。</p><p><strong>会抓到什么：</strong></p><table><thead><tr><th>风险</th><th>检测方法</th></tr></thead><tbody><tr><td>意外 API 密钥</td><td>CI 管线中的正则表达式模式</td></tr><tr><td>硬编码密码</td><td>机密扫描工具（gitleaks、truffleHog）</td></tr><tr><td>过期凭证</td><td>自动轮换警报</td></tr><tr><td>合规违规</td><td>策略即代码检查</td></tr></tbody></table><div class="admonition tip"><p class="admonition-title"><span class="mdi mdi-lightbulb-on-outline admonition-icon"></span>💡 合规红利</p><div class="admonition-content"><p>审核员喜欢文档即代码，因为：</p><ul><li><strong>不可变的历史记录</strong> → 谁变更了什么，何时（密码学保护，像区块链）</li><li><strong>防篡改</strong> → 
被篡改的历史记录会立即被检测到</li><li><strong>批准工作流程</strong> → 拉取请求需要审查</li><li><strong>自动检查</strong> → 策略违规会封锁合并</li><li><strong>轻松导出</strong> → 按需生成审核报告</li><li><strong>不可否认性</strong> → 无法否认您所做的变更</li></ul></div></div><hr /><h3 id="3-2029-Confluence-迁移">3. 2029 Confluence 迁移</h3><p>让我们面对现实：<strong>Confluence Data Center 将于 2029 年 3 月 28 日变为只读</strong>。</p><p><strong>您的选项：</strong></p><table><thead><tr><th>选项</th><th>优点</th><th>缺点</th></tr></thead><tbody><tr><td><strong>迁移到 Confluence Cloud</strong></td><td>熟悉的 UI，最少重新培训</td><td>☠️ 厂商锁定加深，SaaS 定价，数据主权问题</td></tr><tr><td><strong>迁移到文档即代码</strong></td><td>拥有您的数据，离线访问，无厂商风险</td><td>非技术用户的学习曲线</td></tr><tr><td><strong>迁移到另一个 Wiki</strong></td><td>类似的 UX</td><td>相同的基本问题（登录、搜索、锁定）</td></tr><tr><td><strong>协商延长维护</strong></td><td>争取更多时间</td><td>临时修复，昂贵，仍需最终迁移</td></tr></tbody></table><p><strong>Confluence Cloud 的实际成本：</strong></p><pre class="language-none"><code class="language-none">企业（1000 用户）：
  - Confluence Cloud：~$120,000&#x2F;年
  - 必要附加组件：~$30,000&#x2F;年
  - 迁移服务：~$50,000（一次性）
  - 培训：~$20,000（一次性）
  第 1 年总计：~$220,000
  3 年累计：~$520,000（第 1 年 + 2 × $150,000 经常性费用）</code></pre><p><strong>文档即代码成本：</strong></p><pre class="language-none"><code class="language-none">企业（1000 用户）：
  - Git 托管（GitHub&#x2F;GitLab）：~$20,000&#x2F;年（通常已支付）
  - 静态网站生成器：$0（开源）
  - CI&#x2F;CD：$0-$10,000&#x2F;年（通常包含）
  - 培训：~$20,000（一次性）
  第 1 年总计：~$50,000（CI&#x2F;CD 按上限计）
  3 年累计：~$110,000（第 1 年 + 2 × $30,000 经常性费用）</code></pre><p><strong>3 年节省：~$410,000</strong>（而且您拥有您的数据）</p><p>财务和战略优势明确后，让我们看看文档即代码的另一项优势：它与 AI 助理的契合度。</p><hr /><h2 id="5-AI-优势：为什么-DaC-是-AI-助理的完美选择">5 AI 优势：为什么 DaC 是 AI 助理的完美选择</h2><p>还有一个意外收获：<strong>文档即代码是您能选择的最 AI 友好的文档方式。</strong></p><h3 id="低成本，高影响">低成本，高影响</h3><p><strong>代币经济：</strong></p><p>AI 助理按代币（token）计费（粗略估算：英文中 1 个代币约为 0.75 个单词）。以同一份内容为例的粗略比较：</p><table><thead><tr><th>格式</th><th>代币数量</th><th>处理成本</th></tr></thead><tbody><tr><td><strong>Markdown 文件</strong></td><td>~500 代币</td><td>$0.001</td></tr><tr><td><strong>Word 文档</strong>（具有格式 XML）</td><td>~5,000 代币</td><td>$0.010</td></tr><tr><td><strong>Confluence 页面</strong>（HTML + 元数据）</td><td>~3,000 
代币</td><td>$0.006</td></tr><tr><td><strong>PDF</strong>（二进制，需要提取）</td><td>可变 + 提取成本</td><td>$$$</td></tr></tbody></table><p><strong>为什么 Markdown 获胜：</strong></p><pre class="language-none"><code class="language-none">Markdown：     &quot;# Runbook: Database Failover&quot; → 干净，最少代币
Word DOCX：    &quot;&lt;w:document&gt;&lt;w:p&gt;&lt;w:r&gt;&lt;w:t&gt;Runbook...&lt;&#x2F;w:t&gt;&lt;&#x2F;w:r&gt;&lt;&#x2F;w:p&gt;...&quot; → XML 膨胀
Confluence：   &quot;&lt;div class&#x3D;&#39;content&#39;&gt;&lt;h1&gt;Runbook&lt;&#x2F;h1&gt;&lt;span data-...&gt;...&quot; → HTML 噪声</code></pre><p><strong>计算：</strong> 使用 AI 更新 100 个文档页面：</p><ul><li><strong>Markdown：</strong> API 成本约 $0.10</li><li><strong>Word/Confluence：</strong> API 成本约 $0.60-1.00</li><li><strong>节省：</strong> AI 处理成本降低 80-90%</li></ul><hr /><h3 id="命令行-AI-代理-完美搭配">命令行 + AI 代理 = 完美搭配</h3><p><strong>AI 代理喜欢命令行工具。</strong> 原因如下：</p><pre class="language-none"><code class="language-none">🤖 AI 代理：「我将更新 PostgreSQL 16 的操作手册」
步骤 1：克隆存储库          → &#96;git clone ...&#96;
步骤 2：寻找相关文件        → &#96;grep -r &quot;PostgreSQL&quot; docs&#x2F;&#96;
步骤 3：阅读当前内容        → &#96;cat docs&#x2F;runbooks&#x2F;db-failover.md&#96;
步骤 4：产生更新的内容      → （AI 写新版本）
步骤 5：保存变更            → &#96;git add -A &amp;&amp; git commit -m ...&#96;
步骤 6：建立拉取请求         → &#96;gh pr create ...&#96;
✅ 30 秒内完成</code></pre><p><strong>为什么这有效：</strong></p><table><thead><tr><th>工具类型</th><th>AI 整合</th><th>范例</th></tr></thead><tbody><tr><td><strong>Git 命令</strong></td><td>原生文字 I/O</td><td>AI 透过 CLI 读取/写入</td></tr><tr><td><strong>grep/sed/awk</strong></td><td>简单转换</td><td>AI 寻找和更新模式</td></tr><tr><td><strong>pandoc</strong></td><td>格式转换</td><td>AI 导出为任何格式</td></tr><tr><td><strong>静态网站生成器</strong></td><td>构建自动化</td><td>AI 在本地预览变更</td></tr></tbody></table><p><strong>与 Confluence 对比：</strong></p><pre class="language-none"><code class="language-none">🤖 AI 代理：「我将更新 Confluence 页面……」
步骤 1：透过 OAuth 验证     → 令牌交换，API 密钥
步骤 2：透过 REST API 取得页面   → HTTP 请求，速率限制
步骤 3：解析 HTML 内容        → 移除标签，处理编码
步骤 4：产生更新的内容   
   → （AI 写新版本）步骤 5：转换回 HTML          → 重新新增格式，宏步骤 6：透过 API 发布         → HTTP POST，处理冲突❌ 复杂度增加 10 倍，速度慢 5 倍，API 速率限制</code></pre><hr /><h3 id="视觉图表：Mermaid-图表">视觉图表：Mermaid 图表</h3><p><strong>Markdown 现在本机支持图表：</strong></p><pre class="language-markdown" data-language="markdown"><code class="language-markdown">flowchart LR    A[侦测到事故] --> B&#123;严重性？&#125;    B -->|关键| C[呼叫值班人员]    B -->|低| D[记录工单]    C --> E[启动操作手册]    D --> F[监控]</code></pre><p><strong>渲染为：</strong></p><pre class="language-MERMAID_BASE64_627" data-language="MERMAID_BASE64_627"><code class="language-MERMAID_BASE64_627">Zmxvd2NoYXJ0IExSCiAgICBBW+S+pua1i+WIsOS6i+aVhV0gLS0+IEJ75Lil6YeN5oCn77yffQogICAgQiAtLT585YWz6ZSufCBDW+WRvOWPq+WAvOePreS6uuWRmF0KICAgIEIgLS0+fOS9jnwgRFvorrDlvZXlt6XljZVdCiAgICBDIC0tPiBFW+WQr+WKqOaTjeS9nOaJi+WGjF0KICAgIEQgLS0+IEZb55uR5o6nXQ&#x3D;&#x3D;</code></pre><p><strong>AI + Mermaid = 即时图表：</strong></p><pre class="language-none"><code class="language-none">👤 用户：「建立我们部署流程的流程图」🤖 AI：*产生 Mermaid 代码*✅ 结果：专业图表，无需设计技能</code></pre><p><strong>支持的图表类型：</strong></p><table><thead><tr><th>类型</th><th>使用案例</th></tr></thead><tbody><tr><td>流程图</td><td>流程文档</td></tr><tr><td>序列图</td><td>API 互动</td></tr><tr><td>甘特图</td><td>项目时程</td></tr><tr><td>类别图</td><td>系统架构</td></tr><tr><td>心智图</td><td>头脑风暴</td></tr></tbody></table><hr /><h3 id="学习曲线正在变平">学习曲线正在变平</h3><p><strong>当时（2020）：</strong></p><pre class="language-none"><code class="language-none">👤 非技术用户：「什么是 Markdown？」🔧 工程师：「就像是带有符号的纯文字……」👤 非技术用户：「我在哪里编辑？」🔧 工程师：「您需要文字编辑器，或者也许……」😓 摩擦：高</code></pre><p><strong>现在（2026）：</strong></p><pre class="language-none"><code class="language-none">👤 非技术用户：「我如何编辑？」选项 1：GitHub Web UI（所见即所得模式）  - 点击编辑 → 看到格式化视图 → 保存选项 2：Notion（导出为 Markdown）  - 视觉化写作 → 导出为 .md选项 3：Google Docs（具有 Markdown 转换器）  - 在 Docs 中写作 → 自动转换为 .md选项 4：Microsoft Word（保存为 Markdown）  - 内建支持😓 摩擦：低且正在减少</code></pre><p><strong>趋势：</strong> 所见即所得编辑器正在<strong>新增 Markdown 支持</strong>，而不是取代它。</p><table><thead><tr><th>平台</th><th>Markdown 
支持</th></tr></thead><tbody><tr><td>GitHub/GitLab</td><td>✅ 具有预览的原生编辑器</td></tr><tr><td>Notion</td><td>✅ 导入/导出 Markdown</td></tr><tr><td>Obsidian</td><td>✅ Markdown 优先的知识库</td></tr><tr><td>Microsoft Word</td><td>✅ 保存为 Markdown（2024+）</td></tr><tr><td>Google Docs</td><td>✅ Markdown 附加组件</td></tr><tr><td>Slack</td><td>✅ Markdown 格式化</td></tr><tr><td>Discord</td><td>✅ Markdown 格式化</td></tr></tbody></table><hr /><h3 id="AI-代理普及-Markdown">AI 代理普及 Markdown</h3><p><strong>现实：</strong> 非技术用户不再需要学习 Git 命令。</p><pre class="language-none"><code class="language-none">👤 营销经理：「更新首页文案」
2020 工作流程：
  - 学习 Git 基础
  - 克隆存储库
  - 在文字编辑器中编辑文件
  - 执行 Git 命令
  - 开启拉取请求
  - 等待审查
2026 工作流程：
  - 告诉 AI 代理：「更新首页文案为 X」
  - AI 建立分支，编辑文件，开启 PR
  - 审查通知到达 Slack
  - 点击「批准」→ 完成</code></pre><p><strong>弥合差距的 AI 工具：</strong></p><table><thead><tr><th>工具</th><th>作用</th></tr></thead><tbody><tr><td><strong>GitHub Copilot</strong></td><td>建议编辑，解释 Git 命令</td></tr><tr><td><strong>Cursor</strong></td><td>具有 Git 整合的 AI 驱动编辑器</td></tr><tr><td><strong>Claude Code</strong></td><td>自然语言 → Git 操作</td></tr><tr><td><strong>Warp</strong></td><td>解释命令的 AI 终端机</td></tr></tbody></table><p><strong>结论：</strong> Markdown + Git 曾经是「开发者技能」。有了 AI 代理，它正在成为<strong>通用技能</strong>——就像打字一样。</p><hr /><h2 id="6-残酷的真相：DaC-的挣扎之处">6 残酷的真相：DaC 的挣扎之处</h2><p>文档即代码并不完美。以下是它真正不足的地方：</p><h3 id="挑战-1：非技术协作">挑战 1：非技术协作</h3><p><strong>问题：</strong></p><pre class="language-none"><code class="language-none">👤 营销经理：「我如何建议编辑？」
🔧 工程师：「Fork 存储库，建立分支，提交，开启 PR……」
👤 营销经理：*默默地发送 Slack 讯息*</code></pre><p><strong>现实检查：</strong> Git 有学习曲线。对于没有开发经验的团队，工作流程感觉陌生。</p><div class="admonition success"><p class="admonition-title">✅ AI 的一线希望</p><div class="admonition-content"><p>AI 代理正在迅速减少这种摩擦。像<strong>Claude Code</strong>、<strong>GitHub Copilot</strong>和<strong>Cursor</strong>这样的工具现在可以：</p><ul><li>从自然语言执行 Git 命令（「建立分支并更新操作手册」）</li><li>用浅显的语言解释每个命令的作用</li><li>自动产生提交讯息和拉取请求描述</li></ul></div></div><p><strong>差距比预期更快地缩小。</strong> 2023 年需要 Git 培训的内容，2026 年可以透过聊天完成。</p><p><strong>缓解策略：</strong></p><table><thead><tr><th>方法</th><th>如何帮助</th><th>权衡</th></tr></thead><tbody><tr><td><strong>GitHub/GitLab Web UI</strong></td><td>在浏览器中编辑文件，无需 Git 知识</td><td>限于简单的变更</td></tr><tr><td><strong>VS Code + GitLens</strong></td><td>视觉化 Git 工具，点按提交</td><td>仍需安装工具</td></tr><tr><td><strong>指定的文档负责人</strong></td><td>技术作家管理 Git，主题专家提供内容</td><td>文档负责人的瓶颈</td></tr><tr><td><strong>混合工作流程</strong></td><td>接受 Word/Google Docs，转换为 Markdown</td><td>额外的转换步骤</td></tr></tbody></table><hr /><h3 id="挑战-2：没有内嵌评论">挑战 2：没有内嵌评论</h3><p><strong>问题：</strong></p><p>Confluence 和 Google Docs 擅长内嵌评论：</p><pre class="language-none"><code class="language-none">📄 Confluence 页面：
  「重新启动数据库服务」
  └─ 💬 评论：「哪个服务？postgresql.service 还是 mysqld.service？」
  └─ 💬 评论：「这个步骤在 staging 中对我失败了」
  └─ 💬 评论：「在 PR #452 中更新了命令」</code></pre><p>Markdown 文件没有原生的内嵌评论。</p><p><strong>解决方案：</strong></p><table><thead><tr><th>方法</th><th>运作方式</th><th>限制</th></tr></thead><tbody><tr><td><strong>拉取请求评论</strong></td><td>在审查期间评论特定行</td><td>仅在 PR 期间可见，最终文件中不可见</td></tr><tr><td><strong>GitHub/GitLab Issues</strong></td><td>将 issue 链接到文档部分</td><td>需要在系统之间导航</td></tr><tr><td><strong>HTML 注释</strong></td><td>在 Markdown 中新增评论区块</td><td>弄乱源代码，未渲染</td></tr><tr><td><strong>外部工具</strong></td><td>像 GitBook、ReadMe 这样的工具新增评论</td><td>重新引入厂商依赖</td></tr></tbody></table><hr /><h3 id="挑战-3：视觉协作">挑战 3：视觉协作</h3><p><strong>问题：</strong></p><p>一些团队在视觉协作中茁壮成长：</p><pre class="language-none"><code class="language-none">🎨 Google Docs：
  - 高亮文字 → 新增评论 → 分配给人员
  - 看到其他人即时编辑的游标
  - 建议模式 → 视觉化接受&#x2F;拒绝变更</code></pre><p>Git 本质上是<strong>异步的</strong>。即时协作不是它的强项。</p><p><strong>什么时候重要：</strong></p><table><thead><tr><th>情境</th><th>DaC 
适合度</th><th>更好的替代方案</th></tr></thead><tbody><tr><td>技术操作手册</td><td>✅ 优秀</td><td>—</td></tr><tr><td>API 文档</td><td>✅ 优秀</td><td>—</td></tr><tr><td>策略文档</td><td>⚠️ 中等</td><td>Google Docs（草稿）→ DaC（最终）</td></tr><tr><td>营销内容</td><td>❌ 差</td><td>Google Docs、Notion</td></tr><tr><td>头脑风暴会议</td><td>❌ 差</td><td>白板、Miro、FigJam</td></tr></tbody></table><hr /><h3 id="挑战-4：「我在哪里编辑？」问题">挑战 4：「我在哪里编辑？」问题</h3><p><strong>问题：</strong></p><p>新贡献者面临摩擦：</p><pre class="language-none"><code class="language-none">👤 新团队成员：「我发现操作手册中有错字。我如何修复它？」传统：  - 点击「编辑」按钮 → 输入 → 保存 → 完成文档即代码：  - 克隆存储库（或导航到网页……）  - 建立分支（或在网页上编辑）  - 进行变更  - 写提交讯息  - 建立拉取请求  - 等待审查  - 合并（或请求合并）</code></pre><p><strong>摩擦税：</strong> 与 wiki 式编辑相比，每次编辑需要约 5-10 个额外步骤。</p><p><strong>缓解：</strong></p><p><strong>范例：贡献者的简单指南</strong></p><pre class="language-markdown" data-language="markdown"><code class="language-markdown"><span class="token title important"><span class="token punctuation">#</span> .github/CONTRIBUTING.md</span><span class="token title important"><span class="token punctuation">##</span> 如何更新文档</span><span class="token title important"><span class="token punctuation">###</span> 快速修正（错字，小变更）</span><span class="token list punctuation">1.</span> 在 GitHub 上导航到文件<span class="token list punctuation">2.</span> 点击 ✏️ 铅笔图示<span class="token list punctuation">3.</span> 进行您的变更<span class="token list punctuation">4.</span> 写简短描述<span class="token list punctuation">5.</span> 点击「建议变更」<span class="token list punctuation">6.</span> 完成！我们将审查并合并。<span class="token title important"><span class="token punctuation">###</span> 较大的变更</span><span class="token list punctuation">1.</span> Fork 存储库<span class="token list punctuation">2.</span> 建立分支：<span class="token code-snippet code keyword">`git checkout -b fix/my-change`</span><span class="token list punctuation">3.</span> 编辑文件<span class="token list punctuation">4.</span> 提交：<span class="token code-snippet code keyword">`git commit -m "fix：描述您的变更"`</span><span class="token list 
punctuation">5.</span> 推送：<span class="token code-snippet code keyword">`git push origin fix/my-change`</span>
<span class="token list punctuation">6.</span> 开启拉取请求</code></pre><p>明确的指导显著减少摩擦。</p><p>承认挑战后，让我们探讨采用文档即代码的团队的实际策略。</p><hr /><h2 id="7-让-DaC-运作：实用指南">7 让 DaC 运作：实用指南</h2><h3 id="从小处开始，快速获胜">从小处开始，快速获胜</h3><p><strong>第 1-2 周：试点项目</strong></p><pre class="language-none"><code class="language-none">📁 docs&#x2F;
  └── runbooks&#x2F;
      ├── incident-response.md
      ├── database-failover.md
      └── deployment-procedure.md</code></pre><p>选择<strong>一个高价值的技术受众</strong>（例如，值班工程师）。获得他们的支持。让他们亲身体验离线优势。</p><hr /><h3 id="建立桥梁，而不是墙">建立桥梁，而不是墙</h3><p><strong>不要：</strong> 「Confluence 现在对我们来说死了。」</p><p><strong>要：</strong> 「让我们在迁移期间同时执行两者。」</p><p><strong>范例：在过渡期间自动同步到 Confluence</strong></p><pre class="language-yaml" data-language="yaml"><code class="language-yaml"><span class="token comment"># CI/CD 同步 DaC → Confluence（临时）</span>
<span class="token punctuation">-</span> <span class="token key atrule">name</span><span class="token punctuation">:</span> Publish to Confluence
  <span class="token key atrule">if</span><span class="token punctuation">:</span> github.ref == 'refs/heads/main'
  <span class="token comment"># 占位符：请替换为您实际使用的发布 Action（格式为 owner/repo@ref）</span>
  <span class="token key atrule">uses</span><span class="token punctuation">:</span> your<span class="token punctuation">-</span>org/confluence<span class="token punctuation">-</span>publisher@v1
  <span class="token key atrule">with</span><span class="token punctuation">:</span>
    <span class="token key atrule">space</span><span class="token punctuation">:</span> OPS
    <span class="token key atrule">parent</span><span class="token punctuation">:</span> <span class="token string">"Operations Runbooks"</span></code></pre><p><strong>它的作用：</strong> 每当文档更新时，它会自动发布副本到 Confluence。这让团队可以继续使用 Confluence，同时逐渐采用文档即代码，不会出现突然的中断。</p><p>这给利益相关者时间适应，同时证明 DaC 的价值。</p><hr /><h3 id="投资工具">投资工具</h3><p><strong>基本堆叠：</strong></p><table><thead><tr><th>工具</th><th>目的</th><th>成本</th></tr></thead><tbody><tr><td><strong>VS Code + Markdown All in 
One</strong></td><td>作者体验</td><td>免费</td></tr><tr><td><strong>MkDocs + Material Theme</strong></td><td>静态网站生成</td><td>免费</td></tr><tr><td><strong>GitHub Actions / GitLab CI</strong></td><td>构建与部署管线</td><td>免费-$</td></tr><tr><td><strong>pandoc</strong></td><td>格式转换（PDF、Word）</td><td>免费</td></tr><tr><td><strong>gitleaks</strong></td><td>机密扫描</td><td>免费</td></tr></tbody></table><p><strong>可有可无：</strong></p><table><thead><tr><th>工具</th><th>目的</th><th>成本</th></tr></thead><tbody><tr><td><strong>GitLens</strong></td><td>视觉化 Git 历史记录</td><td>免费-$</td></tr><tr><td><strong>Markdownlint</strong></td><td>样式执行</td><td>免费</td></tr><tr><td><strong>Vale</strong></td><td>语法和样式检查</td><td>免费</td></tr></tbody></table><hr /><h3 id="定义工作流程">定义工作流程</h3><p><strong>给工程师：</strong></p><p><strong>实际情况：</strong></p><pre class="language-bash" data-language="bash"><code class="language-bash"><span class="token comment"># 1. 为您的变更建立新分支</span><span class="token function">git</span> checkout <span class="token parameter variable">-b</span> docs/update-failover-procedure<span class="token comment"># 2. 开启并编辑文档文件</span>code docs/runbooks/database-failover.md<span class="token comment"># 3. 在浏览器中预览外观</span>mkdocs serve  <span class="token comment"># 开启在 http://localhost:8000</span><span class="token comment"># 4. 将您的变更保存到版本控制</span><span class="token function">git</span> <span class="token function">add</span> docs/runbooks/database-failover.md<span class="token function">git</span> commit <span class="token parameter variable">-m</span> <span class="token string">"docs：更新 PostgreSQL 16 的故障转移步骤"</span><span class="token function">git</span> push origin docs/update-failover-procedure<span class="token comment"># 5. 
请求团队审查</span>gh <span class="token function">pr</span> create <span class="token parameter variable">--title</span> <span class="token string">"docs：更新故障转移步骤"</span> <span class="token parameter variable">--body</span> <span class="token string">"更新 PG16 兼容性"</span></code></pre><p><strong>翻译：</strong> 每个命令做一件事——建立工作区、编辑文件、预览、保存它，并请求队友审查。bash 符号如 <code>#</code> 只是解释每个步骤作用的注释。</p><p><strong>给非工程师：</strong></p><pre class="language-none"><code class="language-none">1. 在 GitHub&#x2F;GitLab 上导航到文件2. 点击「编辑」（铅笔图示）3. 进行您的变更4. 写下变更的简短描述5. 点击「建议变更」6. 团队成员将审查并合并</code></pre><hr /><h3 id="衡量成功">衡量成功</h3><p>追踪这些指标：</p><table><thead><tr><th>指标</th><th>DaC 之前</th><th>DaC 之后</th><th>目标</th></tr></thead><tbody><tr><td>寻找操作手册的时间</td><td>5-10 分钟</td><td>&lt; 1 分钟</td><td>&lt; 30 秒</td></tr><tr><td>文档新颖度</td><td>数月过时</td><td>每次事故更新</td><td>同日</td></tr><tr><td>离线可访问性</td><td>❌ 否</td><td>✅ 是</td><td>✅ 是</td></tr><tr><td>安全扫描覆盖率</td><td>0%</td><td>100%</td><td>100%</td></tr><tr><td>贡献者数量</td><td>3-5 个「负责人」</td><td>10-15 个团队成员</td><td>20+</td></tr></tbody></table><p>现在我们有了实际实施策略，让我们检查不同类型组织的战略影响。</p><hr /><h2 id="8-战略视角：谁应该采用-DaC？">8 战略视角：谁应该采用 DaC？</h2><h3 id="完美适合-✅">完美适合 ✅</h3><table><thead><tr><th>组织类型</th><th>为什么</th></tr></thead><tbody><tr><td><strong>DevOps/SRE 团队</strong></td><td>已经使用 Git，重视离线访问</td></tr><tr><td><strong>安全意识强的</strong></td><td>需要审核追踪、机密扫描</td></tr><tr><td><strong>受监管的产业</strong></td><td>合规需要版本控制</td></tr><tr><td><strong>分布式团队</strong></td><td>跨时区异步协作</td></tr><tr><td><strong>气隙环境</strong></td><td>离线访问是强制性的</td></tr></tbody></table><h3 id="需要培训的良好适合-⚠️">需要培训的良好适合 ⚠️</h3><table><thead><tr><th>组织类型</th><th>考虑因素</th></tr></thead><tbody><tr><td><strong>传统 IT 营运</strong></td><td>投资 Git 培训，从试点团队开始</td></tr><tr><td><strong>混合技术/非技术</strong></td><td>混合工作流程（Google Docs → DaC 转换）</td></tr><tr><td><strong>Confluence 重度用户</strong></td><td>在迁移期间并行执行</td></tr></tbody></table><h3 id="不适合-❌">不适合 
❌</h3><table><thead><tr><th>组织类型</th><th>为什么</th></tr></thead><tbody><tr><td><strong>营销优先文档</strong></td><td>视觉协作是核心需求</td></tr><tr><td><strong>没有 Git 经验 + 没有培训预算</strong></td><td>摩擦将扼杀采用</td></tr><tr><td><strong>已长期承诺 SaaS Wiki</strong></td><td>迁移成本可能不合理</td></tr></tbody></table><hr /><h2 id="总结：文档的十字路口">总结：文档的十字路口</h2><pre class="language-MERMAID_BASE64_628" data-language="MERMAID_BASE64_628"><code class="language-MERMAID_BASE64_628">Zmxvd2NoYXJ0IExSCiAgICBBW+S7iuaXpeaWh+aho10gLS0+IEJ76YCJ5oup5oKo55qE6Lev5b6EfQogICAgQiAtLT4gQ1tDb25mbHVlbmNlIENsb3VkIDIwMjldCiAgICBCIC0tPiBEW+aWh+aho+WNs+S7o+eggV0KICAgIEIgLS0+IEVb546w54q2XQoKICAgIEMgLS0+IEZbU2FhUyDkvp3otZY8YnIvPuaMgee7reaIkOacrDxici8+6ZyA6KaB55m75b2VXQogICAgRCAtLT4gR1vmi6XmnInmgqjnmoTmlbDmja48YnIvPuemu+e6v+iuv+mXrjxici8+5a6J5YWo5omr5o+PXQogICAgRSAtLT4gSFvmioDmnK&#x2F;lgLrliqE8YnIvPuefpeivhua1geWksTxici8+MjAyOSDljbHmnLpdCgogICAgc3R5bGUgRCBmaWxsOiNlOGY1ZTksc3Ryb2tlOiMzODhlM2MKICAgIHN0eWxlIEMgZmlsbDojZmZmM2UwLHN0cm9rZTojZjU3YzAwCiAgICBzdHlsZSBFIGZpbGw6I2ZmZWJlZSxzdHJva2U6I2M2MjgyOA&#x3D;&#x3D;</code></pre><p><strong>选择：</strong></p><table><thead><tr><th>路径</th><th>2026</th><th>2027</th><th>2028</th><th>2029</th></tr></thead><tbody><tr><td><strong>文档即代码</strong></td><td>试点与学习</td><td>扩展采用</td><td>成熟工作流程</td><td>竞争优势</td></tr><tr><td><strong>Confluence Cloud</strong></td><td>迁移</td><td>定价增加</td><td>依赖加深</td><td>锁定</td></tr><tr><td><strong>现状</strong></td><td>舒适</td><td>增长痛苦</td><td>紧急问题</td><td>危机模式</td></tr></tbody></table><hr /><p><strong>结论：</strong></p><p>文档即代码不是关于 Markdown。是关于<strong>所有权</strong>。</p><p>当您的文档存放在 Git 中：</p><ul><li>📖 <strong>您拥有数据</strong> — 没有厂商能挟持它</li><li>🔓 <strong>离线运作</strong> — 关键时刻网络故障时</li><li>🔍 <strong>可扫描</strong> — 内建安全和合规</li><li>📦 <strong>可导出到任何地方</strong> — PDF、Word、Confluence（如果您必须）</li><li>📜 <strong>有历史记录</strong> — 每次变更被追踪、可还原、可审核</li><li>🔒 <strong>防篡改</strong> — 密码学保护，像区块链</li><li>🤖 <strong>AI 就绪</strong> — 最低代币成本，命令行友好，Mermaid 
图表</li></ul><p>挑战是真实的：非技术协作、内嵌评论、视觉工作流程。但这些是<strong>可解决的问题</strong>，不是根本缺陷。</p><p>而在 2029 年，当 Confluence 本地版终止、您的合规团队问<em>「我们的文档在哪里？」</em>时，您会想要一个不涉及恐慌迁移的答案。</p><p><strong>从小处开始。选择一个操作手册。克隆一个存储库。亲身体验离线优势。</strong></p><p>因为种树最好的时间是 20 年前。第二好的时间，是在厂商让您的服务器变成只读之前。</p><hr /><h2 id="进一步阅读">进一步阅读</h2><ul><li><a href="https://docs.github.com/">GitHub Docs：Documentation as Code</a></li><li><a href="https://www.atlassian.com/licensing/data-center-end-of-life#data-center-eol-general-questions">Atlassian Confluence End-of-Life Announcement</a></li><li><a href="https://github.com/gitleaks/gitleaks">gitleaks：Secret Scanning Tool</a></li><li><a href="https://mermaid.js.org/">Mermaid：Diagrams and Flowcharts in Markdown</a></li></ul>    ]]></content>
    
    
    <summary type="html">Confluence 本地版将于 2029 年终止。Word 文档藏在没人找得到的共享硬盘。您的操作手册被困在付费墙和登录画面后。这就是为什么「文档即代码」（基于 Git 和 Markdown）是您的逃生路线，以及为什么时间正在倒数。</summary>
    
    
    
    <category term="Misc" scheme="https://neo01.com/categories/Misc/"/>
    
    
    <category term="GitOps" scheme="https://neo01.com/tags/GitOps/"/>
    
    <category term="Documentation" scheme="https://neo01.com/tags/Documentation/"/>
    
  </entry>
  
  <entry>
    <title>你的文件計時炸彈：為什麼 2029 年會迫使您改變（為什麼您會感謝我們）</title>
    <link href="https://neo01.com/zh-TW/2026/04/Document-as-Code-Why-Markdown-Git-Confluence-Word/"/>
    <id>https://neo01.com/zh-TW/2026/04/Document-as-Code-Why-Markdown-Git-Confluence-Word/</id>
    <published>2026-03-31T16:00:00.000Z</published>
    <updated>2026-04-01T16:32:07.880Z</updated>
    
    <content type="html"><![CDATA[<p>想像一下：凌晨三點。您的生產系統發生嚴重事故。值班工程師抓起筆記型電腦，打開操作手冊……然後遇到登入畫面。沒有網路。沒有 VPN。沒有存取權限。</p><p>與此同時，程序的「最終」版本存放在某人 2023 年桌上的 Word 文件中。Confluence 頁面？過期了。Wiki？沒人知道密碼。</p><p>這不是假設性的噩夢。對於數千個仍然將文件視為<strong>目的地</strong>而非<strong>交付物</strong>的團隊來說，這是家常便飯。</p><p>令人不安的事實是：<strong>您的文件策略是單一故障點。</strong> 而在 2029 年，當 Atlassian 永久關閉 Confluence 本機版時，這個故障將成為強制性的。</p><p>但有一條出路。它叫做<strong>文件即程式碼</strong>（Document as Code，DaC）。不，這不僅僅是「用 Markdown 寫作」。這是團隊思考知識方式的根本轉變。</p><hr /><h2 id="1-問題：文件墳墓">1 問題：文件墳墓</h2><p>讓我們點明房間裡的大象。</p><h3 id="Word-文件墓地">Word 文件墓地</h3><pre class="language-none"><code class="language-none">📁 共用硬碟&#x2F;  📁 Operations&#x2F;    📄 Runbook_FINAL.docx    📄 Runbook_FINAL_v2.docx    📄 Runbook_FINAL_v2_UPDATED.docx    📄 Runbook_FINAL_v3_ACTUAL_FINAL.docx    📄 Runbook_FINAL_v3_ACTUAL_FINAL_REALLY.docx</code></pre><p><strong>現實情況：</strong></p><ul><li>沒人知道哪個版本是權威的</li><li>變更需要「追蹤修訂」和電子郵件往來</li><li>搜尋？祝你好運</li><li>存取控制？要嘛每個人都有，要嘛完全沒有</li></ul><h3 id="Confluence-陷阱">Confluence 陷阱</h3><p>Confluence 承諾提供有組織、可搜尋的知識。但它帶來的是：</p><table><thead><tr><th>問題</th><th>影響</th></tr></thead><tbody><tr><td><strong>廠商鎖定</strong></td><td>您的知識存放在專有格式中</td></tr><tr><td><strong>需要登入</strong></td><td>沒有網路和憑證就無法閱讀文件</td></tr><tr><td><strong>搜尋……很樂觀</strong></td><td>找到正確的頁面感覺像考古</td></tr><tr><td><strong>2029 年終止</strong></td><td>本機版結束支援。要嘛 SaaS，要嘛什麼都沒有。</td></tr></tbody></table><div class="admonition warning"><p class="admonition-title"><span class="mdi mdi-alert-outline admonition-icon"></span>⚠️ 2029 年最後期限</p><div class="admonition-content"><p>Atlassian 已宣布<strong>Confluence Data Center 將於 2029 年 3 月 28 日終止</strong>。之後：</p><ul><li>授權過期，環境變為<strong>唯讀</strong></li><li>沒有安全性修補程式或錯誤修正</li><li>沒有技術支援</li><li><strong>您可以檢視資料但無法編輯或新增內容</strong></li><li>強烈不建議在連線至網際網路時以唯讀模式執行（無安全性更新）</li></ul><p><strong>時程表：</strong></p><ul><li><strong>2026 年 3 月 30 日</strong>：新客戶無法再購買 Data Center 訂閱</li><li><strong>2028 年 3 月 30 日</strong>：現有客戶無法再購買新訂閱或擴展</li><li><strong>2029 年 3 月 28 日</strong>：所有 
Data Center 訂閱過期</li></ul><p>對於具有合規性、資料主權或氣隙環境需求的企業來說，這不是升級，而是最後通牒。延長維護或許可以透過例外協商取得，但必須直接與 Atlassian 談判。</p></div></div><h3 id="Wiki-的狂野西部">Wiki 的狂野西部</h3><p>Wiki 始於「每個人都可以編輯！」，最後變成「沒人負責這個。」</p><pre class="language-none"><code class="language-none">🌐 內部 Wiki
  ├── 📄 入門指南（最後更新：2021 年）
  ├── 📄 架構概覽（圖片損壞）
  ├── 📄 值班程序（密碼：？？？）
  └── 📄 [404 頁面未找到]</code></pre><p><strong>模式：</strong> 這三種方法都有同一個致命缺陷：<strong>文件與工作分離</strong>。</p><hr /><h2 id="2-什麼是文件即程式碼？">2 什麼是文件即程式碼？</h2><p><strong>文件即程式碼</strong>將文件視為軟體：</p><table><thead><tr><th>軟體開發</th><th>文件即程式碼</th></tr></thead><tbody><tr><td>程式碼在 Git 中</td><td>文件在 Git 中</td></tr><tr><td>以拉取請求提交變更</td><td>以拉取請求提交編輯</td></tr><tr><td>程式碼審查</td><td>內容審查</td></tr><tr><td>CI/CD 管線</td><td>建構與部署管線</td></tr><tr><td>版本標籤</td><td>發布版本</td></tr><tr><td>回滾能力</td><td>完整歷史記錄，即時還原</td></tr></tbody></table><p>但 DaC 與「僅使用 Markdown」的不同之處在於：</p><h3 id="不僅僅是-Markdown。是-Git。">不僅僅是 Markdown。是 Git。</h3><pre class="language-none"><code class="language-none">❌ 「我們使用 Markdown」→ 共用硬碟上的檔案
✅ 「我們使用文件即程式碼」→ 基於 Git 的工作流程，具有版本控制</code></pre><h3 id="什麼是-Git？（給非技術讀者）">什麼是 Git？（給非技術讀者）</h3><p><strong>Git</strong> 是一個隨時間追蹤檔案變更的工具。把它想像成<strong>文件的時光機</strong>。</p><p>每次您提交變更時，Git 都會記錄一份快照。您以後可以回到任何快照：昨天的版本、上週的，甚至一年前的。不會遺失任何內容。</p><h3 id="為什麼-Git-被建立">為什麼 Git 被建立</h3><pre class="language-none"><code class="language-none">問題（Git 之前）：
  👤 人員 A：「我正在編輯檔案！」
  👤 人員 B：「我也是！」
  → 兩人都儲存 → 其中一人的變更遺失 😱

解決方案（使用 Git）：
  👤 人員 A：「我在自己的副本上編輯」
  👤 人員 B：「我也在自己的副本上編輯」
  → 兩人都完成 → Git 安全地合併變更 ✅</code></pre><h3 id="現實類比">現實類比</h3><p><strong>Google 文件版本歷史</strong>的運作原理類似：每次儲存都會記錄誰變更了什麼。Git 也這樣做，但有三個關鍵差異：</p><ol><li><strong>離線運作</strong>：不需要網際網路</li><li><strong>電腦上的完整副本</strong>：每個版本都永久保留</li><li><strong>沒有廠商鎖定</strong>：您的資料始終屬於您</li></ol><h3 id="這對您意味著什麼">這對您意味著什麼</h3><ul><li><strong>沒有網際網路？</strong> 沒問題。一切都在本機。</li><li><strong>伺服器當機？</strong> 您有完整的備份。</li><li><strong>廠商消失？</strong> 您的資料仍是您的。</li><li><strong>犯了錯誤？</strong> 即時還原到任何時間點。</li></ul><h3 id="Git-很難學嗎？">Git 很難學嗎？</h3><p>對開發者來說，Git 
是日常工作，他們早已熟悉。</p><p>對非技術使用者來說，您不需要學習 Git 命令。現代工具（GitHub Web UI、AI 助理）會處理複雜性。您只需編輯，Git 在背景運作。</p><h3 id="安全超能力：Git-就像文件的區塊鏈">安全超能力：Git 就像文件的區塊鏈</h3><p>這是強大的部分：<strong>Git 使用密碼學使歷史記錄防篡改</strong>。</p><p>每次變更都會獲得一個獨特的指紋（稱為「雜湊」）。這個指紋是根據以下內容計算的：</p><ul><li>您變更的內容</li><li>上次變更的指紋</li><li>誰進行了變更以及何時</li></ul><p>這形成了一條<strong>指紋鏈</strong>，就像區塊鏈一樣。如果有人試圖篡改歷史記錄（比如刪除誰批准變更的證據），之後所有的指紋都會改變，不再與其他副本相符。只要與任何一份副本比對，篡改行為<strong>就會被檢測到</strong>。</p><p>Confluence 和 Word 無法做到這一點。它們的日誌可以由管理員修改。在 Git 中，只要還有其他人持有副本，歷史記錄就<strong>無法被靜默更改</strong>。</p><h3 id="提交簽署：變更的數位簽章">提交簽署：變更的數位簽章</h3><p>Git 有一個更強大的安全功能：<strong>提交簽署</strong>。</p><p>每個人都持有一把<strong>個人簽署金鑰</strong>（GPG、SSH 或 X.509 憑證，就像數位身分證）。當您提交變更時，Git 會用您的金鑰簽署它。簽名證明：<em>「這個變更來自我，我批准它。」</em></p><p><strong>現實類比：</strong></p><pre class="language-none"><code class="language-none">傳統 Git 提交：
  👤 「John 批准了此變更」
  → 您相信系統正確記錄了這個

簽署的 Git 提交：
  👤 「John 批准了此變更」✍️ [數位簽署]
  → 密碼學證明 John 批准了它
  → John 的公鑰可驗證簽名
  → 沒有 John 的私鑰就無法偽造</code></pre><p><strong>為什麼簽署很重要：</strong></p><ul><li><strong>防止冒充</strong>：沒人能假裝是您</li><li><strong>法律佐證</strong>：簽署的提交可作為更有力的證據（是否具法律效力取決於所在司法管轄區）</li><li><strong>供應鏈安全</strong>：確切知道誰批准了每次變更</li><li><strong>合規性</strong>：某些受監管產業需要</li></ul><p><strong>實際情況：</strong></p><pre class="language-none"><code class="language-none">✅ 已驗證提交 abc123 由 John Doe (john@company.com)
⚠️ 未驗證提交 def456 由 unknown@example.com</code></pre><p>GitHub 和 GitLab 在簽署的提交上顯示綠色「Verified」徽章。如果有人試圖偽造您的提交，簽名將無法通過驗證，偽造隨即暴露。</p><h3 id="Git-的差異">Git 的差異</h3><p><strong>純 Markdown 檔案（沒有 Git）：</strong></p><ul><li>版本歷史記錄：僅檔案時間戳</li><li>協作：覆蓋衝突</li><li>離線存取：是（本機檔案）</li><li>審核追蹤：手動記錄</li><li>回滾：「有人有舊版本嗎？」</li><li>分散式：否（集中式檔案伺服器）</li></ul><p><strong>使用 Git 的文件即程式碼：</strong></p><ul><li>版本歷史記錄：每次變更都被追蹤，誰/何時/為什麼</li><li>協作：分支、合併、解決衝突</li><li>離線存取：是（完整的 repo 克隆）</li><li>審核追蹤：不可變的提交歷史記錄（密碼學保護）</li><li>回滾：<code>git revert</code>，即時恢復</li><li>分散式：是（每個克隆都是完整的備份）</li></ul><h3 id="關鍵洞察">關鍵洞察</h3><p>Git 
本質上是<strong>去中心化的</strong>。每個開發者都有文件儲存庫的完整副本。這意味著：</p><ul><li>沒有單一故障點</li><li>離線運作（對沙盒/氣隙環境至關重要）</li><li>無需登入即可閱讀</li><li>沒有廠商能挾持您的知識</li></ul><p>現在我們了解了基礎，讓我們探討為什麼這種架構在系統故障時很重要。</p><hr /><h2 id="3-操作手冊測試：凌晨三點會發生什麼？">3 操作手冊測試：凌晨三點會發生什麼？</h2><p>讓我們用文件即程式碼重播我們的開場情境。</p><p><strong>凌晨三點。生產事故。無法存取網際網路（沙盒環境）。</strong></p><h3 id="使用傳統文件：">使用傳統文件：</h3><pre class="language-none"><code class="language-none">工程師：「讓我檢查操作手冊……」  ↓開啟瀏覽器 → Confluence 登入 → 沒有網路  ↓致電隊友 → 「Wiki 密碼是什麼？」  ↓隊友：「我想在 LastPass 裡……」  ↓LastPass → 沒有網路 → 無法同步  ↓[事故升級，同時尋找憑證]</code></pre><p><strong>解決時間：</strong> 45 分鐘（包括 38 分鐘尋找文件）</p><h3 id="使用文件即程式碼：">使用文件即程式碼：</h3><pre class="language-none"><code class="language-none">工程師：「讓我檢查操作手冊……」  ↓開啟終端機 → &#96;cd runbooks&#96; → 已經本機克隆  ↓&#96;grep &quot;database failover&quot; *.md&#96; → 即時搜尋  ↓遵循程序 → 系統恢復  ↓提交事故筆記 → &#96;git commit -m &quot;Incident #2026-0329&quot;&#96;</code></pre><p><strong>解決時間：</strong> 7 分鐘（全部用於修復問題）</p><div class="admonition question"><p class="admonition-title"><span class="mdi mdi-comment-question-outline admonition-icon"></span>🤔 為什麼離線很重要？</p><div class="admonition-content"><p>您可能會想：<em>「我們一直有網際網路。這不會發生在我們身上。」</em></p><p>考慮這些情境：</p><ul><li><strong>安全事故</strong> → 調查期間限制網路存取</li><li><strong>雲端停機</strong> → 您的文件在雲端……停機了</li><li><strong>氣隙環境</strong> → 政府、金融、醫療保健沙盒</li><li><strong>旅行</strong> → 飛行模式、飯店 WiFi 不佳、國際漫遊</li><li><strong>災難恢復</strong> → 當一切都壞了，包括網際網路</li></ul><p><strong>原則：</strong> 關鍵文件應該在您最需要的時候運作——而不是在條件理想時。</p></div></div><p>看到了 DaC 在壓力下的表現，讓我們檢查使其值得採用的實際優勢。</p><hr /><h2 id="4-為什麼文件即程式碼獲勝">4 為什麼文件即程式碼獲勝</h2><h3 id="1-匯出為任何格式">1. 
匯出為任何格式</h3><p>您的利害關係人想要 Word？PDF？Confluence？沒問題。</p><p><strong>自動化運作方式：</strong></p><pre class="language-none"><code class="language-none">您將變更儲存到 Git        ↓自動化檢測變更        ↓建構 PDF 版本        ↓建構 Word 版本        ↓更新網站        ↓（可選）同步到 Confluence        ↓完成 — 所有格式自動更新</code></pre><p><strong>這取代了什麼：</strong></p><table><thead><tr><th>手動流程</th><th>自動化流程</th></tr></thead><tbody><tr><td>開啟文件 → 匯出為 PDF → 儲存</td><td>一次儲存觸發所有操作</td></tr><tr><td>開啟文件 → 匯出為 Word → 電子郵件</td><td>檔案自動產生和儲存</td></tr><tr><td>登入 Confluence → 複製/貼上 → 發布</td><td>同步在背景發生</td></tr><tr><td>每次變更加以重複</td><td>一致執行，不會遺忘</td></tr></tbody></table><p><strong>實際自動化配置如下：</strong></p><pre class="language-yaml" data-language="yaml"><code class="language-yaml"><span class="token comment"># CI/CD 管線建構多種格式</span><span class="token key atrule">on</span><span class="token punctuation">:</span>  <span class="token key atrule">push</span><span class="token punctuation">:</span>    <span class="token key atrule">branches</span><span class="token punctuation">:</span> <span class="token punctuation">[</span>main<span class="token punctuation">]</span><span class="token key atrule">jobs</span><span class="token punctuation">:</span>  <span class="token key atrule">build-docs</span><span class="token punctuation">:</span>    <span class="token key atrule">runs-on</span><span class="token punctuation">:</span> ubuntu<span class="token punctuation">-</span>latest    <span class="token key atrule">steps</span><span class="token punctuation">:</span>      <span class="token punctuation">-</span> <span class="token key atrule">uses</span><span class="token punctuation">:</span> actions/checkout@v4      <span class="token comment"># 轉換為 PDF</span>      <span class="token punctuation">-</span> <span class="token key atrule">name</span><span class="token punctuation">:</span> Build PDF        <span class="token key atrule">uses</span><span class="token punctuation">:</span> docker<span class="token punctuation">:</span>//pandoc/core  
      <span class="token key atrule">with</span><span class="token punctuation">:</span>          <span class="token key atrule">args</span><span class="token punctuation">:</span> runbook.md <span class="token punctuation">-</span>o runbook.pdf      <span class="token comment"># 轉換為 Word（給堅持的利害關係人）</span>      <span class="token punctuation">-</span> <span class="token key atrule">name</span><span class="token punctuation">:</span> Build Word        <span class="token key atrule">uses</span><span class="token punctuation">:</span> docker<span class="token punctuation">:</span>//pandoc/core        <span class="token key atrule">with</span><span class="token punctuation">:</span>          <span class="token key atrule">args</span><span class="token punctuation">:</span> runbook.md <span class="token punctuation">-</span>o runbook.docx      <span class="token comment"># 同步到 Confluence（給還沒準備好放棄的團隊）</span>      <span class="token punctuation">-</span> <span class="token key atrule">name</span><span class="token punctuation">:</span> Sync to Confluence        <span class="token key atrule">uses</span><span class="token punctuation">:</span> docker<span class="token punctuation">:</span>//confluence<span class="token punctuation">-</span>publisher        <span class="token key atrule">with</span><span class="token punctuation">:</span>          <span class="token key atrule">args</span><span class="token punctuation">:</span> <span class="token punctuation">-</span><span class="token punctuation">-</span>source ./docs <span class="token punctuation">-</span><span class="token punctuation">-</span>space OPS</code></pre><p><strong>什麼是 CI/CD？</strong> 這是每當您的文件變更時執行的自動化。把它想像成機器人助理：您將變更儲存到 Git，它會自動建構 PDF、Word 文件和網站——無需手動步驟。</p><p><strong>魔法：</strong> 
寫一次（Markdown），發布到各處（PDF、Word、HTML、Confluence）。</p><table><thead><tr><th>格式</th><th>使用案例</th></tr></thead><tbody><tr><td><strong>Markdown（原始碼）</strong></td><td>作者、版本控制、差異比較</td></tr><tr><td><strong>PDF</strong></td><td>正式報告、合規提交</td></tr><tr><td><strong>Word</strong></td><td>需要追蹤修訂的利害關係人</td></tr><tr><td><strong>HTML</strong></td><td>內部文件網站</td></tr><tr><td><strong>Confluence</strong></td><td>仍在遷移的團隊（臨時橋樑）</td></tr></tbody></table><hr /><h3 id="2-可掃描的安全性">2. 可掃描的安全性</h3><p>傳統文件是<strong>安全盲點</strong>：</p><pre class="language-none"><code class="language-none">🔍 安全團隊：「我們可以掃描 Word 文件是否有機密嗎？」👤 IT 管理員：「它們在檔案伺服器上。我們需要……」🔍 安全團隊：「那 Confluence 呢？」👤 IT 管理員：「那是 SaaS。您需要 API 存取和……」🔍 安全團隊：*嘆息*</code></pre><p>文件即程式碼是<strong>安全透明</strong>的：</p><p><strong>範例：掃描意外機密</strong></p><pre class="language-bash" data-language="bash"><code class="language-bash"><span class="token comment"># 掃描所有文件是否有洩漏的密碼或 API 金鑰</span>$ gitleaks detect <span class="token parameter variable">--source</span> ./docs --report-path secrets.json<span class="token comment"># 搜尋常見的敏感模式</span>$ <span class="token function">grep</span> <span class="token parameter variable">-r</span> <span class="token string">"password\|api_key\|secret"</span> ./docs/*.md<span class="token comment"># 在儲存變更前自動執行檢查</span>$ pre-commit run --all-files</code></pre><p><strong>這些命令的作用：</strong> 第一個命令執行安全性掃描器，尋找密碼、API 金鑰和權杖。第二個搜尋常見的敏感字詞。第三個在每次有人嘗試儲存變更時自動執行——在儲存之前封鎖任何可疑內容。</p><p><strong>會抓到什麼：</strong></p><table><thead><tr><th>風險</th><th>檢測方法</th></tr></thead><tbody><tr><td>意外 API 金鑰</td><td>CI 管線中的正則表達式模式</td></tr><tr><td>硬編碼密碼</td><td>機密掃描工具（gitleaks、truffleHog）</td></tr><tr><td>過期憑證</td><td>自動輪換警報</td></tr><tr><td>合規違規</td><td>策略即程式碼檢查</td></tr></tbody></table><div class="admonition tip"><p class="admonition-title"><span class="mdi mdi-lightbulb-on-outline admonition-icon"></span>💡 合規紅利</p><div class="admonition-content"><p>審核員喜歡文件即程式碼，因為：</p><ul><li><strong>不可變的歷史記錄</strong> → 誰變更了什麼，何時（密碼學保護，像區塊鏈）</li><li><strong>防篡改</strong> → 
被篡改的歷史記錄會立即被檢測到</li><li><strong>批准工作流程</strong> → 拉取請求需要審查</li><li><strong>自動檢查</strong> → 策略違規會封鎖合併</li><li><strong>輕鬆匯出</strong> → 按需產生審核報告</li><li><strong>不可否認性</strong> → 無法否認您所做的變更</li></ul></div></div><hr /><h3 id="3-2029-Confluence-遷移">3. 2029 Confluence 遷移</h3><p>讓我們面對現實：<strong>Confluence Data Center 將於 2029 年 3 月 28 日變為唯讀</strong>。</p><p><strong>您的選項：</strong></p><table><thead><tr><th>選項</th><th>優點</th><th>缺點</th></tr></thead><tbody><tr><td><strong>遷移到 Confluence Cloud</strong></td><td>熟悉的 UI，最少重新培訓</td><td>☠️ 廠商鎖定加深，SaaS 定價，資料主權問題</td></tr><tr><td><strong>遷移到文件即程式碼</strong></td><td>擁有您的資料，離線存取，無廠商風險</td><td>非技術使用者的學習曲線</td></tr><tr><td><strong>遷移到另一個 Wiki</strong></td><td>類似的 UX</td><td>相同的基本問題（登入、搜尋、鎖定）</td></tr><tr><td><strong>協商延長維護</strong></td><td>爭取更多時間</td><td>臨時修復，昂貴，仍需最終遷移</td></tr></tbody></table><p><strong>Confluence Cloud 的實際成本：</strong></p><pre class="language-none"><code class="language-none">企業（1000 使用者）：  - Confluence Cloud：~$120,000&#x2F;年  - 必要附加元件：~$30,000&#x2F;年  - 遷移服務：~$50,000（一次性）  - 培訓：~$20,000  第 1 年總計：~$220,000  第 3 年總計：~$410,000</code></pre><p><strong>文件即程式碼成本：</strong></p><pre class="language-none"><code class="language-none">企業（1000 使用者）：  - Git 託管（GitHub&#x2F;GitLab）：~$20,000&#x2F;年（通常已支付）  - 靜態網站產生器：$0（開源）  - CI&#x2F;CD：$0-$10,000&#x2F;年（通常包含）  - 培訓：~$20,000（一次性）  第 1 年總計：~$50,000  第 3 年總計：~$80,000</code></pre><p><strong>3 年節省：~$330,000</strong>（而且您擁有您的資料）</p><p>財務和戰略優勢明確後，讓我們檢查文件即程式碼面臨真正挑戰的地方。</p><hr /><h2 id="5-AI-優勢：為什麼-DaC-是-AI-助理的完美選擇">5 AI 優勢：為什麼 DaC 是 AI 助理的完美選擇</h2><p>這是轉折：<strong>文件即程式碼是您能選擇的最 AI 友好的文件格式。</strong></p><h3 id="低成本，高影響">低成本，高影響</h3><p><strong>代幣經濟：</strong></p><p>AI 助理按代幣收費（大約 1 代幣 = 1 個單字）。比較：</p><table><thead><tr><th>格式</th><th>代幣數量</th><th>處理成本</th></tr></thead><tbody><tr><td><strong>Markdown 檔案</strong></td><td>~500 代幣</td><td>$0.001</td></tr><tr><td><strong>Word 文件</strong>（具有格式 XML）</td><td>~5,000 代幣</td><td>$0.010</td></tr><tr><td><strong>Confluence 頁面</strong>（HTML + 中繼資料）</td><td>~3,000 
代幣</td><td>$0.006</td></tr><tr><td><strong>PDF</strong>（二進位，需要擷取）</td><td>可變 + 擷取成本</td><td>$$$</td></tr></tbody></table><p><strong>為什麼 Markdown 獲勝：</strong></p><pre class="language-none"><code class="language-none">Markdown：     &quot;# Runbook：Database Failover&quot; → 乾淨，最少代幣Word DOCX：    &quot;&lt;w:document&gt;&lt;w:p&gt;&lt;w:r&gt;&lt;w:t&gt;Runbook...&lt;&#x2F;w:t&gt;&lt;&#x2F;w:r&gt;&lt;&#x2F;w:p&gt;...&quot; → XML 膨脹Confluence：   &quot;&lt;div class&#x3D;&#39;content&#39;&gt;&lt;h1&gt;Runbook&lt;&#x2F;h1&gt;&lt;span data-...&gt;...&quot; → HTML 雜訊</code></pre><p><strong>計算：</strong> 使用 AI 更新 100 個文件頁面：</p><ul><li><strong>Markdown：</strong> API 成本約 $0.10</li><li><strong>Word/Confluence：</strong> API 成本約 $0.60-1.00</li><li><strong>節省：</strong> AI 處理成本降低 80-90%</li></ul><hr /><h3 id="命令列-AI-代理-完美搭配">命令列 + AI 代理 = 完美搭配</h3><p><strong>AI 代理喜歡命令列工具。</strong> 原因如下：</p><pre class="language-none"><code class="language-none">🤖 AI 代理：「我將更新 PostgreSQL 16 的操作手冊」步驟 1：克隆儲存庫          → &#96;git clone ...&#96;步驟 2：尋找相關檔案        → &#96;grep -r &quot;PostgreSQL&quot; docs&#x2F;&#96;步驟 3：閱讀當前內容        → &#96;cat docs&#x2F;runbooks&#x2F;db-failover.md&#96;步驟 4：產生更新的內容      → （AI 寫新版本）步驟 5：儲存變更            → &#96;git add &amp;&amp; git commit&#96;步驟 6：建立拉取請求         → &#96;gh pr create ...&#96;✅ 30 秒內完成</code></pre><p><strong>為什麼這有效：</strong></p><table><thead><tr><th>工具類型</th><th>AI 整合</th><th>範例</th></tr></thead><tbody><tr><td><strong>Git 命令</strong></td><td>本機文字 I/O</td><td>AI 透過 CLI 讀取/寫入</td></tr><tr><td><strong>grep/sed/awk</strong></td><td>簡單轉換</td><td>AI 尋找和更新模式</td></tr><tr><td><strong>pandoc</strong></td><td>格式轉換</td><td>AI 匯出為任何格式</td></tr><tr><td><strong>靜態網站產生器</strong></td><td>建構自動化</td><td>AI 本機預覽變更</td></tr></tbody></table><p><strong>與 Confluence 對比：</strong></p><pre class="language-none"><code class="language-none">🤖 AI 代理：「我將更新 Confluence 頁面……」步驟 1：透過 OAuth 驗證     → 代幣交換，API 金鑰步驟 2：透過 REST API 取得頁面   → HTTP 請求，速率限制步驟 3：解析 HTML 內容        → 移除標籤，處理編碼步驟 4：產生更新的內容   
   → （AI 寫新版本）步驟 5：轉換回 HTML          → 重新新增格式，巨集步驟 6：透過 API 發布         → HTTP POST，處理衝突❌ 複雜度增加 10 倍，速度慢 5 倍，API 速率限制</code></pre><hr /><h3 id="視覺圖表：Mermaid-圖表">視覺圖表：Mermaid 圖表</h3><p><strong>Markdown 現在本機支援圖表：</strong></p><pre class="language-markdown" data-language="markdown"><code class="language-markdown">flowchart LR    A[偵測到事故] --> B&#123;嚴重性？&#125;    B -->|關鍵| C[呼叫值班人員]    B -->|低| D[記錄工單]    C --> E[啟動操作手冊]    D --> F[監控]</code></pre><p><strong>渲染為：</strong></p><pre class="language-MERMAID_BASE64_630" data-language="MERMAID_BASE64_630"><code class="language-MERMAID_BASE64_630">Zmxvd2NoYXJ0IExSCiAgICBBW+WBtea4rOWIsOS6i+aVhV0gLS0+IEJ75Zq06YeN5oCn77yffQogICAgQiAtLT586Zec6Y21fCBDW+WRvOWPq+WAvOePreS6uuWToV0KICAgIEIgLS0+fOS9jnwgRFvoqJjpjITlt6Xllq5dCiAgICBDIC0tPiBFW+WVn+WLleaTjeS9nOaJi+WGil0KICAgIEQgLS0+IEZb55uj5o6nXQ&#x3D;&#x3D;</code></pre><p><strong>AI + Mermaid = 即時圖表：</strong></p><pre class="language-none"><code class="language-none">👤 使用者：「建立我們部署流程的流程圖」🤖 AI：*產生 Mermaid 程式碼*✅ 結果：專業圖表，無需設計技能</code></pre><p><strong>支援的圖表類型：</strong></p><table><thead><tr><th>類型</th><th>使用案例</th></tr></thead><tbody><tr><td>流程圖</td><td>流程文件</td></tr><tr><td>序列圖</td><td>API 互動</td></tr><tr><td>甘特圖</td><td>專案時程</td></tr><tr><td>類別圖</td><td>系統架構</td></tr><tr><td>心智圖</td><td>頭腦風暴</td></tr></tbody></table><hr /><h3 id="學習曲線正在變平">學習曲線正在變平</h3><p><strong>當時（2020）：</strong></p><pre class="language-none"><code class="language-none">👤 非技術使用者：「什麼是 Markdown？」🔧 工程師：「就像是帶有符號的純文字……」👤 非技術使用者：「我在哪裡編輯？」🔧 工程師：「您需要文字編輯器，或者也許……」😓 摩擦：高</code></pre><p><strong>現在（2026）：</strong></p><pre class="language-none"><code class="language-none">👤 非技術使用者：「我如何編輯？」選項 1：GitHub Web UI（所見即所得模式）  - 點擊編輯 → 看到格式化視圖 → 儲存選項 2：Notion（匯出為 Markdown）  - 視覺化寫作 → 匯出為 .md選項 3：Google Docs（具有 Markdown 轉換器）  - 在 Docs 中寫作 → 自動轉換為 .md選項 4：Microsoft Word（儲存為 Markdown）  - 內建支援😓 摩擦：低且正在減少</code></pre><p><strong>趨勢：</strong> 所見即所得編輯器正在<strong>新增 Markdown 支援</strong>，而不是取代它。</p><table><thead><tr><th>平台</th><th>Markdown 
支援</th></tr></thead><tbody><tr><td>GitHub/GitLab</td><td>✅ 具有預覽的原生編輯器</td></tr><tr><td>Notion</td><td>✅ 匯入/匯出 Markdown</td></tr><tr><td>Obsidian</td><td>✅ Markdown 優先的知識庫</td></tr><tr><td>Microsoft Word</td><td>✅ 儲存為 Markdown（2024+）</td></tr><tr><td>Google Docs</td><td>✅ Markdown 附加元件</td></tr><tr><td>Slack</td><td>✅ Markdown 格式化</td></tr><tr><td>Discord</td><td>✅ Markdown 格式化</td></tr></tbody></table><hr /><h3 id="AI-代理普及-Markdown">AI 代理普及 Markdown</h3><p><strong>現實：</strong> 非技術使用者不再需要學習 Git 命令。</p><pre class="language-none"><code class="language-none">👤 行銷經理：「更新首頁文案」

2020 工作流程：
  - 學習 Git 基礎
  - 克隆儲存庫
  - 在文字編輯器中編輯檔案
  - 執行 Git 命令
  - 開啟拉取請求
  - 等待審查

2026 工作流程：
  - 告訴 AI 代理：「更新首頁文案為 X」
  - AI 建立分支，編輯檔案，開啟 PR
  - 審查通知到達 Slack
  - 點擊「批准」→ 完成</code></pre><p><strong>彌合差距的 AI 工具：</strong></p><table><thead><tr><th>工具</th><th>作用</th></tr></thead><tbody><tr><td><strong>GitHub Copilot</strong></td><td>建議編輯，解釋 Git 命令</td></tr><tr><td><strong>Cursor</strong></td><td>具有 Git 整合的 AI 驅動編輯器</td></tr><tr><td><strong>Claude Code</strong></td><td>自然語言 → Git 操作</td></tr><tr><td><strong>Warp</strong></td><td>解釋命令的 AI 終端機</td></tr></tbody></table><p><strong>結論：</strong> Markdown + Git 曾經是「開發者技能」。有了 AI 代理，它正在成為<strong>通用技能</strong>，就像打字一樣。</p><hr /><h2 id="6-殘酷的真相：DaC-的掙扎之處">6 殘酷的真相：DaC 的掙扎之處</h2><p>文件即程式碼並不完美。以下是它真正不足的地方：</p><h3 id="挑戰-1：非技術協作">挑戰 1：非技術協作</h3><p><strong>問題：</strong></p><pre class="language-none"><code class="language-none">👤 行銷經理：「我如何建議編輯？」
🔧 工程師：「Fork 儲存庫，建立分支，提交，開啟 PR……」
👤 行銷經理：*默默地發送 Slack 訊息*</code></pre><p><strong>現實檢查：</strong> Git 有學習曲線。對於沒有開發經驗的團隊，工作流程感覺陌生。</p><p><strong>✅ AI 的一線希望</strong></p><p>AI 代理正在迅速減少這種摩擦。像<strong>Claude Code</strong>、<strong>GitHub Copilot</strong>和<strong>Cursor</strong>這樣的工具現在可以：</p><ul><li>從自然語言執行 Git 命令（「建立分支並更新操作手冊」）</li><li>用淺白的語言解釋每個命令的作用</li><li>自動產生提交訊息和拉取請求描述</li></ul><p><strong>差距縮小的速度比預期更快。</strong> 2023 年需要 Git 培訓的內容，2026 年可以透過聊天完成。</p><p><strong>緩解策略：</strong></p><table><thead><tr><th>方法</th><th>如何幫助</th><th>權衡</th></tr></thead><tbody><tr><td><strong>GitHub/GitLab Web UI</strong></td><td>在瀏覽器中編輯檔案，無需 Git 知識</td><td>限於簡單的變更</td></tr><tr><td><strong>VS Code + GitLens</strong></td><td>視覺化 Git 工具，點選即可提交</td><td>仍需安裝工具</td></tr><tr><td><strong>指定的文件負責人</strong></td><td>技術作家管理 Git，主題專家提供內容</td><td>文件負責人成為瓶頸</td></tr><tr><td><strong>混合工作流程</strong></td><td>接受 Word/Google Docs，轉換為 Markdown</td><td>額外的轉換步驟</td></tr></tbody></table><hr /><h3 id="挑戰-2：沒有內嵌評論">挑戰 2：沒有內嵌評論</h3><p><strong>問題：</strong></p><p>Confluence 和 Google Docs 擅長內嵌評論：</p><pre class="language-none"><code class="language-none">📄 Confluence 頁面：
  「重新啟動資料庫服務」
  └─ 💬 評論：「哪個服務？postgresql.service 還是 mysqld.service？」
  └─ 💬 評論：「這個步驟在 staging 中對我失敗了」
  └─ 💬 評論：「在 PR #452 中更新了命令」</code></pre><p>Markdown 檔案沒有原生的內嵌評論。</p><p><strong>解決方案：</strong></p><table><thead><tr><th>方法</th><th>運作方式</th><th>限制</th></tr></thead><tbody><tr><td><strong>拉取請求評論</strong></td><td>在審查期間評論特定行</td><td>僅在 PR 期間可見，最終文件中不可見</td></tr><tr><td><strong>GitHub/GitLab Issues</strong></td><td>將 issue 連結到文件部分</td><td>需要在系統之間切換</td></tr><tr><td><strong>HTML 註釋</strong></td><td>在 Markdown 中新增評論區塊</td><td>使原始碼雜亂，且不會顯示在渲染結果中</td></tr><tr><td><strong>外部工具</strong></td><td>像 GitBook、ReadMe 這樣的工具新增評論</td><td>重新引入廠商依賴</td></tr></tbody></table><hr /><h3 id="挑戰-3：視覺協作">挑戰 3：視覺協作</h3><p><strong>問題：</strong></p><p>有些團隊高度依賴視覺協作：</p><pre class="language-none"><code class="language-none">🎨 Google Docs：
  - 高亮文字 → 新增評論 → 分配給人員
  - 看到其他人即時編輯的游標
  - 建議模式 → 視覺化接受&#x2F;拒絕變更</code></pre><p>Git 本質上是<strong>非同步的</strong>。即時協作不是它的強項。</p><p><strong>什麼時候重要：</strong></p><table><thead><tr><th>情境</th><th>DaC 
適合度</th><th>更好的替代方案</th></tr></thead><tbody><tr><td>技術操作手冊</td><td>✅ 優秀</td><td>—</td></tr><tr><td>API 文件</td><td>✅ 優秀</td><td>—</td></tr><tr><td>策略文件</td><td>⚠️ 中等</td><td>Google Docs（草稿）→ DaC（最終）</td></tr><tr><td>行銷內容</td><td>❌ 差</td><td>Google Docs、Notion</td></tr><tr><td>頭腦風暴會議</td><td>❌ 差</td><td>白板、Miro、FigJam</td></tr></tbody></table><hr /><h3 id="挑戰-4：「我在哪裡編輯？」問題">挑戰 4：「我在哪裡編輯？」問題</h3><p><strong>問題：</strong></p><p>新貢獻者面臨摩擦：</p><pre class="language-none"><code class="language-none">👤 新團隊成員：「我發現操作手冊中有錯字。我如何修復它？」傳統：  - 點擊「編輯」按鈕 → 輸入 → 儲存 → 完成文件即程式碼：  - 克隆儲存庫（或導航到網頁……）  - 建立分支（或在網頁上編輯）  - 進行變更  - 寫提交訊息  - 建立拉取請求  - 等待審查  - 合併（或請求合併）</code></pre><p><strong>摩擦稅：</strong> 與 wiki 式編輯相比，每次編輯需要約 5-10 個額外步驟。</p><p><strong>緩解：</strong></p><p><strong>範例：貢獻者的簡單指南</strong></p><pre class="language-markdown" data-language="markdown"><code class="language-markdown"><span class="token title important"><span class="token punctuation">#</span> .github/CONTRIBUTING.md</span><span class="token title important"><span class="token punctuation">##</span> 如何更新文件</span><span class="token title important"><span class="token punctuation">###</span> 快速修正（錯字，小變更）</span><span class="token list punctuation">1.</span> 在 GitHub 上導航到檔案<span class="token list punctuation">2.</span> 點擊 ✏️ 鉛筆圖示<span class="token list punctuation">3.</span> 進行您的變更<span class="token list punctuation">4.</span> 寫簡短描述<span class="token list punctuation">5.</span> 點擊「建議變更」<span class="token list punctuation">6.</span> 完成！我們將審查並合併。<span class="token title important"><span class="token punctuation">###</span> 較大的變更</span><span class="token list punctuation">1.</span> Fork 儲存庫<span class="token list punctuation">2.</span> 建立分支：<span class="token code-snippet code keyword">`git checkout -b fix/my-change`</span><span class="token list punctuation">3.</span> 編輯檔案<span class="token list punctuation">4.</span> 提交：<span class="token code-snippet code keyword">`git commit -m "fix：描述您的變更"`</span><span class="token list 
punctuation">5.</span> 推送：<span class="token code-snippet code keyword">`git push origin fix/my-change`</span><span class="token list punctuation">6.</span> 開啟拉取請求</code></pre><p>明確的指導顯著減少摩擦。</p><p>承認挑戰後，讓我們探討採用文件即程式碼的團隊的實際策略。</p><hr /><h2 id="7-讓-DaC-運作：實用指南">7 讓 DaC 運作：實用指南</h2><h3 id="從小處開始，快速獲勝">從小處開始，快速獲勝</h3><p><strong>第 1-2 週：試點專案</strong></p><pre class="language-none"><code class="language-none">📁 docs&#x2F;  └── runbooks&#x2F;      ├── incident-response.md      ├── database-failover.md      └── deployment-procedure.md</code></pre><p>選擇<strong>一個高價值、技術受眾</strong>（例如，值班工程師）。獲得他們的支持。讓他們親身體驗離線優勢。</p><hr /><h3 id="建立橋樑，而不是牆">建立橋樑，而不是牆</h3><p><strong>不要：</strong> 「Confluence 現在對我們來說死了。」</p><p><strong>要：</strong> 「讓我們在遷移期間同時執行兩者。」</p><p><strong>範例：在過渡期間自動同步到 Confluence</strong></p><pre class="language-yaml" data-language="yaml"><code class="language-yaml"><span class="token comment"># CI/CD 同步 DaC → Confluence（臨時）</span><span class="token punctuation">-</span> <span class="token key atrule">name</span><span class="token punctuation">:</span> Publish to Confluence  <span class="token key atrule">if</span><span class="token punctuation">:</span> github.ref == 'refs/heads/main'  <span class="token key atrule">uses</span><span class="token punctuation">:</span> confluence<span class="token punctuation">-</span>publisher@v1  <span class="token key atrule">with</span><span class="token punctuation">:</span>    <span class="token key atrule">space</span><span class="token punctuation">:</span> OPS    <span class="token key atrule">parent</span><span class="token punctuation">:</span> <span class="token string">"Operations Runbooks"</span></code></pre><p><strong>這的作用：</strong> 每當文件更新時，它會自動發布副本到 Confluence。這讓團隊可以繼續使用 Confluence，同時逐漸採用文件即程式碼——沒有突然的中斷。</p><p>這給利害關係人時間適應，同時證明 DaC 的價值。</p><hr /><h3 id="投資工具">投資工具</h3><p><strong>基本堆疊：</strong></p><table><thead><tr><th>工具</th><th>目的</th><th>成本</th></tr></thead><tbody><tr><td><strong>VS Code + Markdown All in 
One</strong></td><td>作者體驗</td><td>免費</td></tr><tr><td><strong>MkDocs + Material Theme</strong></td><td>靜態網站產生</td><td>免費</td></tr><tr><td><strong>GitHub Actions / GitLab CI</strong></td><td>建構與部署管線</td><td>免費-$</td></tr><tr><td><strong>pandoc</strong></td><td>格式轉換（PDF、Word）</td><td>免費</td></tr><tr><td><strong>gitleaks</strong></td><td>機密掃描</td><td>免費</td></tr></tbody></table><p><strong>可有可無：</strong></p><table><thead><tr><th>工具</th><th>目的</th><th>成本</th></tr></thead><tbody><tr><td><strong>GitLens</strong></td><td>視覺化 Git 歷史記錄</td><td>免費-$</td></tr><tr><td><strong>Markdownlint</strong></td><td>樣式執行</td><td>免費</td></tr><tr><td><strong>Vale</strong></td><td>語法和樣式檢查</td><td>免費</td></tr></tbody></table><hr /><h3 id="定義工作流程">定義工作流程</h3><p><strong>給工程師：</strong></p><p><strong>實際情況：</strong></p><pre class="language-bash" data-language="bash"><code class="language-bash"><span class="token comment"># 1. 為您的變更建立新分支</span><span class="token function">git</span> checkout <span class="token parameter variable">-b</span> docs/update-failover-procedure<span class="token comment"># 2. 開啟並編輯文件檔案</span>code docs/runbooks/database-failover.md<span class="token comment"># 3. 在瀏覽器中預覽外觀</span>mkdocs serve  <span class="token comment"># 開啟在 http://localhost:8000</span><span class="token comment"># 4. 將您的變更儲存到版本控制</span><span class="token function">git</span> <span class="token function">add</span> docs/runbooks/database-failover.md<span class="token function">git</span> commit <span class="token parameter variable">-m</span> <span class="token string">"docs：更新 PostgreSQL 16 的故障轉移步驟"</span><span class="token function">git</span> push origin docs/update-failover-procedure<span class="token comment"># 5. 
請求團隊審查</span>gh <span class="token function">pr</span> create <span class="token parameter variable">--title</span> <span class="token string">"docs：更新故障轉移步驟"</span> <span class="token parameter variable">--body</span> <span class="token string">"更新 PG16 相容性"</span></code></pre><p><strong>翻譯：</strong> 每個命令做一件事——建立工作區、編輯檔案、預覽、儲存它，並請求隊友審查。bash 符號如 <code>#</code> 只是解釋每個步驟作用的註釋。</p><p><strong>給非工程師：</strong></p><pre class="language-none"><code class="language-none">1. 在 GitHub&#x2F;GitLab 上導航到檔案2. 點擊「編輯」（鉛筆圖示）3. 進行您的變更4. 寫下變更的簡短描述5. 點擊「建議變更」6. 團隊成員將審查並合併</code></pre><hr /><h3 id="衡量成功">衡量成功</h3><p>追蹤這些指標：</p><table><thead><tr><th>指標</th><th>DaC 之前</th><th>DaC 之後</th><th>目標</th></tr></thead><tbody><tr><td>尋找操作手冊的時間</td><td>5-10 分鐘</td><td>&lt; 1 分鐘</td><td>&lt; 30 秒</td></tr><tr><td>文件新穎度</td><td>數月過時</td><td>每次事故更新</td><td>同日</td></tr><tr><td>離線可存取性</td><td>❌ 否</td><td>✅ 是</td><td>✅ 是</td></tr><tr><td>安全掃描覆蓋率</td><td>0%</td><td>100%</td><td>100%</td></tr><tr><td>貢獻者數量</td><td>3-5 個「負責人」</td><td>10-15 個團隊成員</td><td>20+</td></tr></tbody></table><p>現在我們有了實際實施策略，讓我們檢查不同類型組織的戰略影響。</p><hr /><h2 id="8-戰略視角：誰應該採用-DaC？">8 戰略視角：誰應該採用 DaC？</h2><h3 id="完美適合-✅">完美適合 ✅</h3><table><thead><tr><th>組織類型</th><th>為什麼</th></tr></thead><tbody><tr><td><strong>DevOps/SRE 團隊</strong></td><td>已經使用 Git，重視離線存取</td></tr><tr><td><strong>安全意識強的</strong></td><td>需要審核追蹤、機密掃描</td></tr><tr><td><strong>受監管的產業</strong></td><td>合規需要版本控制</td></tr><tr><td><strong>分散式團隊</strong></td><td>跨時區非同步協作</td></tr><tr><td><strong>氣隙環境</strong></td><td>離線存取是強制性的</td></tr></tbody></table><h3 id="需要培訓的良好適合-⚠️">需要培訓的良好適合 ⚠️</h3><table><thead><tr><th>組織類型</th><th>考慮因素</th></tr></thead><tbody><tr><td><strong>傳統 IT 營運</strong></td><td>投資 Git 培訓，從試點團隊開始</td></tr><tr><td><strong>混合技術/非技術</strong></td><td>混合工作流程（Google Docs → DaC 轉換）</td></tr><tr><td><strong>Confluence 重度使用者</strong></td><td>在遷移期間並行執行</td></tr></tbody></table><h3 id="不適合-❌">不適合 
❌</h3><table><thead><tr><th>組織類型</th><th>為什麼</th></tr></thead><tbody><tr><td><strong>行銷優先文件</strong></td><td>視覺協作是核心需求</td></tr><tr><td><strong>沒有 Git 經驗 + 沒有培訓預算</strong></td><td>摩擦將扼殺採用</td></tr><tr><td><strong>已長期承諾 SaaS Wiki</strong></td><td>遷移成本可能不合理</td></tr></tbody></table><hr /><h2 id="總結：文件的十字路口">總結：文件的十字路口</h2><pre class="language-MERMAID_BASE64_631" data-language="MERMAID_BASE64_631"><code class="language-MERMAID_BASE64_631">Zmxvd2NoYXJ0IExSCiAgICBBW+S7iuaXpeaWh+S7tl0gLS0+IEJ76YG45pOH5oKo55qE6Lev5b6RfQogICAgQiAtLT4gQ1tDb25mbHVlbmNlIENsb3VkIDIwMjldCiAgICBCIC0tPiBEW+aWh+S7tuWNs+eoi+W8j+eivF0KICAgIEIgLS0+IEVb54++54uAXQoKICAgIEMgLS0+IEZbU2FhUyDkvp3os7Q8YnIvPuaMgee6jOaIkOacrDxici8+6ZyA6KaB55m75YWlXQogICAgRCAtLT4gR1vmk4HmnInmgqjnmoTos4fmlpk8YnIvPumboue3muWtmOWPljxici8+5a6J5YWo5o6D5o+PXQogICAgRSAtLT4gSFvmioDooZPlgrXli5k8YnIvPuefpeitmOa1geWksTxici8+MjAyOSDljbHmqZ9dCgogICAgc3R5bGUgRCBmaWxsOiNlOGY1ZTksc3Ryb2tlOiMzODhlM2MKICAgIHN0eWxlIEMgZmlsbDojZmZmM2UwLHN0cm9rZTojZjU3YzAwCiAgICBzdHlsZSBFIGZpbGw6I2ZmZWJlZSxzdHJva2U6I2M2MjgyOA&#x3D;&#x3D;</code></pre><p><strong>選擇：</strong></p><table><thead><tr><th>路徑</th><th>2026</th><th>2027</th><th>2028</th><th>2029</th></tr></thead><tbody><tr><td><strong>文件即程式碼</strong></td><td>試點與學習</td><td>擴展採用</td><td>成熟工作流程</td><td>競爭優勢</td></tr><tr><td><strong>Confluence Cloud</strong></td><td>遷移</td><td>定價增加</td><td>依賴加深</td><td>鎖定</td></tr><tr><td><strong>現狀</strong></td><td>舒適</td><td>增長痛苦</td><td>緊急問題</td><td>危機模式</td></tr></tbody></table><hr /><p><strong>結論：</strong></p><p>文件即程式碼不是關於 Markdown。是關於<strong>所有權</strong>。</p><p>當您的文件存放在 Git 中：</p><ul><li>📖 <strong>您擁有資料</strong> — 沒有廠商能挾持它</li><li>🔓 <strong>離線運作</strong> — 關鍵時刻網路故障時</li><li>🔍 <strong>可掃描</strong> — 內建安全和合規</li><li>📦 <strong>可匯出到任何地方</strong> — PDF、Word、Confluence（如果您必須）</li><li>📜 <strong>有歷史記錄</strong> — 每次變更被追蹤、可還原、可審核</li><li>🔒 <strong>防篡改</strong> — 密碼學保護，像區塊鏈</li><li>🤖 <strong>AI 就緒</strong> — 最低代幣成本，命令列友好，Mermaid 
圖表</li></ul><p>挑戰是真實的——非技術協作、內嵌評論、視覺工作流程。但這些是<strong>可解決的問題</strong>，不是根本缺陷。</p><p>而在 2029 年，當 Confluence 本機版終止，您的合規團隊問*「我們的文件在哪裡？」*——您會想要一個不涉及恐慌遷移的答案。</p><p><strong>從小處開始。選擇一個操作手冊。克隆一個儲存庫。親身體驗離線優勢。</strong></p><p>因為種樹最好的時間是 20 年前。第二好的時間是在廠商關閉您的伺服器之前。</p><hr /><h2 id="進一步閱讀">進一步閱讀</h2><ul><li><a href="https://docs.github.com/">GitHub Docs：Documentation as Code</a></li><li><a href="https://www.atlassian.com/licensing/data-center-end-of-life#data-center-eol-general-questions">Atlassian Confluence End-of-Life Announcement</a></li><li><a href="https://github.com/gitleaks/gitleaks">gitleaks：Secret Scanning Tool</a></li><li><a href="https://mermaid.js.org/">Mermaid：Diagrams and Flowcharts in Markdown</a></li></ul><!-- commentbox plugin begins -->    <div class="commentbox"></div>    <script src="https://unpkg.com/commentbox.io/dist/commentBox.min.js"></script>    <script>commentBox('5765834504929280-proj')</script>    <!-- commentbox plugin ends -->    ]]></content>
    
    
    <summary type="html">Confluence 本機版將於 2029 年終止。Word 文件藏在沒人找得到的共用硬碟。您的操作手冊被困在付費牆和登入畫面後。這就是為什麼「文件即程式碼」（基於 Git 和 Markdown）是您的逃生路線，以及為什麼時間正在倒數。</summary>
    
    
    
    <category term="Misc" scheme="https://neo01.com/categories/Misc/"/>
    
    
    <category term="GitOps" scheme="https://neo01.com/tags/GitOps/"/>
    
    <category term="Documentation" scheme="https://neo01.com/tags/Documentation/"/>
    
  </entry>
  
  <entry>
    <title>Your Documentation Is a Time Bomb: Why 2029 Will Force Your Hand (And Why You&#39;ll Thank Us)</title>
    <link href="https://neo01.com/2026/04/Document-as-Code-Why-Markdown-Git-Confluence-Word/"/>
    <id>https://neo01.com/2026/04/Document-as-Code-Why-Markdown-Git-Confluence-Word/</id>
    <published>2026-03-31T16:00:00.000Z</published>
    <updated>2026-04-01T14:54:43.650Z</updated>
    
    <content type="html"><![CDATA[<p>Picture this: It’s 3 AM. Your production system is on fire. The on-call engineer grabs the laptop, opens the runbook… and hits a login screen. No internet. No VPN. No access.</p><p>Meanwhile, the “final” version of the procedure lives in a Word document on someone’s desktop from 2023. The Confluence page? It’s out of date. The wiki? Nobody knows the password.</p><p>This isn’t a hypothetical nightmare. It’s Tuesday for thousands of teams still treating documentation like a <strong>destination</strong> instead of a <strong>deliverable</strong>.</p><p>Here’s the uncomfortable truth: <strong>Your documentation strategy is a single point of failure.</strong> And in 2029, when Atlassian shuts down Confluence on-premises forever, that failure becomes mandatory.</p><p>But there’s a way out. It’s called <strong>Document as Code</strong> (DaC). And no, it’s not just “writing in Markdown.” It’s a fundamental shift in how teams think about knowledge.</p><hr /><h2 id="1-The-Problem-Documentation-Graveyards">1 The Problem: Documentation Graveyards</h2><p>Let’s name the elephants in the room.</p><h3 id="The-Word-Document-Cemetery">The Word Document Cemetery</h3><pre class="language-none"><code class="language-none">📁 Shared Drive&#x2F;  📁 Operations&#x2F;    📄 Runbook_FINAL.docx    📄 Runbook_FINAL_v2.docx    📄 Runbook_FINAL_v2_UPDATED.docx    📄 Runbook_FINAL_v3_ACTUAL_FINAL.docx    📄 Runbook_FINAL_v3_ACTUAL_FINAL_REALLY.docx</code></pre><p><strong>The Reality:</strong></p><ul><li>Nobody knows which version is authoritative</li><li>Changes require “Track Changes” and email chains</li><li>Search? Good luck with that</li><li>Access control? Either everyone or nobody</li></ul><h3 id="The-Confluence-Trap">The Confluence Trap</h3><p>Confluence promised organized, searchable knowledge. 
What it delivered:</p><table><thead><tr><th>Problem</th><th>Impact</th></tr></thead><tbody><tr><td><strong>Vendor lock-in</strong></td><td>Your knowledge lives in a proprietary format</td></tr><tr><td><strong>Login required</strong></td><td>Can’t read docs without network + credentials</td></tr><tr><td><strong>Search is… optimistic</strong></td><td>Finding the right page feels like archaeology</td></tr><tr><td><strong>2029 Sunset</strong></td><td>Support for the on-premises version ends. SaaS or nothing.</td></tr></tbody></table><div class="admonition warning"><p class="admonition-title"><span class="mdi mdi-alert-outline admonition-icon"></span>⚠️ The 2029 Deadline</p><div class="admonition-content"><p>Atlassian has announced that <strong>Confluence Data Center reaches end-of-life on March 28, 2029</strong>. After that:</p><ul><li>Licenses expire and environments become <strong>read-only</strong></li><li>No security patches or bug fixes</li><li>No technical support</li><li><strong>You can view data but cannot edit or add new content</strong></li><li>Running in read-only mode while connected to the internet is strongly discouraged (no security updates)</li></ul><p><strong>Timeline:</strong></p><ul><li><strong>March 30, 2026</strong>: New customers can no longer purchase Data Center subscriptions</li><li><strong>March 30, 2028</strong>: Existing customers can no longer purchase new subscriptions or expand existing ones</li><li><strong>March 28, 2029</strong>: All Data Center subscriptions expire</li></ul><p>For enterprises with compliance or data-sovereignty requirements, or with air-gapped environments, this isn't an upgrade. It's an ultimatum. 
Extended maintenance may be available by exception, but requires direct negotiation with Atlassian.</p></div></div><h3 id="The-Wiki-Wild-West">The Wiki Wild West</h3><p>Wikis started as “everyone can edit!” and became “nobody owns this.”</p><pre class="language-none"><code class="language-none">🌐 Internal Wiki  ├── 📄 Getting Started (last updated: 2021)  ├── 📄 Architecture Overview (broken images)  ├── 📄 On-Call Procedures (password: ???)  └── 📄 [404 Page Not Found]</code></pre><p><strong>The Pattern:</strong> All three approaches share the same fatal flaw—<strong>documentation is separate from the work</strong>.</p><hr /><h2 id="2-What-Is-Document-as-Code">2 What Is Document as Code?</h2><p><strong>Document as Code</strong> treats documentation like software:</p><table><thead><tr><th>Software Development</th><th>Document as Code</th></tr></thead><tbody><tr><td>Code in Git</td><td>Docs in Git</td></tr><tr><td>Pull requests for changes</td><td>Pull requests for edits</td></tr><tr><td>Code review</td><td>Content review</td></tr><tr><td>CI/CD pipelines</td><td>Build &amp; deploy pipelines</td></tr><tr><td>Version tags</td><td>Release versions</td></tr><tr><td>Rollback capability</td><td>Full history, instant revert</td></tr></tbody></table><p>But here’s what makes DaC different from “just using Markdown”:</p><h3 id="It’s-Not-Just-Markdown-It’s-Git">It’s Not Just Markdown. It’s Git.</h3><pre class="language-none"><code class="language-none">❌ &quot;We use Markdown&quot; → Files on a shared drive✅ &quot;We use Document as Code&quot; → Git-based workflow with version control</code></pre><h3 id="What-Is-Git-For-Non-Technical-Readers">What Is Git? (For Non-Technical Readers)</h3><p><strong>Git</strong> is a tool that tracks changes to files over time. Think of it like a <strong>time machine for documents</strong>.</p><p>Every time you save a change, Git takes a snapshot. You can travel back to any snapshot later—yesterday’s version, last week’s, even from a year ago. 
Nothing is ever lost.</p><h3 id="Why-Git-Was-Built">Why Git Was Built</h3><pre class="language-none"><code class="language-none">Problem (before Git):  👤 Person A: &quot;I&#39;m editing the file!&quot;  👤 Person B: &quot;Me too!&quot;  → Both save → One person&#39;s changes are lost 😱Solution (with Git):  👤 Person A: &quot;I&#39;m editing on my own copy&quot;  👤 Person B: &quot;Me too on my own copy&quot;  → Both finish → Git combines changes safely ✅</code></pre><h3 id="Real-World-Analogy">Real-World Analogy</h3><p><strong>Google Docs version history</strong> works similarly—every save is recorded with who changed what. Git does this too, but with three key differences:</p><ol><li><strong>Works offline</strong> — No internet needed</li><li><strong>Full copy on your computer</strong> — Every version, ever</li><li><strong>No vendor lock-in</strong> — Your data stays yours</li></ol><h3 id="What-This-Means-for-You">What This Means for You</h3><ul><li><strong>No internet?</strong> No problem. Everything is local.</li><li><strong>Server goes down?</strong> You have a full backup.</li><li><strong>Vendor disappears?</strong> Your data is yours.</li><li><strong>Made a mistake?</strong> Instant undo to any point in time.</li></ul><h3 id="Is-Git-Hard-to-Learn">Is Git Hard to Learn?</h3><p>For developers, Git is daily work—they already know it.</p><p>For non-technical users, you don’t need to learn Git commands. Modern tools (GitHub Web UI, AI assistants) handle the complexity. You just edit—Git works in the background.</p><h3 id="The-Security-Superpower-Git-Is-Like-a-Blockchain-for-Documents">The Security Superpower: Git Is Like a Blockchain for Documents</h3><p>Here’s the powerful part: <strong>Git uses cryptography to make history tamper-proof</strong>.</p><p>Every change gets a unique fingerprint (called a “hash”). 
This fingerprint is calculated from:</p><ul><li>The content you changed</li><li>The fingerprint of the previous change</li><li>Who made the change and when</li></ul><p>This creates a <strong>chain of fingerprints</strong>—just like blockchain. If someone tries to alter history (say, delete evidence of who approved a change), the fingerprints no longer match. The tampering is <strong>instantly detectable</strong>.</p><p>Confluence and Word can’t do this. Their logs can be modified by admins. Git’s history <strong>cannot be silently changed</strong>.</p><h3 id="Commit-Signing-Digital-Signatures-for-Changes">Commit Signing: Digital Signatures for Changes</h3><p>Git has an even stronger security feature: <strong>commit signing</strong>.</p><p>Each contributor has a <strong>personal signing key</strong> (GPG, SSH, or an X.509 certificate; think of it as a digital ID card). When you commit a change, Git signs it with your private key. The signature proves: <em>“This change came from me, and I approve it.”</em></p><p><strong>Real-World Analogy:</strong></p><pre class="language-none"><code class="language-none">Unsigned Git commit:  👤 &quot;John approved this change&quot;  → You trust the system recorded this correctlySigned Git commit:  👤 &quot;John approved this change&quot; ✍️ [digitally signed]  → Cryptographically proven John approved it  → John&#39;s public key validates the signature  → Cannot be faked without John&#39;s private key</code></pre><p><strong>Why signing matters:</strong></p><ul><li><strong>Prevents impersonation</strong> — Nobody can pretend to be you</li><li><strong>Stronger evidence</strong> — A signed commit is much harder to dispute than an unsigned log entry (whether it carries legal weight depends on your jurisdiction)</li><li><strong>Supply chain security</strong> — Know exactly who approved each change</li><li><strong>Compliance</strong> — Required for some regulated industries</li></ul><p><strong>What you see in practice:</strong></p><pre class="language-none"><code class="language-none">✅ Verified commit abc123 by John Doe (john@company.com)⚠️ Unverified commit 
def456 by unknown@example.com</code></pre><p>GitHub and GitLab show a green “Verified” badge on signed commits. If someone tries to fake a commit from you, the signature won’t match—instantly exposed.</p><h3 id="The-Git-Difference">The Git Difference</h3><p><strong>Plain Markdown Files (without Git):</strong></p><ul><li>Version history: File timestamps only</li><li>Collaboration: Overwrite conflicts</li><li>Offline access: Yes (local files)</li><li>Audit trail: Manual logging</li><li>Rollback: “Does anyone have the old version?”</li><li>Distributed: Centralized file server</li></ul><p><strong>Document as Code with Git:</strong></p><ul><li>Version history: Every change tracked, who/when/why</li><li>Collaboration: Branches, merge, resolve conflicts</li><li>Offline access: Yes (full repo clone)</li><li>Audit trail: Immutable commit history (cryptographically secured)</li><li>Rollback: <code>git revert</code> — instant recovery</li><li>Distributed: Every clone is a complete backup</li></ul><h3 id="The-Key-Insight">The Key Insight</h3><p>Git is <strong>decentralized by design</strong>. Every developer has a complete copy of the documentation repository. This means:</p><ul><li>No single point of failure</li><li>Works offline (crucial for sandboxed/air-gapped environments)</li><li>No login required to read</li><li>No vendor can hold your knowledge hostage</li></ul><p>Now that we understand the foundation, let’s explore why this architecture matters when systems fail.</p><hr /><h2 id="3-The-Runbook-Test-What-Happens-at-3-AM">3 The Runbook Test: What Happens at 3 AM?</h2><p>Let’s replay our opening scenario with Document as Code.</p><p><strong>3 AM. Production incident. 
No internet access (sandboxed environment).</strong></p><h3 id="With-Traditional-Documentation">With Traditional Documentation:</h3><pre class="language-none"><code class="language-none">Engineer: &quot;Let me check the runbook...&quot;  ↓Opens browser → Confluence login → No network  ↓Calls teammate → &quot;What&#39;s the wiki password?&quot;  ↓Teammate: &quot;I think it&#39;s in LastPass...&quot;  ↓LastPass → No network → Can&#39;t sync  ↓[Incident escalates while hunting for credentials]</code></pre><p><strong>Time to resolution:</strong> 45 minutes (including 38 minutes finding docs)</p><h3 id="With-Document-as-Code">With Document as Code:</h3><pre class="language-none"><code class="language-none">Engineer: &quot;Let me check the runbook...&quot;  ↓Opens terminal → &#96;cd runbooks&#96; → Already cloned locally  ↓&#96;grep &quot;database failover&quot; *.md&#96; → Instant search  ↓Follows procedure → System recovered  ↓Commits incident notes → &#96;git commit -m &quot;Incident #2026-0329&quot;&#96;</code></pre><p><strong>Time to resolution:</strong> 7 minutes (all of it fixing the problem)</p><div class="admonition question"><p class="admonition-title"><span class="mdi mdi-comment-question-outline admonition-icon"></span>🤔 Why Does Offline Matter?</p><div class="admonition-content"><p>You might think: <em>&quot;We always have internet. This won't happen to us.&quot;</em></p><p>Consider these scenarios:</p><ul><li><strong>Security incidents</strong> → Network access restricted during investigation</li><li><strong>Cloud outages</strong> → Your docs are in the cloud... 
that's down</li><li><strong>Air-gapped environments</strong> → Government, finance, healthcare sandboxes</li><li><strong>Travel</strong> → Airplane mode, poor hotel WiFi, international roaming</li><li><strong>Disaster recovery</strong> → When everything is broken, including internet</li></ul><p><strong>The principle:</strong> Critical documentation should work when you need it most—not when conditions are ideal.</p></div></div><p>Having seen how DaC performs under pressure, let’s examine the practical advantages that make it worth adopting.</p><hr /><h2 id="4-Why-Document-as-Code-Wins">4 Why Document as Code Wins</h2><h3 id="1-Export-to-Any-Format">1. Export to Any Format</h3><p>Your stakeholders want Word? PDF? Confluence? No problem.</p><p><strong>Here’s how the automation works:</strong></p><pre class="language-none"><code class="language-none">You save changes to Git        ↓Automation detects the change        ↓Builds PDF version        ↓Builds Word version        ↓Updates website        ↓(Optionally) Syncs to Confluence        ↓Done — all formats updated automatically</code></pre><p><strong>What this replaces:</strong></p><table><thead><tr><th>Manual Process</th><th>Automated Process</th></tr></thead><tbody><tr><td>Open document → Export as PDF → Save</td><td>One save triggers everything</td></tr><tr><td>Open document → Export as Word → Email</td><td>Files generated and stored automatically</td></tr><tr><td>Log into Confluence → Copy/paste → Publish</td><td>Sync happens in background</td></tr><tr><td>Repeat for every change</td><td>Runs consistently, no forgetting</td></tr></tbody></table><p><strong>The actual automation configuration looks like this:</strong></p><pre class="language-yaml" data-language="yaml"><code class="language-yaml"><span class="token comment"># CI/CD pipeline builds multiple formats</span><span class="token key atrule">on</span><span class="token punctuation">:</span>  <span class="token key atrule">push</span><span class="token 
punctuation">:</span>    <span class="token key atrule">branches</span><span class="token punctuation">:</span> <span class="token punctuation">[</span>main<span class="token punctuation">]</span><span class="token key atrule">jobs</span><span class="token punctuation">:</span>  <span class="token key atrule">build-docs</span><span class="token punctuation">:</span>    <span class="token key atrule">runs-on</span><span class="token punctuation">:</span> ubuntu<span class="token punctuation">-</span>latest    <span class="token key atrule">steps</span><span class="token punctuation">:</span>      <span class="token punctuation">-</span> <span class="token key atrule">uses</span><span class="token punctuation">:</span> actions/checkout@v4            <span class="token comment"># Convert to PDF (needs the LaTeX-enabled pandoc image; pandoc/core has no PDF engine)</span>      <span class="token punctuation">-</span> <span class="token key atrule">name</span><span class="token punctuation">:</span> Build PDF        <span class="token key atrule">uses</span><span class="token punctuation">:</span> docker<span class="token punctuation">:</span>//pandoc/latex        <span class="token key atrule">with</span><span class="token punctuation">:</span>          <span class="token key atrule">args</span><span class="token punctuation">:</span> runbook.md <span class="token punctuation">-</span>o runbook.pdf            <span class="token comment"># Convert to Word (for stakeholders who insist)</span>      <span class="token punctuation">-</span> <span class="token key atrule">name</span><span class="token punctuation">:</span> Build Word        <span class="token key atrule">uses</span><span class="token punctuation">:</span> docker<span class="token punctuation">:</span>//pandoc/core        <span class="token key atrule">with</span><span class="token punctuation">:</span>          <span class="token key atrule">args</span><span class="token punctuation">:</span> runbook.md <span class="token punctuation">-</span>o runbook.docx            <span 
class="token comment"># Sync to Confluence (for teams not ready to quit)</span>      <span class="token punctuation">-</span> <span class="token key atrule">name</span><span class="token punctuation">:</span> Sync to Confluence        <span class="token key atrule">uses</span><span class="token punctuation">:</span> docker<span class="token punctuation">:</span>//confluence<span class="token punctuation">-</span>publisher        <span class="token key atrule">with</span><span class="token punctuation">:</span>          <span class="token key atrule">args</span><span class="token punctuation">:</span> <span class="token punctuation">-</span><span class="token punctuation">-</span>source ./docs <span class="token punctuation">-</span><span class="token punctuation">-</span>space OPS</code></pre><p><strong>What Is CI/CD?</strong> It’s automation that runs whenever your documentation changes. Think of it like a robot assistant: you save your changes to Git, and it automatically builds the PDF, Word document, and website—no manual steps required.</p><p><strong>The Magic:</strong> Write once (Markdown), publish everywhere (PDF, Word, HTML, Confluence).</p><table><thead><tr><th>Format</th><th>Use Case</th></tr></thead><tbody><tr><td><strong>Markdown (source)</strong></td><td>Authors, version control, diffing</td></tr><tr><td><strong>PDF</strong></td><td>Formal reports, compliance submissions</td></tr><tr><td><strong>Word</strong></td><td>Stakeholders who need Track Changes</td></tr><tr><td><strong>HTML</strong></td><td>Internal documentation site</td></tr><tr><td><strong>Confluence</strong></td><td>Teams still migrating (temporary bridge)</td></tr></tbody></table><hr /><h3 id="2-Security-That-Scans">2. Security That Scans</h3><p>Traditional docs are <strong>security blind spots</strong>:</p><pre class="language-none"><code class="language-none">🔍 Security Team: &quot;Can we scan the Word documents for secrets?&quot;👤 IT Admin: &quot;They&#39;re on a file server. 
We&#39;d need to...&quot;🔍 Security Team: &quot;What about Confluence?&quot;👤 IT Admin: &quot;That&#39;s SaaS. You&#39;d need API access and...&quot;🔍 Security Team: *sighs*</code></pre><p>Document as Code is <strong>security-transparent</strong>:</p><p><strong>Example: Scanning for accidental secrets</strong></p><pre class="language-bash" data-language="bash"><code class="language-bash"><span class="token comment"># Scan all documentation for leaked passwords or API keys</span>$ gitleaks detect <span class="token parameter variable">--source</span> ./docs --report-path secrets.json<span class="token comment"># Search for common sensitive patterns</span>$ <span class="token function">grep</span> <span class="token parameter variable">-r</span> <span class="token string">"password\|api_key\|secret"</span> ./docs/*.md<span class="token comment"># Run checks automatically before saving changes</span>$ pre-commit run --all-files</code></pre><p><strong>What these commands do:</strong> The first command runs a security scanner that looks for passwords, API keys, and tokens. The second searches for common sensitive words. 
The third runs every configured pre-commit hook against all files. Once the hooks are installed with <code>pre-commit install</code>, the same checks run automatically on every commit, blocking anything suspicious before it’s stored.</p><p><strong>What Gets Caught:</strong></p><table><thead><tr><th>Risk</th><th>Detection Method</th></tr></thead><tbody><tr><td>Accidental API keys</td><td>Regex patterns in CI pipeline</td></tr><tr><td>Hardcoded passwords</td><td>Secret scanning tools (gitleaks, truffleHog)</td></tr><tr><td>Outdated credentials</td><td>Automated rotation alerts</td></tr><tr><td>Compliance violations</td><td>Policy-as-code checks</td></tr></tbody></table><div class="admonition tip"><p class="admonition-title"><span class="mdi mdi-lightbulb-on-outline admonition-icon"></span>💡 The Compliance Bonus</p><div class="admonition-content"><p>Auditors love Document as Code because:</p><ul><li><strong>Immutable history</strong> → Who changed what, when (cryptographically secured, like blockchain)</li><li><strong>Tamper-evident</strong> → Altered history is instantly detectable</li><li><strong>Approval workflow</strong> → Pull requests require review</li><li><strong>Automated checks</strong> → Policy violations block merges</li><li><strong>Easy export</strong> → Generate audit reports on demand</li><li><strong>Non-repudiation</strong> → Can't deny changes you made (when commits are signed)</li></ul></div></div><hr /><h3 id="3-The-2029-Confluence-Exodus">3. 
The 2029 Confluence Exodus</h3><p>Let’s address the elephant: <strong>Confluence Data Center becomes read-only on March 28, 2029</strong>.</p><p><strong>Your Options:</strong></p><table><thead><tr><th>Option</th><th>Pros</th><th>Cons</th></tr></thead><tbody><tr><td><strong>Migrate to Confluence Cloud</strong></td><td>Familiar UI, minimal retraining</td><td>☠️ Vendor lock-in deepens, SaaS pricing, data sovereignty concerns</td></tr><tr><td><strong>Migrate to Document as Code</strong></td><td>Own your data, offline access, no vendor risk</td><td>Learning curve for non-technical users</td></tr><tr><td><strong>Migrate to Another Wiki</strong></td><td>Similar UX</td><td>Same fundamental problems (login, search, lock-in)</td></tr><tr><td><strong>Negotiate Extended Maintenance</strong></td><td>Buy more time</td><td>Temporary fix, costly, still need eventual migration</td></tr></tbody></table><p><strong>The Real Cost of Confluence Cloud:</strong></p><pre class="language-none"><code class="language-none">Enterprise (1000 users):  - Confluence Cloud: ~$120,000&#x2F;year  - Required add-ons: ~$30,000&#x2F;year  - Migration services: ~$50,000 (one-time)  - Training: ~$20,000 (one-time)    Total Year 1: ~$220,000  Total over 3 years: ~$520,000</code></pre><p><strong>Document as Code Cost:</strong></p><pre class="language-none"><code class="language-none">Enterprise (1000 users):  - Git hosting (GitHub&#x2F;GitLab): ~$20,000&#x2F;year (often already paid)  - Static site generator: $0 (open source)  - CI&#x2F;CD: $0-$10,000&#x2F;year (often included)  - Training: ~$20,000 (one-time)    Total Year 1: ~$40,000-$50,000  Total over 3 years: ~$80,000-$110,000</code></pre><p><strong>Savings over 3 years: ~$410,000-$440,000</strong> (and you own your data)</p><p>With the financial and strategic advantages clear, let’s look at one more advantage before turning to where Document as Code faces genuine challenges.</p><hr /><h2 id="5-The-AI-Advantage-Why-DaC-Is-Perfect-for-AI-Assistants">5 The AI Advantage: Why DaC Is Perfect for AI Assistants</h2><p>Here’s the plot twist: 
<strong>Document as Code is the most AI-friendly documentation format you can choose.</strong></p><h3 id="Low-Cost-High-Impact">Low Cost, High Impact</h3><p><strong>The Token Economy:</strong></p><p>AI assistants charge by the token (in English, roughly 1 token ≈ ¾ of a word, or about 4 characters). Compare these illustrative figures for the same page in different formats:</p><table><thead><tr><th>Format</th><th>Token Count</th><th>Cost to Process</th></tr></thead><tbody><tr><td><strong>Markdown file</strong></td><td>~500 tokens</td><td>$0.001</td></tr><tr><td><strong>Word document</strong> (with formatting XML)</td><td>~5,000 tokens</td><td>$0.010</td></tr><tr><td><strong>Confluence page</strong> (HTML + metadata)</td><td>~3,000 tokens</td><td>$0.006</td></tr><tr><td><strong>PDF</strong> (binary, needs extraction)</td><td>Variable + extraction cost</td><td>$$$</td></tr></tbody></table><p><strong>Why Markdown Wins:</strong></p><pre class="language-none"><code class="language-none">Markdown:     &quot;# Runbook: Database Failover&quot; → Clean, minimal tokensWord DOCX:    &quot;&lt;w:document&gt;&lt;w:p&gt;&lt;w:r&gt;&lt;w:t&gt;Runbook...&lt;&#x2F;w:t&gt;&lt;&#x2F;w:r&gt;&lt;&#x2F;w:p&gt;...&quot; → XML bloatConfluence:   &quot;&lt;div class&#x3D;&#39;content&#39;&gt;&lt;h1&gt;Runbook&lt;&#x2F;h1&gt;&lt;span data-...&gt;...&quot; → HTML noise</code></pre><p><strong>The Math:</strong> Updating 100 documentation pages with AI, at the per-page costs above:</p><ul><li><strong>Markdown:</strong> ~$0.10 in API costs</li><li><strong>Word/Confluence:</strong> ~$0.60-1.00 in API costs</li><li><strong>Savings:</strong> 80-90% lower AI processing costs</li></ul><hr /><h3 id="Command-Line-AI-Agents-Perfect-Match">Command Line + AI Agents = Perfect Match</h3><p><strong>AI agents love command-line tools.</strong> Here’s why:</p><pre class="language-none"><code class="language-none">🤖 AI Agent: &quot;I&#39;ll update the runbook for PostgreSQL 16&quot;Step 1: Clone repository          → &#96;git clone ...&#96;Step 2: Find relevant files       → &#96;grep -r &quot;PostgreSQL&quot; docs&#x2F;&#96;Step 3: Read 
current content      → &#96;cat docs&#x2F;runbooks&#x2F;db-failover.md&#96;Step 4: Generate updated content  → (AI writes new version)Step 5: Save changes              → &#96;git add &amp;&amp; git commit&#96;Step 6: Create pull request       → &#96;gh pr create ...&#96;✅ Done in 30 seconds</code></pre><p><strong>Why This Works:</strong></p><table><thead><tr><th>Tool Type</th><th>AI Integration</th><th>Example</th></tr></thead><tbody><tr><td><strong>Git commands</strong></td><td>Native text I/O</td><td>AI reads/writes via CLI</td></tr><tr><td><strong>grep/sed/awk</strong></td><td>Simple transformations</td><td>AI finds and updates patterns</td></tr><tr><td><strong>pandoc</strong></td><td>Format conversion</td><td>AI exports to any format</td></tr><tr><td><strong>Static site generators</strong></td><td>Build automation</td><td>AI previews changes locally</td></tr></tbody></table><p><strong>Contrast with Confluence:</strong></p><pre class="language-none"><code class="language-none">🤖 AI Agent: &quot;I&#39;ll update the Confluence page...&quot;Step 1: Authenticate via OAuth    → Token exchange, API keysStep 2: Fetch page via REST API   → HTTP request, rate limitsStep 3: Parse HTML content        → Strip tags, handle encodingStep 4: Generate updated content  → (AI writes new version)Step 5: Convert back to HTML      → Re-add formatting, macrosStep 6: Publish via API           → HTTP POST, handle conflicts❌ 10x more complex, 5x slower, API rate limits</code></pre><hr /><h3 id="Visual-Diagrams-Mermaid-Charts">Visual Diagrams: Mermaid Charts</h3><p><strong>Markdown now supports diagrams natively:</strong></p><pre class="language-markdown" data-language="markdown"><code class="language-markdown">flowchart LR    A[Incident Detected] --> B&#123;Severity?&#125;    B -->|Critical| C[Page On-Call]    B -->|Low| D[Log Ticket]    C --> E[Start Runbook]    D --> F[Monitor]</code></pre><p><strong>Renders as:</strong></p><pre class="language-MERMAID_BASE64_632" 
data-language="MERMAID_BASE64_632"><code class="language-MERMAID_BASE64_632">Zmxvd2NoYXJ0IExSCiAgICBBW0luY2lkZW50IERldGVjdGVkXSAtLT4gQntTZXZlcml0eT99CiAgICBCIC0tPnxDcml0aWNhbHwgQ1tQYWdlIE9uLUNhbGxdCiAgICBCIC0tPnxMb3d8IERbTG9nIFRpY2tldF0KICAgIEMgLS0+IEVbU3RhcnQgUnVuYm9va10KICAgIEQgLS0+IEZbTW9uaXRvcl0&#x3D;</code></pre><p><strong>AI + Mermaid = Instant Diagrams:</strong></p><pre class="language-none"><code class="language-none">👤 User: &quot;Create a flowchart of our deployment process&quot;🤖 AI: *generates Mermaid code*✅ Result: Professional diagram, no design skills needed</code></pre><p><strong>Supported Diagram Types:</strong></p><table><thead><tr><th>Type</th><th>Use Case</th></tr></thead><tbody><tr><td>Flowcharts</td><td>Process documentation</td></tr><tr><td>Sequence diagrams</td><td>API interactions</td></tr><tr><td>Gantt charts</td><td>Project timelines</td></tr><tr><td>Class diagrams</td><td>System architecture</td></tr><tr><td>Mind maps</td><td>Brainstorming</td></tr></tbody></table><hr /><h3 id="The-Learning-Curve-Is-Flattening">The Learning Curve Is Flattening</h3><p><strong>Then (2020):</strong></p><pre class="language-none"><code class="language-none">👤 Non-tech user: &quot;What&#39;s Markdown?&quot;🔧 Engineer: &quot;It&#39;s like plain text with symbols...&quot;👤 Non-tech user: &quot;Where do I edit?&quot;🔧 Engineer: &quot;You need a text editor, or maybe...&quot;😓 Friction: High</code></pre><p><strong>Now (2026):</strong></p><pre class="language-none"><code class="language-none">👤 Non-tech user: &quot;How do I edit?&quot;Option 1: GitHub Web UI (WYSIWYG mode)  - Click edit → See formatted view → SaveOption 2: Notion (exports to Markdown)  - Write visually → Export as .mdOption 3: Google Docs (with Markdown converter)  - Write in Docs → Auto-convert to .mdOption 4: Microsoft Word (Save as Markdown)  - Native support built-in😓 Friction: Low and decreasing</code></pre><p><strong>The Trend:</strong> WYSIWYG editors are <strong>adding Markdown 
support</strong>, not replacing it.</p><table><thead><tr><th>Platform</th><th>Markdown Support</th></tr></thead><tbody><tr><td>GitHub/GitLab</td><td>✅ Native editor with preview</td></tr><tr><td>Notion</td><td>✅ Import/export Markdown</td></tr><tr><td>Obsidian</td><td>✅ Markdown-first knowledge base</td></tr><tr><td>Microsoft Word</td><td>✅ Save as Markdown (2024+)</td></tr><tr><td>Google Docs</td><td>✅ Add-ons for Markdown</td></tr><tr><td>Slack</td><td>✅ Markdown formatting</td></tr><tr><td>Discord</td><td>✅ Markdown formatting</td></tr></tbody></table><hr /><h3 id="AI-Agents-Democratize-Markdown">AI Agents Democratize Markdown</h3><p><strong>The Reality:</strong> Non-technical users don’t need to learn Git commands anymore.</p><pre class="language-none"><code class="language-none">👤 Marketing Manager: &quot;Update the homepage copy&quot;2020 Workflow:  - Learn Git basics  - Clone repository  - Edit file in text editor  - Run git commands  - Open pull request  - Wait for review2026 Workflow:  - Tell AI agent: &quot;Update homepage copy to say X&quot;  - AI creates branch, edits file, opens PR  - Review notification arrives in Slack  - Click &quot;Approve&quot; → Done</code></pre><p><strong>AI Tools That Bridge the Gap:</strong></p><table><thead><tr><th>Tool</th><th>What It Does</th></tr></thead><tbody><tr><td><strong>GitHub Copilot</strong></td><td>Suggests edits, explains Git commands</td></tr><tr><td><strong>Cursor</strong></td><td>AI-powered editor with Git integration</td></tr><tr><td><strong>Claude Code</strong></td><td>Natural language → Git operations</td></tr><tr><td><strong>Warp</strong></td><td>AI terminal that explains commands</td></tr></tbody></table><p><strong>The Bottom Line:</strong> Markdown + Git was once a “developer skill.” With AI agents, it’s becoming a <strong>universal skill</strong>—just like typing.</p><hr /><h2 id="6-The-Honest-Truth-Where-DaC-Struggles">6 The Honest Truth: Where DaC Struggles</h2><p>Document as Code isn’t perfect. 
Here’s where it genuinely falls short:</p><h3 id="Challenge-1-Non-Technical-Collaboration">Challenge 1: Non-Technical Collaboration</h3><p><strong>The Problem:</strong></p><pre class="language-none"><code class="language-none">👤 Marketing Manager: &quot;How do I suggest an edit?&quot;
🔧 Engineer: &quot;Fork the repo, create a branch, commit, open a PR...&quot;
👤 Marketing Manager: *quietly sends a Slack message instead*</code></pre><p><strong>Reality Check:</strong> Git has a learning curve. For teams without development experience, the workflow feels foreign.</p><p><strong>✅ The AI Silver Lining</strong></p><p>AI agents are rapidly reducing this friction. Tools like <strong>Claude Code</strong>, <strong>GitHub Copilot</strong>, and <strong>Cursor</strong> can now:</p><ul><li>Execute Git commands from natural language (“Create a branch and update the runbook”)</li><li>Explain what each command does in plain English</li><li>Auto-generate commit messages and pull request descriptions</li></ul><p><strong>The gap is closing faster than expected.</strong> What required Git training in 2023 can be done via chat in 2026.</p><p><strong>Mitigation Strategies:</strong></p><table><thead><tr><th>Approach</th><th>How It Helps</th><th>Trade-off</th></tr></thead><tbody><tr><td><strong>GitHub/GitLab Web UI</strong></td><td>Edit files in browser, no Git knowledge needed</td><td>Limited to simple changes</td></tr><tr><td><strong>VS Code + GitLens</strong></td><td>Visual Git tools, point-and-click commits</td><td>Still requires tool installation</td></tr><tr><td><strong>Designated Doc Owners</strong></td><td>Tech writers manage Git, SMEs provide content</td><td>Bottleneck at doc owners</td></tr><tr><td><strong>Hybrid workflow</strong></td><td>Accept Word/Google Docs, convert to Markdown</td><td>Extra translation step</td></tr></tbody></table><hr /><h3 id="Challenge-2-No-Inline-Comments">Challenge 2: No Inline Comments</h3><p><strong>The Problem:</strong></p><p>Confluence and Google
Docs excel at inline commenting:</p><pre class="language-none"><code class="language-none">📄 Confluence Page:  &quot;Restart the database service&quot;  └─ 💬 Comment: &quot;Which service? postgresql.service or mysqld.service?&quot;  └─ 💬 Comment: &quot;This step failed for me in staging&quot;  └─ 💬 Comment: &quot;Updated command in PR #452&quot;</code></pre><p>Markdown files don’t have native inline comments.</p><p><strong>Workarounds:</strong></p><table><thead><tr><th>Method</th><th>How It Works</th><th>Limitations</th></tr></thead><tbody><tr><td><strong>Pull Request Comments</strong></td><td>Comment on specific lines during review</td><td>Only visible during PR, not in final doc</td></tr><tr><td><strong>GitHub/GitLab Issues</strong></td><td>Link issues to documentation sections</td><td>Requires navigation between systems</td></tr><tr><td><strong>HTML Annotations</strong></td><td>Add comment blocks in Markdown</td><td>Clutters source, not rendered</td></tr><tr><td><strong>External Tools</strong></td><td>Tools like GitBook, ReadMe add commenting</td><td>Reintroduces vendor dependency</td></tr></tbody></table><hr /><h3 id="Challenge-3-Visual-Collaboration">Challenge 3: Visual Collaboration</h3><p><strong>The Problem:</strong></p><p>Some teams thrive on visual collaboration:</p><pre class="language-none"><code class="language-none">🎨 Google Docs:  - Highlight text → Add comment → Assign to person  - See cursors of others editing in real-time  - Suggestion mode → Accept&#x2F;reject changes visually</code></pre><p>Git is <strong>asynchronous by design</strong>. 
Real-time collaboration isn’t its strength.</p><p><strong>When This Matters:</strong></p><table><thead><tr><th>Scenario</th><th>DaC Fit</th><th>Better Alternative</th></tr></thead><tbody><tr><td>Technical runbooks</td><td>✅ Excellent</td><td>—</td></tr><tr><td>API documentation</td><td>✅ Excellent</td><td>—</td></tr><tr><td>Policy documents</td><td>⚠️ Moderate</td><td>Google Docs (draft) → DaC (final)</td></tr><tr><td>Marketing content</td><td>❌ Poor</td><td>Google Docs, Notion</td></tr><tr><td>Brainstorming sessions</td><td>❌ Poor</td><td>Whiteboard, Miro, FigJam</td></tr></tbody></table><hr /><h3 id="Challenge-4-The-“Where-Do-I-Edit-”-Problem">Challenge 4: The “Where Do I Edit?” Problem</h3><p><strong>The Problem:</strong></p><p>New contributors face friction:</p><pre class="language-none"><code class="language-none">👤 New Team Member: &quot;I found a typo in the runbook. How do I fix it?&quot;Traditional:  - Click &quot;Edit&quot; button → Type → Save → DoneDocument as Code:  - Clone repo (or navigate to web UI)  - Create branch (or edit in web)  - Make change  - Write commit message  - Create pull request  - Wait for review  - Merge (or request merge)</code></pre><p><strong>The Friction Tax:</strong> Each edit requires ~5-10 extra steps compared to wiki-style editing.</p><p><strong>Mitigation:</strong></p><p><strong>Example: A simple guide for contributors</strong></p><pre class="language-markdown" data-language="markdown"><code class="language-markdown"><span class="token title important"><span class="token punctuation">#</span> .github/CONTRIBUTING.md</span><span class="token title important"><span class="token punctuation">##</span> How to Update Documentation</span><span class="token title important"><span class="token punctuation">###</span> Quick Fix (Typo, small change)</span><span class="token list punctuation">1.</span> Navigate to the file on GitHub<span class="token list punctuation">2.</span> Click the ✏️ pencil icon<span class="token list 
punctuation">3.</span> Make your change<span class="token list punctuation">4.</span> Write a brief description<span class="token list punctuation">5.</span> Click "Propose changes"<span class="token list punctuation">6.</span> Done! We'll review and merge.<span class="token title important"><span class="token punctuation">###</span> Larger Changes</span><span class="token list punctuation">1.</span> Fork the repository<span class="token list punctuation">2.</span> Create a branch: <span class="token code-snippet code keyword">`git checkout -b fix/my-change`</span><span class="token list punctuation">3.</span> Edit the files<span class="token list punctuation">4.</span> Commit: <span class="token code-snippet code keyword">`git commit -m "fix: describe your change"`</span><span class="token list punctuation">5.</span> Push: <span class="token code-snippet code keyword">`git push origin fix/my-change`</span><span class="token list punctuation">6.</span> Open a Pull Request</code></pre><p>Clear guidance reduces friction significantly.</p><p>Having acknowledged the challenges, let’s explore practical strategies for teams adopting Document as Code.</p><hr /><h2 id="7-Making-DaC-Work-A-Practical-Guide">7 Making DaC Work: A Practical Guide</h2><h3 id="Start-Small-Win-Fast">Start Small, Win Fast</h3><p><strong>Week 1-2: Pilot Project</strong></p><pre class="language-none"><code class="language-none">📁 docs&#x2F;  └── runbooks&#x2F;      ├── incident-response.md      ├── database-failover.md      └── deployment-procedure.md</code></pre><p>Pick <strong>one high-value, technical audience</strong> (e.g., on-call engineers). Get their buy-in. 
Let them experience the offline benefit firsthand.</p><hr /><h3 id="Build-the-Bridge-Not-the-Wall">Build the Bridge, Not the Wall</h3><p><strong>Don’t:</strong> “Confluence is dead to us now.”</p><p><strong>Do:</strong> “Let’s run both in parallel during migration.”</p><p><strong>Example: Automatically sync to Confluence while transitioning</strong></p><pre class="language-yaml" data-language="yaml"><code class="language-yaml"><span class="token comment"># CI/CD syncs DaC → Confluence (temporary)</span><span class="token punctuation">-</span> <span class="token key atrule">name</span><span class="token punctuation">:</span> Publish to Confluence  <span class="token key atrule">if</span><span class="token punctuation">:</span> github.ref == 'refs/heads/main'  <span class="token key atrule">uses</span><span class="token punctuation">:</span> confluence<span class="token punctuation">-</span>publisher@v1  <span class="token key atrule">with</span><span class="token punctuation">:</span>    <span class="token key atrule">space</span><span class="token punctuation">:</span> OPS    <span class="token key atrule">parent</span><span class="token punctuation">:</span> <span class="token string">"Operations Runbooks"</span></code></pre><p><strong>What this does:</strong> Every time documentation is updated, it automatically publishes a copy to Confluence. 
This lets teams keep using Confluence while gradually adopting Document as Code—no sudden disruption.</p><p>This gives stakeholders time to adapt while proving DaC’s value.</p><hr /><h3 id="Invest-in-Tooling">Invest in Tooling</h3><p><strong>Essential Stack:</strong></p><table><thead><tr><th>Tool</th><th>Purpose</th><th>Cost</th></tr></thead><tbody><tr><td><strong>VS Code + Markdown All in One</strong></td><td>Authoring experience</td><td>Free</td></tr><tr><td><strong>MkDocs + Material Theme</strong></td><td>Static site generation</td><td>Free</td></tr><tr><td><strong>GitHub Actions / GitLab CI</strong></td><td>Build &amp; deploy pipeline</td><td>Free-$</td></tr><tr><td><strong>pandoc</strong></td><td>Format conversion (PDF, Word)</td><td>Free</td></tr><tr><td><strong>gitleaks</strong></td><td>Secret scanning</td><td>Free</td></tr></tbody></table><p><strong>Nice to Have:</strong></p><table><thead><tr><th>Tool</th><th>Purpose</th><th>Cost</th></tr></thead><tbody><tr><td><strong>GitLens</strong></td><td>Visual Git history</td><td>Free-$</td></tr><tr><td><strong>Markdownlint</strong></td><td>Style enforcement</td><td>Free</td></tr><tr><td><strong>Vale</strong></td><td>Grammar &amp; style checking</td><td>Free</td></tr></tbody></table><hr /><h3 id="Define-the-Workflow">Define the Workflow</h3><p><strong>For Engineers:</strong></p><p><strong>What this looks like in practice:</strong></p><pre class="language-bash" data-language="bash"><code class="language-bash"><span class="token comment"># 1. Create a new branch for your changes</span><span class="token function">git</span> checkout <span class="token parameter variable">-b</span> docs/update-failover-procedure<span class="token comment"># 2. Open and edit the documentation file</span>code docs/runbooks/database-failover.md<span class="token comment"># 3. Preview how it looks in your browser</span>mkdocs serve  <span class="token comment"># Opens at http://localhost:8000</span><span class="token comment"># 4. 
Save your changes to version control</span><span class="token function">git</span> <span class="token function">add</span> docs/runbooks/database-failover.md<span class="token function">git</span> commit <span class="token parameter variable">-m</span> <span class="token string">"docs: update failover steps for PostgreSQL 16"</span><span class="token function">git</span> push origin docs/update-failover-procedure<span class="token comment"># 5. Request a review from your team</span>gh <span class="token function">pr</span> create <span class="token parameter variable">--title</span> <span class="token string">"docs: update failover steps"</span> <span class="token parameter variable">--body</span> <span class="token string">"Updated for PG16 compatibility"</span></code></pre><p><strong>Translation:</strong> Each command does one thing—create a workspace, edit the file, preview it, save it, and ask teammates to review. The bash symbols like <code>#</code> are just notes explaining what each step does.</p><p><strong>For Non-Engineers:</strong></p><pre class="language-none"><code class="language-none">1. Navigate to the file on GitHub&#x2F;GitLab2. Click &quot;Edit&quot; (pencil icon)3. Make your changes4. Write a brief description of what changed5. Click &quot;Propose changes&quot;6. 
A team member will review and merge</code></pre><hr /><h3 id="Measure-Success">Measure Success</h3><p>Track these metrics:</p><table><thead><tr><th>Metric</th><th>Before DaC</th><th>After DaC</th><th>Target</th></tr></thead><tbody><tr><td>Time to find runbook</td><td>5-10 min</td><td>&lt; 1 min</td><td>&lt; 30 sec</td></tr><tr><td>Documentation freshness</td><td>Months outdated</td><td>Updated with each incident</td><td>Same-day</td></tr><tr><td>Offline accessibility</td><td>❌ No</td><td>✅ Yes</td><td>✅ Yes</td></tr><tr><td>Security scan coverage</td><td>0%</td><td>100%</td><td>100%</td></tr><tr><td>Contributor count</td><td>3-5 “owners”</td><td>10-15 team members</td><td>20+</td></tr></tbody></table><p>Now that we have practical implementation strategies, let’s examine the strategic implications for different types of organizations.</p><hr /><h2 id="8-The-Strategic-View-Who-Should-Adopt-DaC">8 The Strategic View: Who Should Adopt DaC?</h2><h3 id="Perfect-Fit-✅">Perfect Fit ✅</h3><table><thead><tr><th>Organization Type</th><th>Why</th></tr></thead><tbody><tr><td><strong>DevOps/SRE Teams</strong></td><td>Already use Git, value offline access</td></tr><tr><td><strong>Security-Conscious</strong></td><td>Need audit trails, secret scanning</td></tr><tr><td><strong>Regulated Industries</strong></td><td>Compliance requires version control</td></tr><tr><td><strong>Distributed Teams</strong></td><td>Async collaboration across timezones</td></tr><tr><td><strong>Air-Gapped Environments</strong></td><td>Offline access is mandatory</td></tr></tbody></table><h3 id="Good-Fit-with-Training-⚠️">Good Fit with Training ⚠️</h3><table><thead><tr><th>Organization Type</th><th>Considerations</th></tr></thead><tbody><tr><td><strong>Traditional IT Operations</strong></td><td>Invest in Git training, start with pilot team</td></tr><tr><td><strong>Mixed Technical/Non-Technical</strong></td><td>Hybrid workflow (Google Docs → DaC conversion)</td></tr><tr><td><strong>Heavy Confluence 
Users</strong></td><td>Run parallel during migration period</td></tr></tbody></table><h3 id="Poor-Fit-❌">Poor Fit ❌</h3><table><thead><tr><th>Organization Type</th><th>Why</th></tr></thead><tbody><tr><td><strong>Marketing-First Documentation</strong></td><td>Visual collaboration is core requirement</td></tr><tr><td><strong>No Git Experience + No Training Budget</strong></td><td>Friction will kill adoption</td></tr><tr><td><strong>Already Committed to SaaS Wiki Long-Term</strong></td><td>Migration cost may not justify benefits</td></tr></tbody></table><hr /><h2 id="Summary-The-Documentation-Crossroads">Summary: The Documentation Crossroads</h2><pre class="language-MERMAID_BASE64_633" data-language="MERMAID_BASE64_633"><code class="language-MERMAID_BASE64_633">Zmxvd2NoYXJ0IExSCiAgICBBW0RvY3VtZW50YXRpb24gVG9kYXldIC0tPiBCe0Nob29zZSBZb3VyIFBhdGh9CiAgICBCIC0tPiBDW0NvbmZsdWVuY2UgQ2xvdWQgMjAyOV0KICAgIEIgLS0+IERbRG9jdW1lbnQgYXMgQ29kZV0KICAgIEIgLS0+IEVbU3RhdHVzIFF1b10KICAgIAogICAgQyAtLT4gRltTYWFTIERlcGVuZGVuY3k8YnIvPk9uZ29pbmcgQ29zdHM8YnIvPkxvZ2luIFJlcXVpcmVkXQogICAgRCAtLT4gR1tPd24gWW91ciBEYXRhPGJyLz5PZmZsaW5lIEFjY2Vzczxici8+U2VjdXJpdHkgU2Nhbm5pbmddCiAgICBFIC0tPiBIW1RlY2huaWNhbCBEZWJ0PGJyLz5Lbm93bGVkZ2UgTG9zczxici8+MjAyOSBDcmlzaXNdCiAgICAKICAgIHN0eWxlIEQgZmlsbDojZThmNWU5LHN0cm9rZTojMzg4ZTNjCiAgICBzdHlsZSBDIGZpbGw6I2ZmZjNlMCxzdHJva2U6I2Y1N2MwMAogICAgc3R5bGUgRSBmaWxsOiNmZmViZWUsc3Ryb2tlOiNjNjI4Mjg&#x3D;</code></pre><p><strong>The Choice:</strong></p><table><thead><tr><th>Path</th><th>2026</th><th>2027</th><th>2028</th><th>2029</th></tr></thead><tbody><tr><td><strong>Document as Code</strong></td><td>Pilot &amp; learn</td><td>Expand adoption</td><td>Mature workflow</td><td>Competitive advantage</td></tr><tr><td><strong>Confluence Cloud</strong></td><td>Migrate</td><td>Pay increases</td><td>Dependency deepens</td><td>Locked in</td></tr><tr><td><strong>Status Quo</strong></td><td>Comfortable</td><td>Growing pain</td><td>Urgent problem</td><td>Crisis 
mode</td></tr></tbody></table><hr /><p><strong>The Bottom Line:</strong></p><p>Document as Code isn’t about Markdown. It’s about <strong>ownership</strong>.</p><p>When your documentation lives in Git:</p><ul><li>📖 <strong>You own the data</strong>: no vendor can hold it hostage</li><li>🔓 <strong>It works offline</strong>: critical when networks fail</li><li>🔍 <strong>It’s scannable</strong>: security and compliance checks can run on every commit</li><li>📦 <strong>It exports anywhere</strong>: PDF, Word, Confluence (if you must)</li><li>📜 <strong>It has history</strong>: every change tracked, reversible, auditable</li><li>🔒 <strong>It’s tamper-evident</strong>: every commit is hash-linked to its history, so silent rewrites are detectable</li><li>🤖 <strong>It’s AI-ready</strong>: low token overhead, command-line friendly, Mermaid diagrams</li></ul><p>The challenges are real: non-technical collaboration, inline comments, visual workflows. But these are <strong>solvable problems</strong>, not fundamental flaws.</p><p>And in 2029, when Confluence on-premises reaches end-of-life and your compliance team asks <em>“Where is our documentation?”</em>, you’ll want an answer that doesn’t involve a panic migration.</p><p><strong>Start small. Pick one runbook. Clone one repo. Experience the offline benefit firsthand.</strong></p><p>Because the best time to plant a tree was 20 years ago.
The second-best time is before the vendor shuts off your server.</p><hr /><h2 id="Further-Reading">Further Reading</h2><ul><li><a href="https://docs.github.com/">GitHub Docs: Documentation as Code</a></li><li><a href="https://www.atlassian.com/licensing/data-center-end-of-life#data-center-eol-general-questions">Atlassian Confluence End-of-Life Announcement</a></li><li><a href="https://github.com/gitleaks/gitleaks">gitleaks: Secret Scanning Tool</a></li><li><a href="https://mermaid.js.org/">Mermaid: Diagrams and Flowcharts in Markdown</a></li></ul>]]></content>
    
    
    <summary type="html">Confluence on-prem dies in 2029. Word docs live in shared drives nobody can find. Your runbooks are trapped behind paywalls and login screens. Here&#39;s why Document as Code—built on Git and Markdown—is your escape route, and why the clock is ticking.</summary>
    
    
    
    <category term="Misc" scheme="https://neo01.com/categories/Misc/"/>
    
    
    <category term="GitOps" scheme="https://neo01.com/tags/GitOps/"/>
    
    <category term="Documentation" scheme="https://neo01.com/tags/Documentation/"/>
    
  </entry>
  
  <entry>
    <title>Building a PostgreSQL-Compatible Database in Rust: Histogram Statistics for Accurate Selectivity Estimation</title>
    <link href="https://neo01.com/zh-CN/2026/03/Database-Rust-Histogram-Statistics/"/>
    <id>https://neo01.com/zh-CN/2026/03/Database-Rust-Histogram-Statistics/</id>
    <published>2026-03-07T16:00:00.000Z</published>
    <updated>2026-03-14T03:49:34.517Z</updated>
    
    <content type="html"><![CDATA[<p>In <a href="/zh-CN/2026/03/Database-Rust-Query-Optimizer/">Part 7</a>, we built a cost-based optimizer. But there is a problem.</p><p><strong>Our selectivity estimates are guesses:</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// From Part 7 - simplified (wrong!) estimates</span>
<span class="token keyword">fn</span> <span class="token function-definition function">estimate_selectivity</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> expr<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Expression</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">f64</span> <span class="token punctuation">&#123;</span>
    <span class="token keyword">match</span> expr <span class="token punctuation">&#123;</span>
        <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Eq</span> <span class="token operator">=></span> <span class="token number">0.01</span><span class="token punctuation">,</span>   <span class="token comment">// Always 1%? Wrong!</span>
        <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Gt</span> <span class="token operator">=></span> <span class="token number">0.33</span><span class="token punctuation">,</span>   <span class="token comment">// Always 33%? Wrong!</span>
        _ <span class="token operator">=></span> <span class="token number">0.5</span><span class="token punctuation">,</span>                      <span class="token comment">// Always 50%?
Very wrong!</span>
    <span class="token punctuation">&#125;</span>
<span class="token punctuation">&#125;</span></code></pre><p><strong>Real data is not uniform:</strong></p><pre class="language-none"><code class="language-none">User balances in our database:

Balance Distribution:
$0-100:     ████████████████████████████████████████  80% of users
$100-1K:    ████████                                   15% of users
$1K-10K:    ██                                          4% of users
$10K-100K:  █                                           1% of users

Query: SELECT * FROM users WHERE balance &gt; 100
Our estimate: 33% of rows (using fixed 0.33)
Actual: 20% of rows
→ Wrong plan chosen! Index scan would be better than seq scan.</code></pre><p><strong>The solution:</strong> histograms that capture the actual data distribution.</p><p>Today: implementing histogram statistics in Rust for accurate selectivity estimation.</p><hr /><h2 id="1-为什么直方图很重要">1 Why Histograms Matter</h2><h3 id="均匀分布谬误">The Uniform Distribution Fallacy</h3><p><strong>Without histograms, the optimizer assumes a uniform distribution:</strong></p><pre class="language-sql" data-language="sql"><code class="language-sql"><span class="token comment">-- Table: users (1,000,000 rows)</span>
<span class="token comment">-- Column: account_type ('free', 'premium', 'enterprise')</span>
<span class="token comment">-- Reality:</span>
<span class="token string">'free'</span>:       <span class="token number">950</span><span class="token punctuation">,</span><span class="token number">000</span> <span class="token keyword">rows</span> <span class="token punctuation">(</span><span class="token number">95</span><span class="token operator">%</span><span class="token punctuation">)</span>
<span class="token string">'premium'</span>:     <span class="token number">45</span><span class="token punctuation">,</span><span class="token number">000</span> <span class="token keyword">rows</span> <span class="token punctuation">(</span><span class="token number">4.5</span><span class="token operator">%</span><span class="token punctuation">)</span>
<span class="token string">'enterprise'</span>:   <span class="token number">5</span><span
class="token punctuation">,</span><span class="token number">000</span> <span class="token keyword">rows</span> <span class="token punctuation">(</span><span class="token number">0.5</span><span class="token operator">%</span><span class="token punctuation">)</span>

<span class="token comment">-- Query 1:</span>
<span class="token keyword">SELECT</span> <span class="token operator">*</span> <span class="token keyword">FROM</span> users <span class="token keyword">WHERE</span> account_type <span class="token operator">=</span> <span class="token string">'enterprise'</span>
<span class="token comment">-- Optimizer estimate (uniform): 1,000,000 / 3 = 333,333 rows</span>
<span class="token comment">-- Actual: 5,000 rows</span>
<span class="token comment">-- Error: 67x overestimate!</span>

<span class="token comment">-- Query 2:</span>
<span class="token keyword">SELECT</span> <span class="token operator">*</span> <span class="token keyword">FROM</span> users <span class="token keyword">WHERE</span> account_type <span class="token operator">=</span> <span class="token string">'free'</span>
<span class="token comment">-- Optimizer estimate (uniform): 333,333 rows</span>
<span class="token comment">-- Actual: 950,000 rows</span>
<span class="token comment">-- Error: 3x underestimate!</span></code></pre><p><strong>With histograms:</strong> we know the actual distribution.</p><hr /><h3 id="对连接顺序的影响">Impact on Join Order</h3><pre class="language-sql" data-language="sql"><code class="language-sql"><span class="token keyword">SELECT</span> u<span class="token punctuation">.</span>name<span class="token punctuation">,</span> o<span class="token punctuation">.</span>total
<span class="token keyword">FROM</span> users u
<span class="token keyword">JOIN</span> orders o <span class="token keyword">ON</span> u<span class="token punctuation">.</span>id <span class="token operator">=</span> o<span class="token punctuation">.</span>user_id
<span class="token keyword">WHERE</span> u<span class="token punctuation">.</span>account_type <span
class="token operator">=</span> <span class="token string">'enterprise'</span></code></pre><p><strong>Without histograms:</strong></p><pre class="language-none"><code class="language-none">Optimizer thinks &#39;enterprise&#39; &#x3D; 333,333 rows
Plan: SeqScan(users) → Hash Join → SeqScan(orders)
Cost: 5000 (wrong!)</code></pre><p><strong>With histograms:</strong></p><pre class="language-none"><code class="language-none">Histogram shows &#39;enterprise&#39; &#x3D; 5,000 rows
Plan: IndexScan(users) → Nested Loop → IndexScan(orders)
Cost: 100 (correct!)</code></pre><p><strong>Result:</strong> the query runs roughly 50x faster, going by the cost estimates above.</p><hr /><h2 id="2-直方图类型">2 Histogram Types</h2><h3 id="等宽直方图">Equi-Width Histograms</h3><p><strong>Equal bucket ranges, varying row counts:</strong></p><pre class="language-none"><code class="language-none">Data: [1, 2, 3, 4, 5, 10, 11, 12, 13, 14]

Equi-Width (3 buckets, range 1-14):
┌─────────────────────────────────────────────────────────────┐
│ Bucket 1: [1-5]     → 5 rows  │████████████████████████████│
│ Bucket 2: [6-10]    → 1 row   │██████                      │
│ Bucket 3: [11-14]   → 4 rows  │██████████████████████      │
└─────────────────────────────────────────────────────────────┘</code></pre><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token attribute attr-name">#[derive(Debug, Clone)]</span>
<span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">EquiWidthHistogram</span> <span class="token punctuation">&#123;</span>
    <span class="token keyword">pub</span> min_value<span class="token punctuation">:</span> <span class="token class-name">Value</span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> max_value<span class="token punctuation">:</span> <span class="token class-name">Value</span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> num_buckets<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token
punctuation">,</span>
    <span class="token keyword">pub</span> buckets<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">WidthBucket</span><span class="token operator">></span><span class="token punctuation">,</span>
<span class="token punctuation">&#125;</span>

<span class="token attribute attr-name">#[derive(Debug, Clone)]</span>
<span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">WidthBucket</span> <span class="token punctuation">&#123;</span>
    <span class="token keyword">pub</span> lower_bound<span class="token punctuation">:</span> <span class="token class-name">Value</span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> upper_bound<span class="token punctuation">:</span> <span class="token class-name">Value</span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> row_count<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> distinct_count<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> null_count<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>
<span class="token punctuation">&#125;</span>

<span class="token keyword">impl</span> <span class="token class-name">EquiWidthHistogram</span> <span class="token punctuation">&#123;</span>
    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">build</span><span class="token punctuation">(</span>values<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token
punctuation">[</span><span class="token class-name">Value</span><span class="token punctuation">]</span><span class="token punctuation">,</span> num_buckets<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token keyword">Self</span><span class="token punctuation">,</span> <span class="token class-name">HistogramError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>
        <span class="token keyword">if</span> values<span class="token punctuation">.</span><span class="token function">is_empty</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>
            <span class="token keyword">return</span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">HistogramError</span><span class="token punctuation">::</span><span class="token class-name">EmptyData</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token punctuation">&#125;</span>

        <span class="token keyword">let</span> <span class="token punctuation">(</span>min<span class="token punctuation">,</span> max<span class="token punctuation">)</span> <span class="token operator">=</span> <span class="token keyword">Self</span><span class="token punctuation">::</span><span class="token function">find_min_max</span><span class="token punctuation">(</span>values<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>
        <span class="token keyword">let</span> bucket_width <span class="token operator">=</span> max<span class="token punctuation">.</span><span class="token function">subtract</span><span class="token
punctuation">(</span><span class="token operator">&amp;</span>min<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">divide</span><span class="token punctuation">(</span>num_buckets <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> buckets <span class="token operator">=</span> <span class="token macro property">vec!</span><span class="token punctuation">[</span><span class="token class-name">WidthBucket</span> <span class="token punctuation">&#123;</span>            lower_bound<span class="token punctuation">:</span> min<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            upper_bound<span class="token punctuation">:</span> min<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            row_count<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>            distinct_count<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>            null_count<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span> num_buckets<span class="token punctuation">]</span><span class="token punctuation">;</span>        <span class="token comment">// Initialize bucket bounds</span>        <span class="token keyword">for</span> i <span class="token 
keyword">in</span> <span class="token number">0</span><span class="token punctuation">..</span>num_buckets <span class="token punctuation">&#123;</span>            buckets<span class="token punctuation">[</span>i<span class="token punctuation">]</span><span class="token punctuation">.</span>lower_bound <span class="token operator">=</span> min<span class="token punctuation">.</span><span class="token function">add</span><span class="token punctuation">(</span>bucket_width<span class="token punctuation">.</span><span class="token function">multiply</span><span class="token punctuation">(</span>i <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            buckets<span class="token punctuation">[</span>i<span class="token punctuation">]</span><span class="token punctuation">.</span>upper_bound <span class="token operator">=</span> min<span class="token punctuation">.</span><span class="token function">add</span><span class="token punctuation">(</span>bucket_width<span class="token punctuation">.</span><span class="token function">multiply</span><span class="token punctuation">(</span><span class="token punctuation">(</span>i <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token comment">// Count rows in each bucket</span>        <span class="token keyword">for</span> value <span class="token keyword">in</span> values <span class="token punctuation">&#123;</span>            <span class="token keyword">if</span> value<span class="token punctuation">.</span><span class="token 
function">is_null</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                buckets<span class="token punctuation">[</span><span class="token number">0</span><span class="token punctuation">]</span><span class="token punctuation">.</span>null_count <span class="token operator">+=</span> <span class="token number">1</span><span class="token punctuation">;</span>                <span class="token keyword">continue</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>            <span class="token keyword">let</span> bucket_idx <span class="token operator">=</span> <span class="token keyword">Self</span><span class="token punctuation">::</span><span class="token function">find_bucket_index</span><span class="token punctuation">(</span>value<span class="token punctuation">,</span> <span class="token operator">&amp;</span>min<span class="token punctuation">,</span> <span class="token operator">&amp;</span>bucket_width<span class="token punctuation">,</span> num_buckets<span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token keyword">if</span> bucket_idx <span class="token operator">&lt;</span> num_buckets <span class="token punctuation">&#123;</span>                buckets<span class="token punctuation">[</span>bucket_idx<span class="token punctuation">]</span><span class="token punctuation">.</span>row_count <span class="token operator">+=</span> <span class="token number">1</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>            min_value<span class="token punctuation">:</span> min<span 
class="token punctuation">,</span>            max_value<span class="token punctuation">:</span> max<span class="token punctuation">,</span>            num_buckets<span class="token punctuation">,</span>            buckets<span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">find_bucket_index</span><span class="token punctuation">(</span>value<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Value</span><span class="token punctuation">,</span> min<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Value</span><span class="token punctuation">,</span> width<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Value</span><span class="token punctuation">,</span> num_buckets<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">usize</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> offset <span class="token operator">=</span> value<span class="token punctuation">.</span><span class="token function">subtract</span><span class="token punctuation">(</span>min<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> bucket <span class="token operator">=</span> offset<span class="token punctuation">.</span><span class="token function">divide</span><span class="token punctuation">(</span>width<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_f64</span><span class="token 
punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">floor</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">usize</span><span class="token punctuation">;</span>        bucket<span class="token punctuation">.</span><span class="token function">min</span><span class="token punctuation">(</span>num_buckets <span class="token operator">-</span> <span class="token number">1</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Pros:</strong> Simple, and a good fit for uniformly distributed data.</p><p><strong>Cons:</strong> Performs poorly on skewed data (most buckets end up empty while one bucket is huge).</p><hr /><h3 id="等深直方图（PostgreSQL-风格）">Equi-Depth Histogram (PostgreSQL Style)</h3><p><strong>Each bucket holds the same number of rows, but the value ranges differ:</strong></p><pre class="language-none"><code class="language-none">Data: [1, 2, 3, 4, 5, 10, 11, 12, 13, 14]
Equi-Depth (3 buckets, ~3-4 rows each):
┌─────────────────────────────────────────────────────────────┐
│ Bucket 1: [1-4]     → 4 rows  │████████████████████████████│
│ Bucket 2: [5-11]    → 3 rows  │████████████████████████████│
│ Bucket 3: [12-14]   → 3 rows  │████████████████████████████│
└─────────────────────────────────────────────────────────────┘</code></pre><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token attribute attr-name">#[derive(Debug, Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">EquiDepthHistogram</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> num_buckets<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> total_rows<span class="token punctuation">:</span> <span class="token 
keyword">u64</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> buckets<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">DepthBucket</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">DepthBucket</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> lower_bound<span class="token punctuation">:</span> <span class="token class-name">Value</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> upper_bound<span class="token punctuation">:</span> <span class="token class-name">Value</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> row_count<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> distinct_count<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> cumulative_count<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>  <span class="token comment">// Rows &lt;= upper_bound</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">EquiDepthHistogram</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">build</span><span class="token punctuation">(</span>values<span 
class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Value</span><span class="token punctuation">]</span><span class="token punctuation">,</span> num_buckets<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token keyword">Self</span><span class="token punctuation">,</span> <span class="token class-name">HistogramError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">if</span> values<span class="token punctuation">.</span><span class="token function">is_empty</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">return</span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">HistogramError</span><span class="token punctuation">::</span><span class="token class-name">EmptyData</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token comment">// Sort values</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> sorted_values <span class="token operator">=</span> values<span class="token punctuation">.</span><span class="token function">to_vec</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        sorted_values<span class="token punctuation">.</span><span class="token function">sort</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token 
punctuation">;</span>        <span class="token keyword">let</span> total_rows <span class="token operator">=</span> sorted_values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> target_bucket_size <span class="token operator">=</span> <span class="token punctuation">(</span>total_rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">/</span> num_buckets <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">ceil</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">usize</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> buckets <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> cumulative <span class="token operator">=</span> <span class="token number">0u64</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> i <span class="token keyword">in</span> <span class="token number">0</span><span class="token punctuation">..</span>num_buckets <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> start <span class="token operator">=</span> i <span 
class="token operator">*</span> target_bucket_size<span class="token punctuation">;</span>            <span class="token keyword">let</span> end <span class="token operator">=</span> <span class="token punctuation">(</span><span class="token punctuation">(</span>i <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">)</span> <span class="token operator">*</span> target_bucket_size<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">min</span><span class="token punctuation">(</span>sorted_values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token keyword">if</span> start <span class="token operator">>=</span> sorted_values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">break</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>            <span class="token keyword">let</span> bucket_values <span class="token operator">=</span> <span class="token operator">&amp;</span>sorted_values<span class="token punctuation">[</span>start<span class="token punctuation">..</span>end<span class="token punctuation">]</span><span class="token punctuation">;</span>            <span class="token keyword">let</span> row_count <span class="token operator">=</span> bucket_values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span 
class="token punctuation">;</span>            <span class="token keyword">let</span> distinct_count <span class="token operator">=</span> bucket_values<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">::</span><span class="token operator">&lt;</span><span class="token namespace">std<span class="token punctuation">::</span>collections<span class="token punctuation">::</span></span><span class="token class-name">HashSet</span><span class="token operator">&lt;</span>_<span class="token operator">>></span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">;</span>            cumulative <span class="token operator">+=</span> row_count<span class="token punctuation">;</span>            buckets<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token class-name">DepthBucket</span> <span class="token punctuation">&#123;</span>                lower_bound<span class="token punctuation">:</span> bucket_values<span class="token punctuation">.</span><span class="token function">first</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">unwrap</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token 
punctuation">)</span><span class="token punctuation">,</span>                upper_bound<span class="token punctuation">:</span> bucket_values<span class="token punctuation">.</span><span class="token function">last</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">unwrap</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                row_count<span class="token punctuation">,</span>                distinct_count<span class="token punctuation">,</span>                cumulative_count<span class="token punctuation">:</span> cumulative<span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>            num_buckets<span class="token punctuation">,</span>            total_rows<span class="token punctuation">,</span>            buckets<span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">/// Estimate selectivity for equality predicate</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">estimate_eq</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token 
punctuation">,</span> value<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Value</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">f64</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">for</span> bucket <span class="token keyword">in</span> <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">.</span>buckets <span class="token punctuation">&#123;</span>            <span class="token keyword">if</span> value <span class="token operator">>=</span> <span class="token operator">&amp;</span>bucket<span class="token punctuation">.</span>lower_bound <span class="token operator">&amp;&amp;</span> value <span class="token operator">&lt;=</span> <span class="token operator">&amp;</span>bucket<span class="token punctuation">.</span>upper_bound <span class="token punctuation">&#123;</span>                <span class="token comment">// Assume uniform distribution within bucket</span>                <span class="token keyword">let</span> bucket_selectivity <span class="token operator">=</span> <span class="token number">1.0</span> <span class="token operator">/</span> bucket<span class="token punctuation">.</span>distinct_count<span class="token punctuation">.</span><span class="token function">max</span><span class="token punctuation">(</span><span class="token number">1</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">;</span>                <span class="token keyword">return</span> bucket_selectivity <span class="token operator">*</span> <span class="token punctuation">(</span>bucket<span class="token punctuation">.</span>row_count <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">/</span> 
<span class="token keyword">self</span><span class="token punctuation">.</span>total_rows <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>        <span class="token number">0.0</span>  <span class="token comment">// Value outside histogram range</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">/// Estimate selectivity for range predicate (value > X)</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">estimate_gt</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> value<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Value</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">f64</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> selectivity <span class="token operator">=</span> <span class="token number">0.0</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> bucket <span class="token keyword">in</span> <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">.</span>buckets <span class="token punctuation">&#123;</span>            <span class="token keyword">if</span> value <span class="token operator">&lt;</span> <span class="token operator">&amp;</span>bucket<span class="token punctuation">.</span>lower_bound <span class="token punctuation">&#123;</span>                <span class="token comment">// Entire bucket 
matches</span>                selectivity <span class="token operator">+=</span> bucket<span class="token punctuation">.</span>row_count <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">/</span> <span class="token keyword">self</span><span class="token punctuation">.</span>total_rows <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token keyword">if</span> value <span class="token operator">>=</span> <span class="token operator">&amp;</span>bucket<span class="token punctuation">.</span>lower_bound <span class="token operator">&amp;&amp;</span> value <span class="token operator">&lt;</span> <span class="token operator">&amp;</span>bucket<span class="token punctuation">.</span>upper_bound <span class="token punctuation">&#123;</span>                <span class="token comment">// Partial bucket - assume uniform within bucket</span>                <span class="token keyword">let</span> bucket_range <span class="token operator">=</span> bucket<span class="token punctuation">.</span>upper_bound<span class="token punctuation">.</span><span class="token function">subtract</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>bucket<span class="token punctuation">.</span>lower_bound<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_f64</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">let</span> matching_range <span class="token operator">=</span> bucket<span class="token punctuation">.</span>upper_bound<span class="token punctuation">.</span><span class="token function">subtract</span><span class="token 
punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_f64</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">let</span> fraction <span class="token operator">=</span> <span class="token punctuation">(</span>matching_range <span class="token operator">/</span> bucket_range<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">clamp</span><span class="token punctuation">(</span><span class="token number">0.0</span><span class="token punctuation">,</span> <span class="token number">1.0</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                selectivity <span class="token operator">+=</span> fraction <span class="token operator">*</span> <span class="token punctuation">(</span>bucket<span class="token punctuation">.</span>row_count <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">/</span> <span class="token keyword">self</span><span class="token punctuation">.</span>total_rows <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>            <span class="token comment">// else: value >= bucket.upper_bound, no match</span>        <span class="token punctuation">&#125;</span>        selectivity    <span class="token punctuation">&#125;</span>    <span class="token comment">/// Estimate selectivity for range predicate (value &lt; X)</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">estimate_lt</span><span class="token punctuation">(</span><span class="token 
operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> value<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Value</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">f64</span> <span class="token punctuation">&#123;</span>        <span class="token number">1.0</span> <span class="token operator">-</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">estimate_gte</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">/// Estimate selectivity for range predicate (value >= X)</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">estimate_gte</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> value<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Value</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">f64</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> selectivity <span class="token operator">=</span> <span class="token number">0.0</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> bucket <span class="token keyword">in</span> <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">.</span>buckets <span class="token punctuation">&#123;</span>            <span class="token keyword">if</span> value <span class="token 
operator">&lt;=</span> <span class="token operator">&amp;</span>bucket<span class="token punctuation">.</span>lower_bound <span class="token punctuation">&#123;</span>                <span class="token comment">// Entire bucket matches</span>                selectivity <span class="token operator">+=</span> bucket<span class="token punctuation">.</span>row_count <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">/</span> <span class="token keyword">self</span><span class="token punctuation">.</span>total_rows <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token keyword">if</span> value <span class="token operator">></span> <span class="token operator">&amp;</span>bucket<span class="token punctuation">.</span>lower_bound <span class="token operator">&amp;&amp;</span> value <span class="token operator">&lt;=</span> <span class="token operator">&amp;</span>bucket<span class="token punctuation">.</span>upper_bound <span class="token punctuation">&#123;</span>                <span class="token comment">// Partial bucket</span>                <span class="token keyword">let</span> bucket_range <span class="token operator">=</span> bucket<span class="token punctuation">.</span>upper_bound<span class="token punctuation">.</span><span class="token function">subtract</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>bucket<span class="token punctuation">.</span>lower_bound<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_f64</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">let</span> matching_range <span class="token 
operator">=</span> bucket<span class="token punctuation">.</span>upper_bound<span class="token punctuation">.</span><span class="token function">subtract</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_f64</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">let</span> fraction <span class="token operator">=</span> <span class="token punctuation">(</span>matching_range <span class="token operator">/</span> bucket_range<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">clamp</span><span class="token punctuation">(</span><span class="token number">0.0</span><span class="token punctuation">,</span> <span class="token number">1.0</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                selectivity <span class="token operator">+=</span> fraction <span class="token operator">*</span> <span class="token punctuation">(</span>bucket<span class="token punctuation">.</span>row_count <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">/</span> <span class="token keyword">self</span><span class="token punctuation">.</span>total_rows <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>        selectivity    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Pros:</strong> Handles skewed data better, because every bucket carries equal weight.</p><p><strong>Cons:</strong> When the data contains many duplicates, bucket boundaries can end up arbitrary.</p><hr /><h3 id="压缩直方图（处理重复）">Compressed Histogram (Handling Duplicates)</h3><p><strong>Handle the most common values (MCVs) separately:</strong></p><pre 
class="language-none"><code class="language-none">Data: [1, 1, 1, 1, 1, 2, 3, 4, 5, 10, 11, 12, 13, 14]
Compressed Histogram:
┌─────────────────────────────────────────────────────────────┐
│ Most Common Values (MCV):                                   │
│   Value 1: frequency &#x3D; 5&#x2F;14 &#x3D; 35.7%                        │
├─────────────────────────────────────────────────────────────┤
│ Histogram (for remaining values):                           │
│   Bucket 1: [2-5]   → 4 rows                                │
│   Bucket 2: [6-10]  → 1 row                                 │
│   Bucket 3: [11-14] → 4 rows                                │
└─────────────────────────────────────────────────────────────┘</code></pre><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token attribute attr-name">#[derive(Debug, Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">CompressedHistogram</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> mcv_list<span class="token punctuation">:</span> <span class="token class-name">MCVList</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> histogram<span class="token punctuation">:</span> <span class="token class-name">EquiDepthHistogram</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> null_fraction<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">MCVList</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> values<span class="token punctuation">:</span> 
<span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token class-name">Value</span><span class="token punctuation">,</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token operator">></span><span class="token punctuation">,</span>  <span class="token comment">// (value, frequency)</span>    <span class="token keyword">pub</span> total_mcv_frequency<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">CompressedHistogram</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">build</span><span class="token punctuation">(</span>values<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Value</span><span class="token punctuation">]</span><span class="token punctuation">,</span> num_buckets<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span> num_mcv<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token keyword">Self</span><span class="token punctuation">,</span> <span class="token class-name">HistogramError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Handle NULLs</span>        <span class="token keyword">let</span> null_count <span class="token operator">=</span> values<span 
class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">filter</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>v<span class="token closure-punctuation punctuation">|</span></span> v<span class="token punctuation">.</span><span class="token function">is_null</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">count</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> null_fraction <span class="token operator">=</span> null_count <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">/</span> values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> non_null_values<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span>_<span class="token operator">></span> <span class="token operator">=</span> values<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">filter</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation 
punctuation">|</span>v<span class="token closure-punctuation punctuation">|</span></span> <span class="token operator">!</span>v<span class="token punctuation">.</span><span class="token function">is_null</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">cloned</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token comment">// Build MCV list</span>        <span class="token keyword">let</span> mcv_list <span class="token operator">=</span> <span class="token keyword">Self</span><span class="token punctuation">::</span><span class="token function">build_mcv_list</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>non_null_values<span class="token punctuation">,</span> num_mcv<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token comment">// Build histogram for non-MCV values</span>        <span class="token keyword">let</span> mcv_values<span class="token punctuation">:</span> <span class="token namespace">std<span class="token punctuation">::</span>collections<span class="token punctuation">::</span></span><span class="token class-name">HashSet</span><span class="token operator">&lt;</span>_<span class="token operator">></span> <span class="token operator">=</span>             mcv_list<span class="token punctuation">.</span>values<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">map</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span><span class="token 
punctuation">(</span>v<span class="token punctuation">,</span> _<span class="token punctuation">)</span><span class="token closure-punctuation punctuation">|</span></span> v<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> non_mcv_values<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span>_<span class="token operator">></span> <span class="token operator">=</span> non_null_values            <span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span>            <span class="token punctuation">.</span><span class="token function">filter</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>v<span class="token closure-punctuation punctuation">|</span></span> <span class="token operator">!</span>mcv_values<span class="token punctuation">.</span><span class="token function">contains</span><span class="token punctuation">(</span>v<span class="token punctuation">)</span><span class="token punctuation">)</span>            <span class="token punctuation">.</span><span class="token function">cloned</span><span class="token punctuation">(</span><span class="token punctuation">)</span>            <span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> histogram <span class="token operator">=</span> <span class="token keyword">if</span> non_mcv_values<span class="token punctuation">.</span><span 
class="token function">is_empty</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token comment">// All values are in MCV, create empty histogram</span>            <span class="token class-name">EquiDepthHistogram</span> <span class="token punctuation">&#123;</span>                num_buckets<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>                total_rows<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>                buckets<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">EquiDepthHistogram</span><span class="token punctuation">::</span><span class="token function">build</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>non_mcv_values<span class="token punctuation">,</span> num_buckets<span class="token punctuation">)</span><span class="token operator">?</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>            mcv_list<span class="token punctuation">,</span>            histogram<span class="token punctuation">,</span>            null_fraction<span class="token punctuation">,</span>        <span 
class="token punctuation">&#125;</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">build_mcv_list</span><span class="token punctuation">(</span>values<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Value</span><span class="token punctuation">]</span><span class="token punctuation">,</span> num_mcv<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">MCVList</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> value_counts<span class="token punctuation">:</span> <span class="token namespace">std<span class="token punctuation">::</span>collections<span class="token punctuation">::</span></span><span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token operator">&amp;</span><span class="token class-name">Value</span><span class="token punctuation">,</span> <span class="token keyword">usize</span><span class="token operator">></span> <span class="token operator">=</span>             <span class="token namespace">std<span class="token punctuation">::</span>collections<span class="token punctuation">::</span></span><span class="token class-name">HashMap</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> value <span class="token keyword">in</span> values <span class="token punctuation">&#123;</span>            <span class="token operator">*</span>value_counts<span 
class="token punctuation">.</span><span class="token function">entry</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">or_insert</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span> <span class="token operator">+=</span> <span class="token number">1</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> mcv<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span>_<span class="token operator">></span> <span class="token operator">=</span> value_counts<span class="token punctuation">.</span><span class="token function">into_iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        mcv<span class="token punctuation">.</span><span class="token function">sort_by</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>a<span class="token punctuation">,</span> b<span class="token closure-punctuation punctuation">|</span></span> b<span class="token number">.1</span><span class="token punctuation">.</span><span class="token function">cmp</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>a<span class="token number">.1</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>  <span class="token comment">// Sort by count descending</span>        <span class="token 
keyword">let</span> total_mcv_frequency <span class="token operator">=</span> mcv<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">take</span><span class="token punctuation">(</span>num_mcv<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">map</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span><span class="token punctuation">(</span>_<span class="token punctuation">,</span> c<span class="token punctuation">)</span><span class="token closure-punctuation punctuation">|</span></span> <span class="token operator">*</span>c <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">sum</span><span class="token punctuation">::</span><span class="token operator">&lt;</span><span class="token keyword">f64</span><span class="token operator">></span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">/</span> values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">;</span>  <span class="token comment">// Sum only the top num_mcv entries that are kept</span>        <span class="token keyword">let</span> mcv_values <span class="token operator">=</span> mcv            <span class="token punctuation">.</span><span class="token function">into_iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span>            <span class="token punctuation">.</span><span class="token function">take</span><span class="token punctuation">(</span>num_mcv<span class="token punctuation">)</span>            <span class="token punctuation">.</span><span class="token function">map</span><span 
class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span><span class="token punctuation">(</span>v<span class="token punctuation">,</span> c<span class="token punctuation">)</span><span class="token closure-punctuation punctuation">|</span></span> <span class="token punctuation">(</span>v<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> c <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">/</span> values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">)</span>            <span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token class-name">MCVList</span> <span class="token punctuation">&#123;</span>            values<span class="token punctuation">:</span> mcv_values<span class="token punctuation">,</span>            total_mcv_frequency<span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">/// Estimate selectivity with MCV awareness</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">estimate_selectivity</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> 
op<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">BinaryOperator</span><span class="token punctuation">,</span> value<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Value</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">f64</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Check MCV first</span>        <span class="token keyword">for</span> <span class="token punctuation">(</span>mcv_val<span class="token punctuation">,</span> frequency<span class="token punctuation">)</span> <span class="token keyword">in</span> <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">.</span>mcv_list<span class="token punctuation">.</span>values <span class="token punctuation">&#123;</span>            <span class="token keyword">if</span> mcv_val <span class="token operator">==</span> value <span class="token punctuation">&#123;</span>                <span class="token keyword">match</span> op <span class="token punctuation">&#123;</span>                    <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Eq</span> <span class="token operator">=></span> <span class="token keyword">return</span> <span class="token operator">*</span>frequency<span class="token punctuation">,</span>                    <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Neq</span> <span class="token operator">=></span> <span class="token keyword">return</span> <span class="token number">1.0</span> <span class="token operator">-</span> frequency<span class="token punctuation">,</span>                    _ <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                       
 <span class="token comment">// For range ops, MCV exact match doesn't help much</span>                        <span class="token comment">// Stop scanning and fall through to histogram estimation</span>                        <span class="token keyword">break</span><span class="token punctuation">;</span>                    <span class="token punctuation">&#125;</span>                <span class="token punctuation">&#125;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>        <span class="token comment">// Not answered by the MCV list, use histogram</span>        <span class="token keyword">let</span> histogram_selectivity <span class="token operator">=</span> <span class="token keyword">match</span> op <span class="token punctuation">&#123;</span>            <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Eq</span> <span class="token operator">=></span> <span class="token keyword">self</span><span class="token punctuation">.</span>histogram<span class="token punctuation">.</span><span class="token function">estimate_eq</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Gt</span> <span class="token operator">=></span> <span class="token keyword">self</span><span class="token punctuation">.</span>histogram<span class="token punctuation">.</span><span class="token function">estimate_gt</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Lt</span> <span class="token operator">=></span> <span class="token keyword">self</span><span 
class="token punctuation">.</span>histogram<span class="token punctuation">.</span><span class="token function">estimate_lt</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Gte</span> <span class="token operator">=></span> <span class="token keyword">self</span><span class="token punctuation">.</span>histogram<span class="token punctuation">.</span><span class="token function">estimate_gte</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Lte</span> <span class="token operator">=></span> <span class="token number">1.0</span> <span class="token operator">-</span> <span class="token keyword">self</span><span class="token punctuation">.</span>histogram<span class="token punctuation">.</span><span class="token function">estimate_gt</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">,</span>            _ <span class="token operator">=></span> <span class="token number">0.5</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token comment">// Adjust for non-MCV portion</span>        <span class="token keyword">let</span> non_mcv_fraction <span class="token operator">=</span> <span class="token number">1.0</span> <span class="token operator">-</span> <span class="token keyword">self</span><span class="token punctuation">.</span>mcv_list<span class="token punctuation">.</span>total_mcv_frequency<span class="token punctuation">;</span>        histogram_selectivity <span 
class="token operator">*</span> non_mcv_fraction    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Pros:</strong> Best of both worlds. Common values are estimated accurately, and the remaining values still get a well-distributed histogram.</p><p><strong>Cons:</strong> More complex, and requires more memory.</p><hr /><h2 id="3-大型表的采样">3 Sampling for Large Tables</h2><h3 id="问题：完整表扫描很昂贵">Problem: Full Table Scans Are Expensive</h3><pre class="language-sql" data-language="sql"><code class="language-sql"><span class="token comment">-- Table: users (100 million rows)</span>
<span class="token comment">-- Building histogram requires sorting all values</span>
<span class="token comment">-- Full scan approach:</span>
<span class="token keyword">SELECT</span> balance <span class="token keyword">FROM</span> users <span class="token keyword">ORDER</span> <span class="token keyword">BY</span> balance<span class="token punctuation">;</span>
<span class="token comment">-- Time: 30 minutes (!)</span>
<span class="token comment">-- I/O: Read entire table</span>
<span class="token comment">-- Not practical for routine ANALYZE</span></code></pre><h3 id="解决方案：统计采样">Solution: Statistical Sampling</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/optimizer/sampling.rs</span><span class="token keyword">use</span> <span class="token namespace">rand<span class="token punctuation">::</span>seq<span class="token punctuation">::</span></span><span class="token class-name">SliceRandom</span><span class="token punctuation">;</span><span class="token keyword">use</span> <span class="token namespace">rand<span class="token punctuation">::</span></span>thread_rng<span class="token punctuation">;</span><span class="token keyword">use</span> <span class="token namespace">rand<span class="token punctuation">::</span></span><span class="token class-name">Rng</span><span class="token punctuation">;</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Sampler</span> <span class="token punctuation">&#123;</span>    sample_size<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>    confidence<span class="token 
punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">Sampler</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">new</span><span class="token punctuation">(</span>sample_size<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span> confidence<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span> sample_size<span class="token punctuation">,</span> confidence <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">/// Reservoir sampling for streaming data</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">reservoir_sample</span><span class="token operator">&lt;</span><span class="token class-name">T</span><span class="token punctuation">:</span> <span class="token class-name">Clone</span><span class="token operator">></span><span class="token punctuation">(</span>        <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span>        stream<span class="token punctuation">:</span> <span class="token keyword">impl</span> <span class="token class-name">Iterator</span><span class="token operator">&lt;</span><span class="token class-name">Item</span> <span class="token operator">=</span> <span class="token 
class-name">T</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">T</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> reservoir <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">with_capacity</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>sample_size<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> count <span class="token operator">=</span> <span class="token number">0</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> item <span class="token keyword">in</span> stream <span class="token punctuation">&#123;</span>            count <span class="token operator">+=</span> <span class="token number">1</span><span class="token punctuation">;</span>            <span class="token keyword">if</span> reservoir<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">&lt;</span> <span class="token keyword">self</span><span class="token punctuation">.</span>sample_size <span class="token punctuation">&#123;</span>                reservoir<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span>item<span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token 
punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>                <span class="token comment">// Replace with probability sample_size / count</span>                <span class="token keyword">let</span> j <span class="token operator">=</span> <span class="token function">thread_rng</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">gen_range</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">..</span>count<span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">if</span> j <span class="token operator">&lt;</span> <span class="token keyword">self</span><span class="token punctuation">.</span>sample_size <span class="token punctuation">&#123;</span>                    reservoir<span class="token punctuation">[</span>j<span class="token punctuation">]</span> <span class="token operator">=</span> item<span class="token punctuation">;</span>                <span class="token punctuation">&#125;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>        reservoir    <span class="token punctuation">&#125;</span>    <span class="token comment">/// Simple random sampling for in-memory data</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">simple_random_sample</span><span class="token operator">&lt;</span><span class="token class-name">T</span><span class="token punctuation">:</span> <span class="token class-name">Clone</span><span class="token operator">></span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> values<span 
class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">T</span><span class="token punctuation">]</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">T</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> rng <span class="token operator">=</span> <span class="token function">thread_rng</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> sample<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">T</span><span class="token operator">></span> <span class="token operator">=</span> values<span class="token punctuation">.</span><span class="token function">to_vec</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        sample<span class="token punctuation">.</span><span class="token function">shuffle</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> rng<span class="token punctuation">)</span><span class="token punctuation">;</span>        sample<span class="token punctuation">.</span><span class="token function">truncate</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>sample_size<span class="token punctuation">)</span><span class="token punctuation">;</span>        sample    <span class="token punctuation">&#125;</span>    <span 
class="token comment">/// Calculate confidence interval for estimate</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">confidence_interval</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> sample_proportion<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span> sample_size<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token punctuation">(</span><span class="token keyword">f64</span><span class="token punctuation">,</span> <span class="token keyword">f64</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// z-score for confidence level (1.96 for 95%)</span>        <span class="token keyword">let</span> z <span class="token operator">=</span> <span class="token keyword">match</span> <span class="token keyword">self</span><span class="token punctuation">.</span>confidence <span class="token punctuation">&#123;</span>            <span class="token number">0.99</span> <span class="token operator">=></span> <span class="token number">2.576</span><span class="token punctuation">,</span>            <span class="token number">0.95</span> <span class="token operator">=></span> <span class="token number">1.96</span><span class="token punctuation">,</span>            <span class="token number">0.90</span> <span class="token operator">=></span> <span class="token number">1.645</span><span class="token punctuation">,</span>            _ <span class="token operator">=></span> <span class="token number">1.96</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span 
class="token punctuation">;</span>        <span class="token keyword">let</span> standard_error <span class="token operator">=</span> <span class="token punctuation">(</span>sample_proportion <span class="token operator">*</span> <span class="token punctuation">(</span><span class="token number">1.0</span> <span class="token operator">-</span> sample_proportion<span class="token punctuation">)</span> <span class="token operator">/</span> sample_size <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">sqrt</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> margin <span class="token operator">=</span> z <span class="token operator">*</span> standard_error<span class="token punctuation">;</span>        <span class="token punctuation">(</span>sample_proportion <span class="token operator">-</span> margin<span class="token punctuation">,</span> sample_proportion <span class="token operator">+</span> margin<span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span><span class="token comment">// Usage in histogram building</span><span class="token keyword">impl</span> <span class="token class-name">EquiDepthHistogram</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">build_from_sample</span><span class="token punctuation">(</span>        values<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Value</span><span class="token punctuation">]</span><span class="token punctuation">,</span>        num_buckets<span 
class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>        sample_size<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>    <span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token keyword">Self</span><span class="token punctuation">,</span> <span class="token class-name">HistogramError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> sampler <span class="token operator">=</span> <span class="token class-name">Sampler</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>sample_size<span class="token punctuation">,</span> <span class="token number">0.95</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> sample <span class="token operator">=</span> sampler<span class="token punctuation">.</span><span class="token function">simple_random_sample</span><span class="token punctuation">(</span>values<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token comment">// Build histogram from sample</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> histogram <span class="token operator">=</span> <span class="token keyword">Self</span><span class="token punctuation">::</span><span class="token function">build</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>sample<span class="token punctuation">,</span> num_buckets<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span 
class="token comment">// Scale row counts to match full table</span>        <span class="token keyword">let</span> scale_factor <span class="token operator">=</span> values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">/</span> sample<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> bucket <span class="token keyword">in</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> histogram<span class="token punctuation">.</span>buckets <span class="token punctuation">&#123;</span>            bucket<span class="token punctuation">.</span>row_count <span class="token operator">=</span> <span class="token punctuation">(</span>bucket<span class="token punctuation">.</span>row_count <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> scale_factor<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">round</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">;</span>            bucket<span class="token punctuation">.</span>cumulative_count <span class="token operator">=</span> <span class="token punctuation">(</span>bucket<span class="token punctuation">.</span>cumulative_count <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> scale_factor<span 
class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">round</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        histogram<span class="token punctuation">.</span>total_rows <span class="token operator">=</span> values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span>histogram<span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><h3 id="采样大小指南">Sample Size Guidelines</h3><table><thead><tr><th>Table size</th><th>Recommended sample</th><th>Worst-case margin of error (95% CI, p = 0.5)</th></tr></thead><tbody><tr><td>&lt; 10K rows</td><td>100% (full scan)</td><td>Exact</td></tr><tr><td>10K-1M rows</td><td>10%</td><td>±3.1% to ±0.3%</td></tr><tr><td>1M-100M rows</td><td>1%</td><td>±1% to ±0.1%</td></tr><tr><td>&gt; 100M rows</td><td>0.1% or 100K rows</td><td>±0.3% or better</td></tr></tbody></table><hr /><h2 id="4-多列统计">4 Multi-Column Statistics</h2><h3 id="问题：列相关性">The Problem: Column Correlation</h3><pre class="language-sql" data-language="sql"><code class="language-sql"><span class="token comment">-- Table: users (1,000,000 rows)</span><span class="token comment">-- Columns: country, city</span><span class="token comment">-- Individual column statistics:</span><span class="token comment">-- country: 'US' = 50%, 'UK' = 30%, 'DE' = 20%</span><span class="token comment">-- city: 'London' = 5%, 'Berlin' = 10%, 'Munich' = 5%</span><span class="token comment">-- Query:</span><span class="token keyword">SELECT</span> <span class="token operator">*</span> 
<span class="token keyword">FROM</span> users <span class="token keyword">WHERE</span> country <span class="token operator">=</span> <span class="token string">'UK'</span> <span class="token operator">AND</span> city <span class="token operator">=</span> <span class="token string">'London'</span><span class="token punctuation">;</span><span class="token comment">-- Optimizer (assuming independence):</span><span class="token comment">-- Selectivity = 0.30 × 0.05 = 0.015 = 1.5%</span><span class="token comment">-- Estimated rows: 15,000</span><span class="token comment">-- Reality (London only exists in UK):</span><span class="token comment">-- Actual selectivity = 5% (all London users are in UK)</span><span class="token comment">-- Actual rows: 50,000</span><span class="token comment">-- Error: 3.3x underestimate!</span></code></pre><h3 id="多列直方图">Multi-Column Histogram</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token attribute attr-name">#[derive(Debug, Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">MultiColumnHistogram</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> columns<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> num_buckets<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> total_rows<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> buckets<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span 
class="token class-name">MultiColumnBucket</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">MultiColumnBucket</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> bounds<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token class-name">Value</span><span class="token punctuation">,</span> <span class="token class-name">Value</span><span class="token punctuation">)</span><span class="token operator">></span><span class="token punctuation">,</span>  <span class="token comment">// (min, max) for each column</span>    <span class="token keyword">pub</span> row_count<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> distinct_count<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">MultiColumnHistogram</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">build</span><span class="token punctuation">(</span>        data<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Value</span><span class="token operator">></span><span class="token 
punctuation">]</span><span class="token punctuation">,</span>  <span class="token comment">// Each row is [col1_value, col2_value, ...]</span>        columns<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token operator">></span><span class="token punctuation">,</span>        num_buckets<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>    <span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token keyword">Self</span><span class="token punctuation">,</span> <span class="token class-name">HistogramError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">if</span> data<span class="token punctuation">.</span><span class="token function">is_empty</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">return</span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">HistogramError</span><span class="token punctuation">::</span><span class="token class-name">EmptyData</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token comment">// Use k-d tree style partitioning for multi-dimensional buckets</span>        <span class="token keyword">let</span> buckets <span class="token operator">=</span> <span class="token keyword">Self</span><span class="token punctuation">::</span><span class="token function">build_kd_histogram</span><span class="token punctuation">(</span>data<span class="token 
punctuation">,</span> num_buckets<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>            columns<span class="token punctuation">,</span>            num_buckets<span class="token punctuation">,</span>            total_rows<span class="token punctuation">:</span> data<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>            buckets<span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">build_kd_histogram</span><span class="token punctuation">(</span>        data<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Value</span><span class="token operator">></span><span class="token punctuation">]</span><span class="token punctuation">,</span>        num_buckets<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>    <span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">MultiColumnBucket</span><span class="token 
operator">></span><span class="token punctuation">,</span> <span class="token class-name">HistogramError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Simplified: dimensions are split round-robin by depth, not by highest variance</span>        <span class="token comment">// Clamp to at least 1 so that num_buckets == 0, or num_buckets exceeding data.len(), cannot yield empty partitions</span>        <span class="token keyword">let</span> target_bucket_size <span class="token operator">=</span> <span class="token punctuation">(</span>data<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">/</span> num_buckets<span class="token punctuation">.</span><span class="token function">max</span><span class="token punctuation">(</span><span class="token number">1</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">max</span><span class="token punctuation">(</span><span class="token number">1</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> buckets <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">Self</span><span class="token punctuation">::</span><span class="token function">partition_data</span><span class="token punctuation">(</span>data<span class="token punctuation">,</span> <span class="token number">0</span><span class="token punctuation">,</span> num_buckets<span class="token punctuation">,</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> buckets<span class="token punctuation">,</span> target_bucket_size<span class="token punctuation">)</span><span class="token 
operator">?</span><span class="token punctuation">;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span>buckets<span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">partition_data</span><span class="token punctuation">(</span>        data<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Value</span><span class="token operator">></span><span class="token punctuation">]</span><span class="token punctuation">,</span>        depth<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>        remaining_buckets<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>        buckets<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">MultiColumnBucket</span><span class="token operator">></span><span class="token punctuation">,</span>        target_size<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>    <span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">HistogramError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token 
keyword">if</span> remaining_buckets <span class="token operator">&lt;=</span> <span class="token number">1</span> <span class="token operator">||</span> data<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">&lt;=</span> target_size <span class="token punctuation">&#123;</span>            <span class="token comment">// Create leaf bucket</span>            <span class="token keyword">let</span> bucket <span class="token operator">=</span> <span class="token keyword">Self</span><span class="token punctuation">::</span><span class="token function">create_bucket</span><span class="token punctuation">(</span>data<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>            buckets<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span>bucket<span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token keyword">return</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token comment">// Pick the split dimension round-robin by depth (k-d tree style), then split at the median</span>        <span class="token keyword">let</span> dim <span class="token operator">=</span> depth <span class="token operator">%</span> data<span class="token punctuation">[</span><span class="token number">0</span><span class="token punctuation">]</span><span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span 
class="token keyword">let</span> <span class="token keyword">mut</span> sorted <span class="token operator">=</span> data<span class="token punctuation">.</span><span class="token function">to_vec</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        sorted<span class="token punctuation">.</span><span class="token function">sort_by</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>a<span class="token punctuation">,</span> b<span class="token closure-punctuation punctuation">|</span></span> a<span class="token punctuation">[</span>dim<span class="token punctuation">]</span><span class="token punctuation">.</span><span class="token function">cmp</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>b<span class="token punctuation">[</span>dim<span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> mid <span class="token operator">=</span> sorted<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">/</span> <span class="token number">2</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> <span class="token punctuation">(</span>left<span class="token punctuation">,</span> right<span class="token punctuation">)</span> <span class="token operator">=</span> sorted<span class="token punctuation">.</span><span class="token function">split_at</span><span class="token punctuation">(</span>mid<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">Self</span><span class="token 
punctuation">::</span><span class="token function">partition_data</span><span class="token punctuation">(</span>left<span class="token punctuation">,</span> depth <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">,</span> remaining_buckets <span class="token operator">/</span> <span class="token number">2</span><span class="token punctuation">,</span> buckets<span class="token punctuation">,</span> target_size<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token keyword">Self</span><span class="token punctuation">::</span><span class="token function">partition_data</span><span class="token punctuation">(</span>right<span class="token punctuation">,</span> depth <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">,</span> remaining_buckets <span class="token operator">-</span> remaining_buckets <span class="token operator">/</span> <span class="token number">2</span><span class="token punctuation">,</span> buckets<span class="token punctuation">,</span> target_size<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">create_bucket</span><span class="token punctuation">(</span>data<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Value</span><span class="token operator">></span><span 
class="token punctuation">]</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">MultiColumnBucket</span><span class="token punctuation">,</span> <span class="token class-name">HistogramError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> num_dims <span class="token operator">=</span> data<span class="token punctuation">[</span><span class="token number">0</span><span class="token punctuation">]</span><span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> bounds <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> dim <span class="token keyword">in</span> <span class="token number">0</span><span class="token punctuation">..</span>num_dims <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> <span class="token keyword">mut</span> values<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span>_<span class="token operator">></span> <span class="token operator">=</span> data<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">map</span><span class="token 
punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>row<span class="token closure-punctuation punctuation">|</span></span> <span class="token operator">&amp;</span>row<span class="token punctuation">[</span>dim<span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            values<span class="token punctuation">.</span><span class="token function">sort</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            bounds<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token punctuation">(</span>values<span class="token punctuation">.</span><span class="token function">first</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">unwrap</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> values<span class="token punctuation">.</span><span class="token function">last</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">unwrap</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token 
punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token keyword">let</span> distinct_count <span class="token operator">=</span> data<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">::</span><span class="token operator">&lt;</span><span class="token namespace">std<span class="token punctuation">::</span>collections<span class="token punctuation">::</span></span><span class="token class-name">HashSet</span><span class="token operator">&lt;</span>_<span class="token operator">>></span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">MultiColumnBucket</span> <span class="token punctuation">&#123;</span>            bounds<span class="token punctuation">,</span>            row_count<span class="token punctuation">:</span> data<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>            distinct_count<span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    
<span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">estimate_selectivity</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> predicates<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token punctuation">(</span><span class="token keyword">usize</span><span class="token punctuation">,</span> <span class="token class-name">BinaryOperator</span><span class="token punctuation">,</span> <span class="token class-name">Value</span><span class="token punctuation">)</span><span class="token punctuation">]</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">f64</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> selectivity <span class="token operator">=</span> <span class="token number">0.0</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> bucket <span class="token keyword">in</span> <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">.</span>buckets <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> bucket_selectivity <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">estimate_bucket_selectivity</span><span class="token punctuation">(</span>bucket<span class="token punctuation">,</span> predicates<span class="token punctuation">)</span><span class="token punctuation">;</span>            selectivity <span class="token operator">+=</span> bucket_selectivity <span class="token operator">*</span> <span class="token 
punctuation">(</span>bucket<span class="token punctuation">.</span>row_count <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">/</span> <span class="token keyword">self</span><span class="token punctuation">.</span>total_rows <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        selectivity    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">estimate_bucket_selectivity</span><span class="token punctuation">(</span>        <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span>        bucket<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">MultiColumnBucket</span><span class="token punctuation">,</span>        predicates<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token punctuation">(</span><span class="token keyword">usize</span><span class="token punctuation">,</span> <span class="token class-name">BinaryOperator</span><span class="token punctuation">,</span> <span class="token class-name">Value</span><span class="token punctuation">)</span><span class="token punctuation">]</span><span class="token punctuation">,</span>    <span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">f64</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> bucket_selectivity <span class="token operator">=</span> <span class="token number">1.0</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> 
<span class="token punctuation">(</span>col_idx<span class="token punctuation">,</span> op<span class="token punctuation">,</span> value<span class="token punctuation">)</span> <span class="token keyword">in</span> predicates <span class="token punctuation">&#123;</span>            <span class="token keyword">if</span> <span class="token operator">*</span>col_idx <span class="token operator">>=</span> bucket<span class="token punctuation">.</span>bounds<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">continue</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>            <span class="token keyword">let</span> <span class="token punctuation">(</span>min<span class="token punctuation">,</span> max<span class="token punctuation">)</span> <span class="token operator">=</span> <span class="token operator">&amp;</span>bucket<span class="token punctuation">.</span>bounds<span class="token punctuation">[</span><span class="token operator">*</span>col_idx<span class="token punctuation">]</span><span class="token punctuation">;</span>            <span class="token keyword">let</span> col_selectivity <span class="token operator">=</span> <span class="token keyword">match</span> op <span class="token punctuation">&#123;</span>                <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Eq</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                    <span class="token keyword">if</span> value <span class="token operator">>=</span> min <span class="token operator">&amp;&amp;</span> value <span class="token operator">&lt;=</span> max <span class="token punctuation">&#123;</span>                        
<span class="token number">1.0</span> <span class="token operator">/</span> bucket<span class="token punctuation">.</span>distinct_count<span class="token punctuation">.</span><span class="token function">max</span><span class="token punctuation">(</span><span class="token number">1</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span>                    <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>                        <span class="token number">0.0</span>                    <span class="token punctuation">&#125;</span>                <span class="token punctuation">&#125;</span>                <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Gt</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                    <span class="token keyword">if</span> value <span class="token operator">&lt;</span> min <span class="token punctuation">&#123;</span>                        <span class="token number">1.0</span>                    <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token keyword">if</span> value <span class="token operator">>=</span> max <span class="token punctuation">&#123;</span>                        <span class="token number">0.0</span>                    <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>                        <span class="token keyword">let</span> range <span class="token operator">=</span> max<span class="token punctuation">.</span><span class="token function">subtract</span><span class="token punctuation">(</span>min<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token 
function">to_f64</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                        <span class="token keyword">let</span> matching <span class="token operator">=</span> max<span class="token punctuation">.</span><span class="token function">subtract</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_f64</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                        <span class="token punctuation">(</span>matching <span class="token operator">/</span> range<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">clamp</span><span class="token punctuation">(</span><span class="token number">0.0</span><span class="token punctuation">,</span> <span class="token number">1.0</span><span class="token punctuation">)</span>                    <span class="token punctuation">&#125;</span>                <span class="token punctuation">&#125;</span>                <span class="token comment">// ... 
handle other operators</span>                _ <span class="token operator">=></span> <span class="token number">0.5</span><span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>            bucket_selectivity <span class="token operator">*=</span> col_selectivity<span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        bucket_selectivity    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><p><strong>When to use multi-column statistics:</strong></p><table><thead><tr><th>Scenario</th><th>Recommendation</th></tr></thead><tbody><tr><td>Columns are independent</td><td>Single-column histograms</td></tr><tr><td>Strong correlation (city → country)</td><td>Multi-column histogram</td></tr><tr><td>Columns frequently filtered together in WHERE</td><td>Multi-column histogram</td></tr><tr><td>High-cardinality column combination</td><td>May not be worth the extra storage and maintenance cost</td></tr></tbody></table><hr /><h2 id="5-在优化器中使用直方图">5 Using Histograms in the Optimizer</h2><h3 id="与成本模型集成">Integrating with the Cost Model</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/optimizer/histogram_selectivity.rs</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">HistogramSelectivityEstimator</span> <span class="token punctuation">&#123;</span>    statistics<span class="token punctuation">:</span> <span class="token class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">StatisticsCatalog</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">HistogramSelectivityEstimator</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token 
function-definition function">new</span><span class="token punctuation">(</span>statistics<span class="token punctuation">:</span> <span class="token class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">StatisticsCatalog</span><span class="token operator">></span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span> statistics <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">estimate</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> table<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">,</span> column<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">,</span> expr<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Expression</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">f64</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> table_stats <span class="token operator">=</span> <span class="token keyword">match</span> <span class="token keyword">self</span><span class="token punctuation">.</span>statistics<span class="token punctuation">.</span><span class="token function">get_table</span><span class="token punctuation">(</span>table<span class="token 
punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Some</span><span class="token punctuation">(</span>stats<span class="token punctuation">)</span> <span class="token operator">=></span> stats<span class="token punctuation">,</span>            <span class="token class-name">None</span> <span class="token operator">=></span> <span class="token keyword">return</span> <span class="token number">0.5</span><span class="token punctuation">,</span>  <span class="token comment">// No stats, use default</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> col_stats <span class="token operator">=</span> <span class="token keyword">match</span> table_stats<span class="token punctuation">.</span>columns<span class="token punctuation">.</span><span class="token function">get</span><span class="token punctuation">(</span>column<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Some</span><span class="token punctuation">(</span>stats<span class="token punctuation">)</span> <span class="token operator">=></span> stats<span class="token punctuation">,</span>            <span class="token class-name">None</span> <span class="token operator">=></span> <span class="token keyword">return</span> <span class="token number">0.5</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token keyword">match</span> <span class="token operator">&amp;</span>col_stats<span class="token punctuation">.</span>histogram <span class="token punctuation">&#123;</span>            <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Histogram</span><span class="token punctuation">::</span><span class="token 
class-name">Compressed</span><span class="token punctuation">(</span>comp<span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">match</span> expr <span class="token punctuation">&#123;</span>                    <span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span> op<span class="token punctuation">,</span> right<span class="token punctuation">,</span> <span class="token punctuation">..</span> <span class="token punctuation">&#125;</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                        comp<span class="token punctuation">.</span><span class="token function">estimate_selectivity</span><span class="token punctuation">(</span>op<span class="token punctuation">,</span> right<span class="token punctuation">)</span>                    <span class="token punctuation">&#125;</span>                    <span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">InList</span> <span class="token punctuation">&#123;</span> list<span class="token punctuation">,</span> <span class="token punctuation">..</span> <span class="token punctuation">&#125;</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                        <span class="token comment">// Sum selectivity for each value</span>                        list<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span>                            <span class="token punctuation">.</span><span class="token function">map</span><span class="token punctuation">(</span><span class="token 
closure-params"><span class="token closure-punctuation punctuation">|</span>v<span class="token closure-punctuation punctuation">|</span></span> comp<span class="token punctuation">.</span><span class="token function">estimate_selectivity</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Eq</span><span class="token punctuation">,</span> v<span class="token punctuation">)</span><span class="token punctuation">)</span>                            <span class="token punctuation">.</span><span class="token function">sum</span><span class="token punctuation">::</span><span class="token operator">&lt;</span><span class="token keyword">f64</span><span class="token operator">></span><span class="token punctuation">(</span><span class="token punctuation">)</span>                            <span class="token punctuation">.</span><span class="token function">min</span><span class="token punctuation">(</span><span class="token number">1.0</span><span class="token punctuation">)</span>                    <span class="token punctuation">&#125;</span>                    <span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">Between</span> <span class="token punctuation">&#123;</span> low<span class="token punctuation">,</span> high<span class="token punctuation">,</span> <span class="token punctuation">..</span> <span class="token punctuation">&#125;</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                        <span class="token keyword">let</span> low_sel <span class="token operator">=</span> comp<span class="token punctuation">.</span><span class="token function">estimate_selectivity</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Gte</span><span class="token punctuation">,</span> low<span class="token punctuation">)</span><span class="token punctuation">;</span>                        <span class="token keyword">let</span> high_sel <span class="token operator">=</span> comp<span class="token 
punctuation">.</span><span class="token function">estimate_selectivity</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Lte</span><span class="token punctuation">,</span> high<span class="token punctuation">)</span><span class="token punctuation">;</span>                        <span class="token punctuation">(</span>low_sel <span class="token operator">+</span> high_sel <span class="token operator">-</span> <span class="token number">1.0</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">max</span><span class="token punctuation">(</span><span class="token number">0.0</span><span class="token punctuation">)</span>                    <span class="token punctuation">&#125;</span>                    _ <span class="token operator">=></span> <span class="token number">0.5</span><span class="token punctuation">,</span>                <span class="token punctuation">&#125;</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Histogram</span><span class="token punctuation">::</span><span class="token class-name">EquiDepth</span><span class="token punctuation">(</span>equi<span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">match</span> expr <span class="token punctuation">&#123;</span>                    <span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span> op<span class="token punctuation">,</span> right<span class="token 
punctuation">,</span> <span class="token punctuation">..</span> <span class="token punctuation">&#125;</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                        <span class="token keyword">match</span> op <span class="token punctuation">&#123;</span>                            <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Eq</span> <span class="token operator">=></span> equi<span class="token punctuation">.</span><span class="token function">estimate_eq</span><span class="token punctuation">(</span>right<span class="token punctuation">)</span><span class="token punctuation">,</span>                            <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Gt</span> <span class="token operator">=></span> equi<span class="token punctuation">.</span><span class="token function">estimate_gt</span><span class="token punctuation">(</span>right<span class="token punctuation">)</span><span class="token punctuation">,</span>                            <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Lt</span> <span class="token operator">=></span> equi<span class="token punctuation">.</span><span class="token function">estimate_lt</span><span class="token punctuation">(</span>right<span class="token punctuation">)</span><span class="token punctuation">,</span>                            <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Gte</span> <span class="token operator">=></span> equi<span class="token punctuation">.</span><span class="token function">estimate_gte</span><span class="token punctuation">(</span>right<span class="token punctuation">)</span><span class="token punctuation">,</span>                       
     <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Lte</span> <span class="token operator">=></span> <span class="token number">1.0</span> <span class="token operator">-</span> equi<span class="token punctuation">.</span><span class="token function">estimate_gt</span><span class="token punctuation">(</span>right<span class="token punctuation">)</span><span class="token punctuation">,</span>                            _ <span class="token operator">=></span> <span class="token number">0.5</span><span class="token punctuation">,</span>                        <span class="token punctuation">&#125;</span>                    <span class="token punctuation">&#125;</span>                    _ <span class="token operator">=></span> <span class="token number">0.5</span><span class="token punctuation">,</span>                <span class="token punctuation">&#125;</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">None</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token comment">// No histogram, fall back to MCV or default</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">estimate_from_mcv</span><span class="token punctuation">(</span>col_stats<span class="token punctuation">,</span> expr<span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">estimate_from_mcv</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> col_stats<span class="token 
punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">ColumnStatistics</span><span class="token punctuation">,</span> expr<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Expression</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">f64</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">match</span> expr <span class="token punctuation">&#123;</span>            <span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span> op<span class="token punctuation">:</span> <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Eq</span><span class="token punctuation">,</span> right<span class="token punctuation">,</span> <span class="token punctuation">..</span> <span class="token punctuation">&#125;</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token comment">// Check if value is in MCV list</span>                <span class="token keyword">for</span> <span class="token punctuation">(</span>mcv_val<span class="token punctuation">,</span> frequency<span class="token punctuation">)</span> <span class="token keyword">in</span> <span class="token operator">&amp;</span>col_stats<span class="token punctuation">.</span>most_common_values <span class="token punctuation">&#123;</span>                    <span class="token keyword">if</span> mcv_val <span class="token operator">==</span> right <span class="token punctuation">&#123;</span>                        <span class="token keyword">return</span> <span class="token operator">*</span>frequency<span class="token punctuation">;</span>                    <span 
class="token punctuation">&#125;</span>                <span class="token punctuation">&#125;</span>                <span class="token comment">// Not in MCV, estimate based on distinct count</span>                <span class="token number">1.0</span> <span class="token operator">/</span> col_stats<span class="token punctuation">.</span>distinct_count<span class="token punctuation">.</span><span class="token function">max</span><span class="token punctuation">(</span><span class="token number">1</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span>            <span class="token punctuation">&#125;</span>            _ <span class="token operator">=></span> <span class="token number">0.5</span><span class="token punctuation">,</span>  <span class="token comment">// Default for other operators</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span><span class="token comment">// Updated cost model using histograms</span><span class="token keyword">impl</span> <span class="token class-name">CostModel</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">estimate_selectivity_with_histograms</span><span class="token punctuation">(</span>        <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span>        estimator<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">HistogramSelectivityEstimator</span><span class="token punctuation">,</span>        table<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">,</span>        expr<span class="token 
punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Expression</span><span class="token punctuation">,</span>    <span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">f64</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">match</span> expr <span class="token punctuation">&#123;</span>            <span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span> op<span class="token punctuation">,</span> left<span class="token punctuation">,</span> right<span class="token punctuation">,</span> <span class="token punctuation">..</span> <span class="token punctuation">&#125;</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token comment">// Find which side is a column reference</span>                <span class="token keyword">if</span> <span class="token keyword">let</span> <span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">Identifier</span><span class="token punctuation">(</span>ident<span class="token punctuation">)</span> <span class="token operator">=</span> left<span class="token punctuation">.</span><span class="token function">as_ref</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                    <span class="token keyword">return</span> estimator<span class="token punctuation">.</span><span class="token function">estimate</span><span class="token punctuation">(</span>table<span class="token punctuation">,</span> <span class="token operator">&amp;</span>ident<span class="token punctuation">.</span>value<span class="token punctuation">,</span> <span class="token operator">&amp;</span><span 
class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span>                        op<span class="token punctuation">:</span> op<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                        left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">Identifier</span><span class="token punctuation">(</span>ident<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                        right<span class="token punctuation">:</span> right<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                    <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token punctuation">&#125;</span>                <span class="token comment">// Default estimate</span>                <span class="token number">0.5</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">InList</span> <span class="token punctuation">&#123;</span> 
expr<span class="token punctuation">,</span> list<span class="token punctuation">,</span> negated <span class="token punctuation">&#125;</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">let</span> base_selectivity <span class="token operator">=</span> list<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span>                    <span class="token punctuation">.</span><span class="token function">map</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>v<span class="token closure-punctuation punctuation">|</span></span> estimator<span class="token punctuation">.</span><span class="token function">estimate</span><span class="token punctuation">(</span>table<span class="token punctuation">,</span> <span class="token string">"column"</span><span class="token punctuation">,</span> v<span class="token punctuation">)</span><span class="token punctuation">)</span>                    <span class="token punctuation">.</span><span class="token function">sum</span><span class="token punctuation">::</span><span class="token operator">&lt;</span><span class="token keyword">f64</span><span class="token operator">></span><span class="token punctuation">(</span><span class="token punctuation">)</span>                    <span class="token punctuation">.</span><span class="token function">min</span><span class="token punctuation">(</span><span class="token number">1.0</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">if</span> <span class="token operator">*</span>negated <span class="token punctuation">&#123;</span> <span class="token number">1.0</span> <span class="token operator">-</span> base_selectivity <span class="token 
punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span> base_selectivity <span class="token punctuation">&#125;</span>            <span class="token punctuation">&#125;</span>            _ <span class="token operator">=></span> <span class="token number">0.5</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="示例：真实查询优化">Example: Real-World Query Optimization</h3><pre class="language-sql" data-language="sql"><code class="language-sql"><span class="token comment">-- Table statistics (from ANALYZE):</span>
<span class="token comment">-- users: 1,000,000 rows</span>
<span class="token comment">-- users.balance: histogram shows 80% have balance &lt; $100</span>

<span class="token comment">-- Query:</span>
<span class="token keyword">SELECT</span> <span class="token operator">*</span> <span class="token keyword">FROM</span> users <span class="token keyword">WHERE</span> balance <span class="token operator">></span> <span class="token number">100</span>

<span class="token comment">-- Without histogram:</span>
<span class="token comment">-- Selectivity estimate: 0.33 (fixed guess)</span>
<span class="token comment">-- Estimated rows: 333,333</span>
<span class="token comment">-- Chosen plan: SeqScan (cost: 5000)</span>

<span class="token comment">-- With histogram:</span>
<span class="token comment">-- Selectivity estimate: 0.20 (from histogram)</span>
<span class="token comment">-- Estimated rows: 200,000</span>
<span class="token comment">-- Chosen plan: IndexScan (cost: 2000) ← Better!</span></code></pre><hr /><h2 id="6-直方图维护">6 Histogram Maintenance</h2><h3 id="何时更新统计">When to Update Statistics</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/optimizer/stats_maintenance.rs</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> 
<span class="token type-definition class-name">StatsMaintenancePolicy</span> <span class="token punctuation">&#123;</span>    auto_analyze_threshold<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>  <span class="token comment">// Rows modified before auto-analyze</span>    auto_analyze_scale_factor<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span>  <span class="token comment">// Fraction of table</span>    last_analyze_threshold<span class="token punctuation">:</span> <span class="token namespace">chrono<span class="token punctuation">::</span></span><span class="token class-name">Duration</span><span class="token punctuation">,</span>  <span class="token comment">// Max time since last analyze</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">Default</span> <span class="token keyword">for</span> <span class="token class-name">StatsMaintenancePolicy</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">default</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>            auto_analyze_threshold<span class="token punctuation">:</span> <span class="token number">50</span><span class="token punctuation">,</span>  <span class="token comment">// PostgreSQL default</span>            auto_analyze_scale_factor<span class="token punctuation">:</span> <span class="token number">0.2</span><span class="token punctuation">,</span>  <span class="token comment">// 20% of table</span>            last_analyze_threshold<span class="token 
punctuation">:</span> <span class="token namespace">chrono<span class="token punctuation">::</span></span><span class="token class-name">Duration</span><span class="token punctuation">::</span><span class="token function">days</span><span class="token punctuation">(</span><span class="token number">7</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">StatsMaintenanceManager</span> <span class="token punctuation">&#123;</span>    catalog<span class="token punctuation">:</span> <span class="token class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">Catalog</span><span class="token operator">></span><span class="token punctuation">,</span>    policy<span class="token punctuation">:</span> <span class="token class-name">StatsMaintenancePolicy</span><span class="token punctuation">,</span>    modification_counts<span class="token punctuation">:</span> <span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token punctuation">,</span> <span class="token keyword">u64</span><span class="token operator">></span><span class="token punctuation">,</span>  <span class="token comment">// table → rows modified</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">StatsMaintenanceManager</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">check_and_analyze</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span 
class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">,</span> table<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token keyword">bool</span><span class="token punctuation">,</span> <span class="token class-name">AnalyzerError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> table_stats <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>catalog<span class="token punctuation">.</span><span class="token function">get_table_statistics</span><span class="token punctuation">(</span>table<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> modified_rows <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>modification_counts<span class="token punctuation">.</span><span class="token function">get</span><span class="token punctuation">(</span>table<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">copied</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">unwrap_or</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> needs_analyze <span class="token operator">=</span> <span class="token keyword">if</span> <span class="token keyword">let</span> <span class="token 
class-name">Some</span><span class="token punctuation">(</span>stats<span class="token punctuation">)</span> <span class="token operator">=</span> table_stats <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> threshold <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>policy<span class="token punctuation">.</span>auto_analyze_threshold                <span class="token operator">+</span> <span class="token punctuation">(</span>stats<span class="token punctuation">.</span>row_count <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>policy<span class="token punctuation">.</span>auto_analyze_scale_factor<span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">;</span>            modified_rows <span class="token operator">>=</span> threshold        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token boolean">true</span>  <span class="token comment">// No stats, need initial analyze</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token keyword">if</span> needs_analyze <span class="token punctuation">&#123;</span>            <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">analyze_table</span><span class="token punctuation">(</span>table<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>            <span class="token keyword">self</span><span class="token punctuation">.</span>modification_counts<span class="token 
punctuation">.</span><span class="token function">insert</span><span class="token punctuation">(</span>table<span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token number">0</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token boolean">true</span><span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token boolean">false</span><span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">record_modification</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">,</span> table<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">,</span> rows_affected<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> count <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>modification_counts<span class="token punctuation">.</span><span class="token function">entry</span><span 
class="token punctuation">(</span>table<span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">or_insert</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token operator">*</span>count <span class="token operator">+=</span> rows_affected<span class="token punctuation">;</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">analyze_table</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> table<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">AnalyzerError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Run ANALYZE on the table</span>        <span class="token keyword">let</span> analyzer <span class="token operator">=</span> <span class="token class-name">StatisticsAnalyzer</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>catalog<span class="token punctuation">.</span><span class="token 
function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> stats <span class="token operator">=</span> analyzer<span class="token punctuation">.</span><span class="token function">analyze_table</span><span class="token punctuation">(</span>table<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>catalog<span class="token punctuation">.</span><span class="token function">store_statistics</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>stats<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="增量直方图更新">Incremental Histogram Updates</h3><p><strong>Problem:</strong> For large tables, a full rebuild is expensive.</p><p><strong>Solution:</strong> Apply incremental updates for small changes.</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token keyword">impl</span> <span class="token class-name">EquiDepthHistogram</span> <span class="token punctuation">&#123;</span>    <span class="token comment">/// Update histogram with new values (for small changes)</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">update</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token 
punctuation">,</span> new_values<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Value</span><span class="token punctuation">]</span><span class="token punctuation">,</span> deleted_values<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Value</span><span class="token punctuation">]</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Update total row count; saturate so over-reported deletes cannot underflow</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>total_rows <span class="token operator">+=</span> new_values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>total_rows <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>total_rows<span class="token punctuation">.</span><span class="token function">saturating_sub</span><span class="token punctuation">(</span>deleted_values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token comment">// For each new value, bump its bucket's row count; cumulative counts grow for that bucket and every later one</span>        <span class="token keyword">for</span> value <span class="token keyword">in</span> new_values <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> <span class="token keyword">mut</span> found <span class="token operator">=</span> <span class="token boolean">false</span><span class="token punctuation">;</span>            <span class="token keyword">for</span> bucket <span class="token keyword">in</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">.</span>buckets <span class="token punctuation">&#123;</span>                <span class="token keyword">if</span> <span class="token operator">!</span>found <span class="token operator">&amp;&amp;</span> value <span class="token operator">>=</span> <span class="token operator">&amp;</span>bucket<span class="token punctuation">.</span>lower_bound <span class="token operator">&amp;&amp;</span> value <span class="token operator">&lt;=</span> <span class="token operator">&amp;</span>bucket<span class="token punctuation">.</span>upper_bound <span class="token punctuation">&#123;</span>                    bucket<span class="token punctuation">.</span>row_count <span class="token operator">+=</span> <span class="token number">1</span><span class="token punctuation">;</span>                    found <span class="token operator">=</span> <span class="token boolean">true</span><span class="token punctuation">;</span>                <span class="token punctuation">&#125;</span>                <span class="token keyword">if</span> found <span class="token punctuation">&#123;</span>                    bucket<span class="token punctuation">.</span>cumulative_count <span class="token operator">+=</span> <span class="token number">1</span><span class="token punctuation">;</span>                <span class="token punctuation">&#125;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>        <span class="token comment">// For deleted values, decrement counts the same way</span>        <span class="token keyword">for</span> value <span class="token keyword">in</span> deleted_values <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> <span class="token keyword">mut</span> found <span class="token operator">=</span> <span class="token boolean">false</span><span class="token punctuation">;</span>            <span class="token keyword">for</span> bucket <span class="token keyword">in</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">.</span>buckets <span class="token punctuation">&#123;</span>                <span class="token keyword">if</span> <span class="token operator">!</span>found <span class="token operator">&amp;&amp;</span> value <span class="token operator">>=</span> <span class="token operator">&amp;</span>bucket<span class="token punctuation">.</span>lower_bound <span class="token operator">&amp;&amp;</span> value <span class="token operator">&lt;=</span> <span class="token operator">&amp;</span>bucket<span class="token punctuation">.</span>upper_bound <span class="token punctuation">&#123;</span>                    bucket<span class="token punctuation">.</span>row_count <span class="token operator">=</span> bucket<span class="token punctuation">.</span>row_count<span class="token punctuation">.</span><span class="token function">saturating_sub</span><span class="token punctuation">(</span><span class="token number">1</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                    found <span class="token operator">=</span> <span class="token boolean">true</span><span class="token punctuation">;</span>                <span class="token punctuation">&#125;</span>                <span class="token keyword">if</span> found <span class="token punctuation">&#123;</span>                    bucket<span class="token punctuation">.</span>cumulative_count <span class="token operator">=</span> bucket<span class="token punctuation">.</span>cumulative_count<span class="token punctuation">.</span><span class="token function">saturating_sub</span><span class="token punctuation">(</span><span class="token number">1</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token punctuation">&#125;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>        <span class="token comment">// If too many changes, trigger full rebuild</span>        <span class="token keyword">let</span> total_changes <span class="token operator">=</span> new_values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">+</span> deleted_values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">if</span> total_changes <span class="token keyword">as</span> <span 
class="token keyword">f64</span> <span class="token operator">/</span> <span class="token keyword">self</span><span class="token punctuation">.</span>total_rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">></span> <span class="token number">0.1</span> <span class="token punctuation">&#123;</span>            <span class="token comment">// Mark for full rebuild</span>            <span class="token keyword">self</span><span class="token punctuation">.</span>needs_rebuild <span class="token operator">=</span> <span class="token boolean">true</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="7-用-Rust-构建的挑战">7 Challenges of Building in Rust</h2><h3 id="挑战-1：值比较">Challenge 1: Value Comparison</h3><p><strong>Problem:</strong> SQL values can have different types (int, float, string, date).</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ Doesn't work - can't compare different types</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">Value</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">Integer</span><span class="token punctuation">(</span><span class="token keyword">i64</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">Float</span><span class="token punctuation">(</span><span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">String</span><span class="token punctuation">(</span><span class="token class-name">String</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">Date</span><span class="token 
punctuation">(</span><span class="token namespace">chrono<span class="token punctuation">::</span></span><span class="token class-name">NaiveDate</span><span class="token punctuation">)</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">Ord</span> <span class="token keyword">for</span> <span class="token class-name">Value</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">cmp</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> other<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">Self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Ordering</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Can't compare Integer with String!</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Solution: Type-Aware Comparison</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ✅ Works - handle type mismatches</span><span class="token keyword">impl</span> <span class="token class-name">Value</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">compare</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> other<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">Self</span><span class="token 
punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Ordering</span><span class="token punctuation">,</span> <span class="token class-name">ValueError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">match</span> <span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">,</span> other<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token punctuation">(</span><span class="token class-name">Value</span><span class="token punctuation">::</span><span class="token class-name">Integer</span><span class="token punctuation">(</span>a<span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Value</span><span class="token punctuation">::</span><span class="token class-name">Integer</span><span class="token punctuation">(</span>b<span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token class-name">Ok</span><span class="token punctuation">(</span>a<span class="token punctuation">.</span><span class="token function">cmp</span><span class="token punctuation">(</span>b<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token punctuation">(</span><span class="token class-name">Value</span><span class="token punctuation">::</span><span class="token class-name">Float</span><span class="token punctuation">(</span>a<span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Value</span><span class="token punctuation">::</span><span class="token class-name">Float</span><span class="token 
punctuation">(</span>b<span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token operator">=></span> a<span class="token punctuation">.</span><span class="token function">partial_cmp</span><span class="token punctuation">(</span>b<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">ok_or</span><span class="token punctuation">(</span><span class="token class-name">ValueError</span><span class="token punctuation">::</span><span class="token class-name">NaN</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token punctuation">(</span><span class="token class-name">Value</span><span class="token punctuation">::</span><span class="token class-name">String</span><span class="token punctuation">(</span>a<span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Value</span><span class="token punctuation">::</span><span class="token class-name">String</span><span class="token punctuation">(</span>b<span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token class-name">Ok</span><span class="token punctuation">(</span>a<span class="token punctuation">.</span><span class="token function">cmp</span><span class="token punctuation">(</span>b<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token punctuation">(</span><span class="token class-name">Value</span><span class="token punctuation">::</span><span class="token class-name">Integer</span><span class="token punctuation">(</span>a<span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Value</span><span class="token punctuation">::</span><span class="token class-name">Float</span><span 
class="token punctuation">(</span>b<span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">(</span><span class="token operator">*</span>a <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">partial_cmp</span><span class="token punctuation">(</span>b<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">ok_or</span><span class="token punctuation">(</span><span class="token class-name">ValueError</span><span class="token punctuation">::</span><span class="token class-name">NaN</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token punctuation">(</span><span class="token class-name">Value</span><span class="token punctuation">::</span><span class="token class-name">Float</span><span class="token punctuation">(</span>a<span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Value</span><span class="token punctuation">::</span><span class="token class-name">Integer</span><span class="token punctuation">(</span>b<span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token operator">=></span> a<span class="token punctuation">.</span><span class="token function">partial_cmp</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token punctuation">(</span><span class="token operator">*</span>b <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">ok_or</span><span class="token punctuation">(</span><span class="token 
class-name">ValueError</span><span class="token punctuation">::</span><span class="token class-name">NaN</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            _ <span class="token operator">=></span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">ValueError</span><span class="token punctuation">::</span><span class="token class-name">TypeMismatch</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="挑战-2：内存使用">Challenge 2: Memory Usage</h3><p><strong>Problem:</strong> Histograms across many columns can consume a lot of memory.</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ Memory explosion</span><span class="token comment">// 1000 tables × 20 columns × 100 buckets × 32 bytes = 64 MB</span></code></pre><p><strong>Solution: Compress bucket boundaries</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ✅ Compressed representation</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">CompressedBucket</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> lower_bound_idx<span class="token punctuation">:</span> <span class="token keyword">u32</span><span class="token punctuation">,</span>  <span class="token comment">// Index into shared value pool</span>    <span class="token keyword">pub</span> upper_bound_idx<span class="token punctuation">:</span> <span class="token keyword">u32</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> row_count<span class="token punctuation">:</span> <span class="token keyword">u32</span><span class="token punctuation">,</span><span 
class="token punctuation">&#125;</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">CompressedHistogram</span> <span class="token punctuation">&#123;</span>    shared_values<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Value</span><span class="token operator">></span><span class="token punctuation">,</span>  <span class="token comment">// Deduplicated values</span>    buckets<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">CompressedBucket</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="挑战-3：线程安全统计">Challenge 3: Thread-Safe Statistics</h3><p><strong>Problem:</strong> Statistics must stay readable during query planning while ANALYZE is updating them.</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ Race condition</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">StatisticsCatalog</span> <span class="token punctuation">&#123;</span>    tables<span class="token punctuation">:</span> <span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token punctuation">,</span> <span class="token class-name">TableStatistics</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Solution: Arc&lt;RwLock&lt;&gt;&gt; for concurrent access</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ✅ Thread-safe</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span 
class="token type-definition class-name">StatisticsCatalog</span> <span class="token punctuation">&#123;</span>    tables<span class="token punctuation">:</span> <span class="token class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">RwLock</span><span class="token operator">&lt;</span><span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token punctuation">,</span> <span class="token class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">TableStatistics</span><span class="token operator">>></span><span class="token operator">>></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">StatisticsCatalog</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">get_table</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> name<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">TableStatistics</span><span class="token operator">>></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> tables <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>tables<span class="token punctuation">.</span><span class="token 
function">read</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">unwrap</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        tables<span class="token punctuation">.</span><span class="token function">get</span><span class="token punctuation">(</span>name<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">cloned</span><span class="token punctuation">(</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">update_table</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> stats<span class="token punctuation">:</span> <span class="token class-name">TableStatistics</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> tables <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>tables<span class="token punctuation">.</span><span class="token function">write</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">unwrap</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        tables<span class="token punctuation">.</span><span class="token function">insert</span><span class="token punctuation">(</span>stats<span class="token punctuation">.</span>table_name<span class="token 
punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Arc</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>stats<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="8-AI-如何加速这项工作">8 How AI Accelerated This Work</h2><h3 id="AI-做对了什么">What AI Got Right</h3><table><thead><tr><th>Task</th><th>AI Contribution</th></tr></thead><tbody><tr><td><strong>Histogram types</strong></td><td>Explained the equi-width vs. equi-depth trade-offs</td></tr><tr><td><strong>MCV handling</strong></td><td>Suggested a separate MCV list for common values</td></tr><tr><td><strong>Sampling</strong></td><td>Reservoir sampling algorithm for streams</td></tr><tr><td><strong>Selectivity formula</strong></td><td>Correct formula for range estimation within a bucket</td></tr></tbody></table><hr /><h3 id="AI-做错了什么">What AI Got Wrong</h3><table><thead><tr><th>Issue</th><th>What Happened</th></tr></thead><tbody><tr><td><strong>Bucket boundaries</strong></td><td>First draft had an off-by-one error in the boundary calculation</td></tr><tr><td><strong>Multi-column correlation</strong></td><td>Initially assumed independence (wrong for correlated columns)</td></tr><tr><td><strong>NULL handling</strong></td><td>Forgot to track the NULL fraction separately</td></tr><tr><td><strong>Incremental updates</strong></td><td>Suggested a full rebuild on any change (too expensive)</td></tr></tbody></table><p><strong>Pattern:</strong> AI handles textbook cases well. Edge cases (boundaries, NULLs, incremental updates) need manual refinement.</p><hr /><h3 id="示例：调试选择性">Example: Debugging Selectivity</h3><p><strong>What I asked AI:</strong></p><blockquote><p>“The histogram shows that 20% of rows have balance &gt; 100, but the optimizer estimates 5%. Why?”</p></blockquote><p><strong>What I learned:</strong></p><ol><li>The MCV list wasn’t being checked first</li><li>The non-MCV fraction adjustment was missing</li><li>The range estimate within a bucket was inverted</li></ol><p><strong>Result:</strong> Fixed selectivity estimation:</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">estimate_selectivity</span><span class="token punctuation">(</span><span class="token 
operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> op<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">BinaryOperator</span><span class="token punctuation">,</span> value<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Value</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">f64</span> <span class="token punctuation">&#123;</span>
    <span class="token comment">// Check MCV first</span>
    <span class="token keyword">for</span> <span class="token punctuation">(</span>mcv_val<span class="token punctuation">,</span> frequency<span class="token punctuation">)</span> <span class="token keyword">in</span> <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">.</span>mcv_list<span class="token punctuation">.</span>values <span class="token punctuation">&#123;</span>
        <span class="token keyword">if</span> mcv_val <span class="token operator">==</span> value <span class="token punctuation">&#123;</span>
            <span class="token keyword">match</span> op <span class="token punctuation">&#123;</span>
                <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Eq</span> <span class="token operator">=></span> <span class="token keyword">return</span> <span class="token operator">*</span>frequency<span class="token punctuation">,</span>
                <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Neq</span> <span class="token operator">=></span> <span class="token keyword">return</span> <span class="token number">1.0</span> <span class="token operator">-</span> <span class="token operator">*</span>frequency<span class="token punctuation">,</span>
                _ <span class="token operator">=></span> <span class="token keyword">break</span><span class="token punctuation">,</span>  <span class="token comment">// Range ops fall through to the histogram instead of returning 0.0</span>
            <span class="token punctuation">&#125;</span>
        <span class="token punctuation">&#125;</span>
    <span class="token punctuation">&#125;</span>
    <span class="token comment">// Use histogram for non-MCV values and range operators</span>
    <span class="token keyword">let</span> histogram_sel <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>histogram<span class="token punctuation">.</span><span class="token function">estimate_gt</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token keyword">let</span> non_mcv_fraction <span class="token operator">=</span> <span class="token number">1.0</span> <span class="token operator">-</span> <span class="token keyword">self</span><span class="token punctuation">.</span>mcv_list<span class="token punctuation">.</span>total_mcv_frequency<span class="token punctuation">;</span>
    histogram_sel <span class="token operator">*</span> non_mcv_fraction  <span class="token comment">// ← Was missing this!</span>
<span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="总结：直方图统计一张图">Summary: Histogram Statistics in One Diagram</h2><pre class="language-MERMAID_BASE64_595" data-language="MERMAID_BASE64_595"><code 
class="language-MERMAID_BASE64_595">Zmxvd2NoYXJ0IFRECiAgICBzdWJncmFwaCAiRGF0YSBDb2xsZWN0aW9uIgogICAgICAgIEFbVGFibGUgRGF0YV0gLS0+IEJ7U2FtcGxlIG9yIEZ1bGw&#x2F;fQogICAgICAgIEIgLS0+fFNtYWxsIHRhYmxlfCBDW0Z1bGwgU2Nhbl0KICAgICAgICBCIC0tPnxMYXJnZSB0YWJsZXwgRFtSZXNlcnZvaXIgU2FtcGxpbmddCiAgICBlbmQKICAgIAogICAgc3ViZ3JhcGggIkhpc3RvZ3JhbSBCdWlsZGluZyIKICAgICAgICBDIC0tPiBFW1NvcnQgVmFsdWVzXQogICAgICAgIEQgLS0+IEUKICAgICAgICBFIC0tPiBGe0J1aWxkIFR5cGU&#x2F;fQogICAgICAgIEYgLS0+fFVuaWZvcm0gZGF0YXwgR1tFcXVpLVdpZHRoXQogICAgICAgIEYgLS0+fFNrZXdlZCBkYXRhfCBIW0VxdWktRGVwdGhdCiAgICAgICAgRiAtLT58V2l0aCBkdXBsaWNhdGVzfCBJW0NvbXByZXNzZWQgKyBNQ1ZdCiAgICBlbmQKICAgIAogICAgc3ViZ3JhcGggIk11bHRpLUNvbHVtbiIKICAgICAgICBKW0NvcnJlbGF0ZWQgQ29sdW1uc10gLS0+IEtbTXVsdGktRGltZW5zaW9uYWwgQnVja2V0c10KICAgIGVuZAogICAgCiAgICBzdWJncmFwaCAiVXNhZ2UgaW4gT3B0aW1pemVyIgogICAgICAgIEggLS0+IExbU2VsZWN0aXZpdHkgRXN0aW1hdGlvbl0KICAgICAgICBJIC0tPiBMCiAgICAgICAgSyAtLT4gTAogICAgICAgIEwgLS0+IE1bQ29zdCBDYWxjdWxhdGlvbl0KICAgICAgICBNIC0tPiBOW1BsYW4gU2VsZWN0aW9uXQogICAgZW5kCiAgICAKICAgIHN1YmdyYXBoICJNYWludGVuYW5jZSIKICAgICAgICBPW0lOU0VSVC9VUERBVEUvREVMRVRFXSAtLT4gUFtUcmFjayBNb2RpZmljYXRpb25zXQogICAgICAgIFAgLS0+IFF7VGhyZXNob2xkIFJlYWNoZWQ&#x2F;fQogICAgICAgIFEgLS0+fFllc3wgUltBdXRvLUFOQUxZWkVdCiAgICAgICAgUSAtLT58Tm98IFNbSW5jcmVtZW50YWwgVXBkYXRlXQogICAgICAgIFIgLS0+IEUKICAgIGVuZAogICAgCiAgICBzdHlsZSBBIGZpbGw6I2UzZjJmZCxzdHJva2U6IzE5NzZkMgogICAgc3R5bGUgTiBmaWxsOiNlOGY1ZTksc3Ryb2tlOiMzODhlM2MKICAgIHN0eWxlIEwgZmlsbDojZmZmM2UwLHN0cm9rZTojZjU3YzAw</code></pre><p><strong>关键要点：</strong></p><table><thead><tr><th>概念</th><th>为什么重要</th></tr></thead><tbody><tr><td><strong>等深直方图</strong></td><td>对于偏斜数据比等宽更好</td></tr><tr><td><strong>MCV 列表</strong></td><td>对于常见值准确</td></tr><tr><td><strong>采样</strong></td><td>使 ANALYZE 对于大型表实用</td></tr><tr><td><strong>多列统计</strong></td><td>捕捉列相关性</td></tr><tr><td><strong>增量更新</strong></td><td>避免对小变更进行完整重建</td></tr><tr><td><strong>线程安全访问</strong></td><td>允许更新期间并发读取</td></tr></tbody></table><hr 
/><p><strong>Further reading:</strong></p><ul><li>PostgreSQL Source: <a href="https://github.com/postgres/postgres/blob/master/src/backend/commands/analyze.c"><code>src/backend/commands/analyze.c</code></a></li><li>“Random Sampling for Histogram Construction” by Vitter et al.</li><li>“Selectivity Estimation for Range Queries” by Ioannidis et al.</li><li>PostgreSQL Documentation: “Statistics Used by the Planner”</li><li>Vaultgres Repository: <a href="https://github.com/neoalienson/Vaultgres">github.com/neoalienson/Vaultgres</a></li></ul>]]></content>
    
    
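    <summary type="html">Part 8 of the Vaultgres journey: a deep dive into histogram statistics. Building equi-depth histograms, handling skewed data, multi-column statistics, and using histograms for accurate selectivity estimation in cost-based optimization.</summary>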
    
    
    
    <category term="Development" scheme="https://neo01.com/categories/Development/"/>
    
    
    <category term="PostgreSQL" scheme="https://neo01.com/tags/PostgreSQL/"/>
    
    <category term="Database Internals" scheme="https://neo01.com/tags/Database-Internals/"/>
    
    <category term="Rust" scheme="https://neo01.com/tags/Rust/"/>
    
  </entry>
  
  <entry>
    <title>Database in Rust: Histogram Statistics for Accurate Selectivity Estimation</title>
    <link href="https://neo01.com/2026/03/Database-Rust-Histogram-Statistics/"/>
    <id>https://neo01.com/2026/03/Database-Rust-Histogram-Statistics/</id>
    <published>2026-03-07T16:00:00.000Z</published>
    <updated>2026-03-14T03:03:03.818Z</updated>
    
    <content type="html"><![CDATA[<p>In <a href="/2026/03/Database-Rust-Query-Optimizer/">Part 7</a>, we built a cost-based optimizer. But there’s a problem.</p><p><strong>Our selectivity estimates are guesses:</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// From Part 7 - simplified (wrong!) estimates</span><span class="token keyword">fn</span> <span class="token function-definition function">estimate_selectivity</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> expr<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Expression</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">f64</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">match</span> expr <span class="token punctuation">&#123;</span>        <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Eq</span> <span class="token operator">=></span> <span class="token number">0.01</span><span class="token punctuation">,</span>   <span class="token comment">// Always 1%? Wrong!</span>        <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Gt</span> <span class="token operator">=></span> <span class="token number">0.33</span><span class="token punctuation">,</span>   <span class="token comment">// Always 33%? Wrong!</span>        _ <span class="token operator">=></span> <span class="token number">0.5</span><span class="token punctuation">,</span>                      <span class="token comment">// Always 50%? 
Very wrong!</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Real data isn’t uniform:</strong></p><pre class="language-none"><code class="language-none">User balances in our database:Balance Distribution:$0-100:     ████████████████████████████████████████  80% of users$100-1K:    ████████                                   15% of users$1K-10K:    ██                                          4% of users$10K-100K:  █                                           1% of usersQuery: SELECT * FROM users WHERE balance &gt; 100Our estimate: 33% of rows (using fixed 0.33)Actual: 20% of rows→ Wrong plan chosen! Index scan would be better than seq scan.</code></pre><p><strong>Solution:</strong> Histograms that capture actual data distribution.</p><p>Today: implementing histogram statistics in Rust for accurate selectivity estimation.</p><hr /><h2 id="1-Why-Histograms-Matter">1 Why Histograms Matter</h2><h3 id="The-Uniform-Distribution-Fallacy">The Uniform Distribution Fallacy</h3><p><strong>Without histograms, optimizers assume uniform distribution:</strong></p><pre class="language-sql" data-language="sql"><code class="language-sql"><span class="token comment">-- Table: users (1,000,000 rows)</span><span class="token comment">-- Column: account_type ('free', 'premium', 'enterprise')</span><span class="token comment">-- Reality:</span><span class="token string">'free'</span>:       <span class="token number">950</span><span class="token punctuation">,</span><span class="token number">000</span> <span class="token keyword">rows</span> <span class="token punctuation">(</span><span class="token number">95</span><span class="token operator">%</span><span class="token punctuation">)</span><span class="token string">'premium'</span>:     <span class="token number">45</span><span class="token punctuation">,</span><span class="token number">000</span> <span class="token keyword">rows</span> <span class="token 
punctuation">(</span><span class="token number">4.5</span><span class="token operator">%</span><span class="token punctuation">)</span><span class="token string">'enterprise'</span>:   <span class="token number">5</span><span class="token punctuation">,</span><span class="token number">000</span> <span class="token keyword">rows</span> <span class="token punctuation">(</span><span class="token number">0.5</span><span class="token operator">%</span><span class="token punctuation">)</span><span class="token comment">-- Query 1:</span><span class="token keyword">SELECT</span> <span class="token operator">*</span> <span class="token keyword">FROM</span> users <span class="token keyword">WHERE</span> account_type <span class="token operator">=</span> <span class="token string">'enterprise'</span><span class="token comment">-- Optimizer estimate (uniform): 1,000,000 / 3 = 333,333 rows</span><span class="token comment">-- Actual: 5,000 rows</span><span class="token comment">-- Error: 67x overestimate!</span><span class="token comment">-- Query 2:</span><span class="token keyword">SELECT</span> <span class="token operator">*</span> <span class="token keyword">FROM</span> users <span class="token keyword">WHERE</span> account_type <span class="token operator">=</span> <span class="token string">'free'</span><span class="token comment">-- Optimizer estimate (uniform): 333,333 rows</span><span class="token comment">-- Actual: 950,000 rows</span><span class="token comment">-- Error: 3x underestimate!</span></code></pre><p><strong>With histograms:</strong> We know the actual distribution.</p><hr /><h3 id="Impact-on-Join-Order">Impact on Join Order</h3><pre class="language-sql" data-language="sql"><code class="language-sql"><span class="token keyword">SELECT</span> u<span class="token punctuation">.</span>name<span class="token punctuation">,</span> o<span class="token punctuation">.</span>total<span class="token keyword">FROM</span> users u<span class="token 
keyword">JOIN</span> orders o <span class="token keyword">ON</span> u<span class="token punctuation">.</span>id <span class="token operator">=</span> o<span class="token punctuation">.</span>user_id<span class="token keyword">WHERE</span> u<span class="token punctuation">.</span>account_type <span class="token operator">=</span> <span class="token string">'enterprise'</span></code></pre><p><strong>Without histogram:</strong></p><pre class="language-none"><code class="language-none">Optimizer thinks &#39;enterprise&#39; &#x3D; 333,333 rowsPlan: SeqScan(users) → Hash Join → SeqScan(orders)Cost: 5000 (wrong!)</code></pre><p><strong>With histogram:</strong></p><pre class="language-none"><code class="language-none">Histogram shows &#39;enterprise&#39; &#x3D; 5,000 rowsPlan: IndexScan(users) → Nested Loop → IndexScan(orders)Cost: 100 (correct!)</code></pre><p><strong>Result:</strong> 50x faster query.</p><hr /><h2 id="2-Histogram-Types">2 Histogram Types</h2><h3 id="Equi-Width-Histograms">Equi-Width Histograms</h3><p><strong>Equal bucket ranges, varying row counts:</strong></p><pre class="language-none"><code class="language-none">Data: [1, 2, 3, 4, 5, 10, 11, 12, 13, 14]Equi-Width (3 buckets, range 1-14):┌─────────────────────────────────────────────────────────────┐│ Bucket 1: [1-5]     → 5 rows  │████████████████████████████││ Bucket 2: [6-10]    → 1 row   │██████                      ││ Bucket 3: [11-14]   → 4 rows  │██████████████████████      │└─────────────────────────────────────────────────────────────┘</code></pre><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token attribute attr-name">#[derive(Debug, Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">EquiWidthHistogram</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> min_value<span class="token punctuation">:</span> <span 
class="token class-name">Value</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> max_value<span class="token punctuation">:</span> <span class="token class-name">Value</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> num_buckets<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> buckets<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">WidthBucket</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">WidthBucket</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> lower_bound<span class="token punctuation">:</span> <span class="token class-name">Value</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> upper_bound<span class="token punctuation">:</span> <span class="token class-name">Value</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> row_count<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> distinct_count<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> null_count<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span 
class="token class-name">EquiWidthHistogram</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">build</span><span class="token punctuation">(</span>values<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Value</span><span class="token punctuation">]</span><span class="token punctuation">,</span> num_buckets<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token keyword">Self</span><span class="token punctuation">,</span> <span class="token class-name">HistogramError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">if</span> values<span class="token punctuation">.</span><span class="token function">is_empty</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">return</span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">HistogramError</span><span class="token punctuation">::</span><span class="token class-name">EmptyData</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token keyword">let</span> <span class="token punctuation">(</span>min<span class="token punctuation">,</span> max<span class="token punctuation">)</span> <span class="token operator">=</span> <span class="token keyword">Self</span><span class="token punctuation">::</span><span class="token 
function">find_min_max</span><span class="token punctuation">(</span>values<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> bucket_width <span class="token operator">=</span> max<span class="token punctuation">.</span><span class="token function">subtract</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>min<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">divide</span><span class="token punctuation">(</span>num_buckets <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> buckets <span class="token operator">=</span> <span class="token macro property">vec!</span><span class="token punctuation">[</span><span class="token class-name">WidthBucket</span> <span class="token punctuation">&#123;</span>            lower_bound<span class="token punctuation">:</span> min<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            upper_bound<span class="token punctuation">:</span> min<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            row_count<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>            distinct_count<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>            null_count<span class="token punctuation">:</span> <span 
class="token number">0</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span> num_buckets<span class="token punctuation">]</span><span class="token punctuation">;</span>        <span class="token comment">// Initialize bucket bounds</span>        <span class="token keyword">for</span> i <span class="token keyword">in</span> <span class="token number">0</span><span class="token punctuation">..</span>num_buckets <span class="token punctuation">&#123;</span>            buckets<span class="token punctuation">[</span>i<span class="token punctuation">]</span><span class="token punctuation">.</span>lower_bound <span class="token operator">=</span> min<span class="token punctuation">.</span><span class="token function">add</span><span class="token punctuation">(</span>bucket_width<span class="token punctuation">.</span><span class="token function">multiply</span><span class="token punctuation">(</span>i <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            buckets<span class="token punctuation">[</span>i<span class="token punctuation">]</span><span class="token punctuation">.</span>upper_bound <span class="token operator">=</span> min<span class="token punctuation">.</span><span class="token function">add</span><span class="token punctuation">(</span>bucket_width<span class="token punctuation">.</span><span class="token function">multiply</span><span class="token punctuation">(</span><span class="token punctuation">(</span>i <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        
<span class="token punctuation">&#125;</span>        <span class="token comment">// Count rows in each bucket</span>        <span class="token keyword">for</span> value <span class="token keyword">in</span> values <span class="token punctuation">&#123;</span>            <span class="token keyword">if</span> value<span class="token punctuation">.</span><span class="token function">is_null</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                buckets<span class="token punctuation">[</span><span class="token number">0</span><span class="token punctuation">]</span><span class="token punctuation">.</span>null_count <span class="token operator">+=</span> <span class="token number">1</span><span class="token punctuation">;</span>                <span class="token keyword">continue</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>            <span class="token keyword">let</span> bucket_idx <span class="token operator">=</span> <span class="token keyword">Self</span><span class="token punctuation">::</span><span class="token function">find_bucket_index</span><span class="token punctuation">(</span>value<span class="token punctuation">,</span> <span class="token operator">&amp;</span>min<span class="token punctuation">,</span> <span class="token operator">&amp;</span>bucket_width<span class="token punctuation">,</span> num_buckets<span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token keyword">if</span> bucket_idx <span class="token operator">&lt;</span> num_buckets <span class="token punctuation">&#123;</span>                buckets<span class="token punctuation">[</span>bucket_idx<span class="token punctuation">]</span><span class="token punctuation">.</span>row_count <span class="token operator">+=</span> <span class="token number">1</span><span class="token 
punctuation">;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>            min_value<span class="token punctuation">:</span> min<span class="token punctuation">,</span>            max_value<span class="token punctuation">:</span> max<span class="token punctuation">,</span>            num_buckets<span class="token punctuation">,</span>            buckets<span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">find_bucket_index</span><span class="token punctuation">(</span>value<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Value</span><span class="token punctuation">,</span> min<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Value</span><span class="token punctuation">,</span> width<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Value</span><span class="token punctuation">,</span> num_buckets<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">usize</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> offset <span class="token operator">=</span> value<span class="token punctuation">.</span><span class="token function">subtract</span><span class="token punctuation">(</span>min<span class="token punctuation">)</span><span class="token 
punctuation">;</span>        <span class="token keyword">let</span> bucket <span class="token operator">=</span> offset<span class="token punctuation">.</span><span class="token function">divide</span><span class="token punctuation">(</span>width<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_f64</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">floor</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">usize</span><span class="token punctuation">;</span>        bucket<span class="token punctuation">.</span><span class="token function">min</span><span class="token punctuation">(</span>num_buckets <span class="token operator">-</span> <span class="token number">1</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Pros:</strong> Simple to build, and accurate when values are spread uniformly across the range.</p><p><strong>Cons:</strong> Poor for skewed data: most buckets end up nearly empty while one bucket holds most of the rows, so estimates inside that bucket are coarse.</p><hr /><h3 id="Equi-Depth-Histograms-PostgreSQL-Style">Equi-Depth Histograms (PostgreSQL Style)</h3><p><strong>Roughly equal row counts per bucket, varying ranges:</strong></p><pre class="language-none"><code class="language-none">Data: [1, 2, 3, 4, 5, 10, 11, 12, 13, 14]

Equi-Depth (3 buckets, ~3-4 rows each):
┌─────────────────────────────────────────────────────────────┐
│ Bucket 1: [1-4]     → 4 rows  │████████████████████████████│
│ Bucket 2: [5-11]    → 3 rows  │████████████████████████████│
│ Bucket 3: [12-14]   → 3 rows  │████████████████████████████│
└─────────────────────────────────────────────────────────────┘</code></pre><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token attribute 
attr-name">#[derive(Debug, Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">EquiDepthHistogram</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> num_buckets<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> total_rows<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> buckets<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">DepthBucket</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">DepthBucket</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> lower_bound<span class="token punctuation">:</span> <span class="token class-name">Value</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> upper_bound<span class="token punctuation">:</span> <span class="token class-name">Value</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> row_count<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> distinct_count<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> cumulative_count<span class="token punctuation">:</span> 
<span class="token keyword">u64</span><span class="token punctuation">,</span>  <span class="token comment">// Rows &lt;= upper_bound</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">EquiDepthHistogram</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">build</span><span class="token punctuation">(</span>values<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Value</span><span class="token punctuation">]</span><span class="token punctuation">,</span> num_buckets<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token keyword">Self</span><span class="token punctuation">,</span> <span class="token class-name">HistogramError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">if</span> values<span class="token punctuation">.</span><span class="token function">is_empty</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">return</span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">HistogramError</span><span class="token punctuation">::</span><span class="token class-name">EmptyData</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token comment">// Sort values</span>        <span class="token 
keyword">let</span> <span class="token keyword">mut</span> sorted_values <span class="token operator">=</span> values<span class="token punctuation">.</span><span class="token function">to_vec</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        sorted_values<span class="token punctuation">.</span><span class="token function">sort</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> total_rows <span class="token operator">=</span> sorted_values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> target_bucket_size <span class="token operator">=</span> <span class="token punctuation">(</span>total_rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">/</span> num_buckets <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">ceil</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">usize</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> buckets <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span 
class="token keyword">let</span> <span class="token keyword">mut</span> cumulative <span class="token operator">=</span> <span class="token number">0u64</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> i <span class="token keyword">in</span> <span class="token number">0</span><span class="token punctuation">..</span>num_buckets <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> start <span class="token operator">=</span> i <span class="token operator">*</span> target_bucket_size<span class="token punctuation">;</span>            <span class="token keyword">let</span> end <span class="token operator">=</span> <span class="token punctuation">(</span><span class="token punctuation">(</span>i <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">)</span> <span class="token operator">*</span> target_bucket_size<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">min</span><span class="token punctuation">(</span>sorted_values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token keyword">if</span> start <span class="token operator">>=</span> sorted_values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">break</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>            <span class="token keyword">let</span> bucket_values <span class="token operator">=</span> <span class="token operator">&amp;</span>sorted_values<span 
class="token punctuation">[</span>start<span class="token punctuation">..</span>end<span class="token punctuation">]</span><span class="token punctuation">;</span>            <span class="token keyword">let</span> row_count <span class="token operator">=</span> bucket_values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">;</span>            <span class="token keyword">let</span> distinct_count <span class="token operator">=</span> bucket_values<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">::</span><span class="token operator">&lt;</span><span class="token namespace">std<span class="token punctuation">::</span>collections<span class="token punctuation">::</span></span><span class="token class-name">HashSet</span><span class="token operator">&lt;</span>_<span class="token operator">>></span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">;</span>            cumulative <span class="token operator">+=</span> row_count<span class="token punctuation">;</span>            buckets<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token class-name">DepthBucket</span> <span class="token punctuation">&#123;</span>                lower_bound<span class="token 
punctuation">:</span> bucket_values<span class="token punctuation">.</span><span class="token function">first</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">unwrap</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                upper_bound<span class="token punctuation">:</span> bucket_values<span class="token punctuation">.</span><span class="token function">last</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">unwrap</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                row_count<span class="token punctuation">,</span>                distinct_count<span class="token punctuation">,</span>                cumulative_count<span class="token punctuation">:</span> cumulative<span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>            num_buckets<span class="token punctuation">,</span>            total_rows<span class="token punctuation">,</span>            buckets<span class="token punctuation">,</span>        <span class="token 
punctuation">&#125;</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">/// Estimate selectivity for equality predicate</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">estimate_eq</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> value<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Value</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">f64</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">for</span> bucket <span class="token keyword">in</span> <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">.</span>buckets <span class="token punctuation">&#123;</span>            <span class="token keyword">if</span> value <span class="token operator">>=</span> <span class="token operator">&amp;</span>bucket<span class="token punctuation">.</span>lower_bound <span class="token operator">&amp;&amp;</span> value <span class="token operator">&lt;=</span> <span class="token operator">&amp;</span>bucket<span class="token punctuation">.</span>upper_bound <span class="token punctuation">&#123;</span>                <span class="token comment">// Assume uniform distribution within bucket</span>                <span class="token keyword">let</span> bucket_selectivity <span class="token operator">=</span> <span class="token number">1.0</span> <span class="token operator">/</span> bucket<span class="token punctuation">.</span>distinct_count<span class="token punctuation">.</span><span class="token function">max</span><span class="token punctuation">(</span><span class="token 
number">1</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">;</span>                <span class="token keyword">return</span> bucket_selectivity <span class="token operator">*</span> <span class="token punctuation">(</span>bucket<span class="token punctuation">.</span>row_count <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">/</span> <span class="token keyword">self</span><span class="token punctuation">.</span>total_rows <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>        <span class="token number">0.0</span>  <span class="token comment">// Value outside histogram range</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">/// Estimate selectivity for range predicate (value > X)</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">estimate_gt</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> value<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Value</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">f64</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> selectivity <span class="token operator">=</span> <span class="token number">0.0</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> bucket <span 
class="token keyword">in</span> <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">.</span>buckets <span class="token punctuation">&#123;</span>            <span class="token keyword">if</span> value <span class="token operator">&lt;</span> <span class="token operator">&amp;</span>bucket<span class="token punctuation">.</span>lower_bound <span class="token punctuation">&#123;</span>                <span class="token comment">// Entire bucket matches</span>                selectivity <span class="token operator">+=</span> bucket<span class="token punctuation">.</span>row_count <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">/</span> <span class="token keyword">self</span><span class="token punctuation">.</span>total_rows <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token keyword">if</span> value <span class="token operator">>=</span> <span class="token operator">&amp;</span>bucket<span class="token punctuation">.</span>lower_bound <span class="token operator">&amp;&amp;</span> value <span class="token operator">&lt;</span> <span class="token operator">&amp;</span>bucket<span class="token punctuation">.</span>upper_bound <span class="token punctuation">&#123;</span>                <span class="token comment">// Partial bucket - assume uniform within bucket</span>                <span class="token keyword">let</span> bucket_range <span class="token operator">=</span> bucket<span class="token punctuation">.</span>upper_bound<span class="token punctuation">.</span><span class="token function">subtract</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>bucket<span class="token punctuation">.</span>lower_bound<span class="token 
punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_f64</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">let</span> matching_range <span class="token operator">=</span> bucket<span class="token punctuation">.</span>upper_bound<span class="token punctuation">.</span><span class="token function">subtract</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_f64</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">let</span> fraction <span class="token operator">=</span> <span class="token punctuation">(</span>matching_range <span class="token operator">/</span> bucket_range<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">clamp</span><span class="token punctuation">(</span><span class="token number">0.0</span><span class="token punctuation">,</span> <span class="token number">1.0</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                selectivity <span class="token operator">+=</span> fraction <span class="token operator">*</span> <span class="token punctuation">(</span>bucket<span class="token punctuation">.</span>row_count <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">/</span> <span class="token keyword">self</span><span class="token punctuation">.</span>total_rows <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>            <span class="token comment">// else: 
value >= bucket.upper_bound, no match</span>        <span class="token punctuation">&#125;</span>        selectivity    <span class="token punctuation">&#125;</span>    <span class="token comment">/// Estimate selectivity for range predicate (value &lt; X)</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">estimate_lt</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> value<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Value</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">f64</span> <span class="token punctuation">&#123;</span>        <span class="token number">1.0</span> <span class="token operator">-</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">estimate_gte</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">/// Estimate selectivity for range predicate (value >= X)</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">estimate_gte</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> value<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Value</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">f64</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> <span class="token 
keyword">mut</span> selectivity <span class="token operator">=</span> <span class="token number">0.0</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> bucket <span class="token keyword">in</span> <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">.</span>buckets <span class="token punctuation">&#123;</span>            <span class="token keyword">if</span> value <span class="token operator">&lt;=</span> <span class="token operator">&amp;</span>bucket<span class="token punctuation">.</span>lower_bound <span class="token punctuation">&#123;</span>                <span class="token comment">// Entire bucket matches</span>                selectivity <span class="token operator">+=</span> bucket<span class="token punctuation">.</span>row_count <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">/</span> <span class="token keyword">self</span><span class="token punctuation">.</span>total_rows <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token keyword">if</span> value <span class="token operator">></span> <span class="token operator">&amp;</span>bucket<span class="token punctuation">.</span>lower_bound <span class="token operator">&amp;&amp;</span> value <span class="token operator">&lt;=</span> <span class="token operator">&amp;</span>bucket<span class="token punctuation">.</span>upper_bound <span class="token punctuation">&#123;</span>                <span class="token comment">// Partial bucket</span>                <span class="token keyword">let</span> bucket_range <span class="token operator">=</span> bucket<span class="token punctuation">.</span>upper_bound<span class="token punctuation">.</span><span class="token 
function">subtract</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>bucket<span class="token punctuation">.</span>lower_bound<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_f64</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">let</span> matching_range <span class="token operator">=</span> bucket<span class="token punctuation">.</span>upper_bound<span class="token punctuation">.</span><span class="token function">subtract</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_f64</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">let</span> fraction <span class="token operator">=</span> <span class="token punctuation">(</span>matching_range <span class="token operator">/</span> bucket_range<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">clamp</span><span class="token punctuation">(</span><span class="token number">0.0</span><span class="token punctuation">,</span> <span class="token number">1.0</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                selectivity <span class="token operator">+=</span> fraction <span class="token operator">*</span> <span class="token punctuation">(</span>bucket<span class="token punctuation">.</span>row_count <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">/</span> <span class="token keyword">self</span><span class="token punctuation">.</span>total_rows <span class="token keyword">as</span> <span class="token 
keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>        selectivity    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Pros:</strong> Better for skewed data: every bucket holds roughly the same number of rows, so dense value ranges get narrower, more precise buckets.</p><p><strong>Cons:</strong> When one value has many duplicates it can spill across several buckets, which makes the bucket boundaries arbitrary.</p><hr /><h3 id="Compressed-Histograms-Handling-Duplicates">Compressed Histograms (Handling Duplicates)</h3><p><strong>Separate handling for most common values (MCV):</strong></p><pre class="language-none"><code class="language-none">Data: [1, 1, 1, 1, 1, 2, 3, 4, 5, 10, 11, 12, 13, 14]
Compressed Histogram:
┌─────────────────────────────────────────────────────────────┐
│ Most Common Values (MCV):                                   │
│   Value 1: frequency &#x3D; 5&#x2F;14 &#x3D; 35.7%                        │
├─────────────────────────────────────────────────────────────┤
│ Histogram (for remaining values):                           │
│   Bucket 1: [2-5]   → 4 rows                                │
│   Bucket 2: [6-10]  → 1 row                                 │
│   Bucket 3: [11-14] → 4 rows                                │
└─────────────────────────────────────────────────────────────┘</code></pre><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token attribute attr-name">#[derive(Debug, Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">CompressedHistogram</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> mcv_list<span class="token punctuation">:</span> <span class="token class-name">MCVList</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> histogram<span 
class="token punctuation">:</span> <span class="token class-name">EquiDepthHistogram</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> null_fraction<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">MCVList</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> values<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token class-name">Value</span><span class="token punctuation">,</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token operator">></span><span class="token punctuation">,</span>  <span class="token comment">// (value, frequency)</span>    <span class="token keyword">pub</span> total_mcv_frequency<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">CompressedHistogram</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">build</span><span class="token punctuation">(</span>values<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Value</span><span class="token punctuation">]</span><span class="token punctuation">,</span> num_buckets<span class="token punctuation">:</span> <span class="token 
keyword">usize</span><span class="token punctuation">,</span> num_mcv<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token keyword">Self</span><span class="token punctuation">,</span> <span class="token class-name">HistogramError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Handle NULLs</span>        <span class="token keyword">let</span> null_count <span class="token operator">=</span> values<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">filter</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>v<span class="token closure-punctuation punctuation">|</span></span> v<span class="token punctuation">.</span><span class="token function">is_null</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">count</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> null_fraction <span class="token operator">=</span> null_count <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">/</span> values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token 
keyword">f64</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> non_null_values<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span>_<span class="token operator">></span> <span class="token operator">=</span> values<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">filter</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>v<span class="token closure-punctuation punctuation">|</span></span> <span class="token operator">!</span>v<span class="token punctuation">.</span><span class="token function">is_null</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">cloned</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token comment">// Build MCV list</span>        <span class="token keyword">let</span> mcv_list <span class="token operator">=</span> <span class="token keyword">Self</span><span class="token punctuation">::</span><span class="token function">build_mcv_list</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>non_null_values<span class="token punctuation">,</span> num_mcv<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token comment">// Build histogram for non-MCV values</span>        <span class="token keyword">let</span> mcv_values<span class="token punctuation">:</span> <span class="token namespace">std<span class="token punctuation">::</span>collections<span 
class="token punctuation">::</span></span><span class="token class-name">HashSet</span><span class="token operator">&lt;</span>_<span class="token operator">></span> <span class="token operator">=</span>             mcv_list<span class="token punctuation">.</span>values<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">map</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span><span class="token punctuation">(</span>v<span class="token punctuation">,</span> _<span class="token punctuation">)</span><span class="token closure-punctuation punctuation">|</span></span> v<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> non_mcv_values<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span>_<span class="token operator">></span> <span class="token operator">=</span> non_null_values            <span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span>            <span class="token punctuation">.</span><span class="token function">filter</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>v<span class="token closure-punctuation punctuation">|</span></span> <span class="token operator">!</span>mcv_values<span class="token punctuation">.</span><span class="token function">contains</span><span class="token 
punctuation">(</span>v<span class="token punctuation">)</span><span class="token punctuation">)</span>            <span class="token punctuation">.</span><span class="token function">cloned</span><span class="token punctuation">(</span><span class="token punctuation">)</span>            <span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> histogram <span class="token operator">=</span> <span class="token keyword">if</span> non_mcv_values<span class="token punctuation">.</span><span class="token function">is_empty</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token comment">// All values are in MCV, create empty histogram</span>            <span class="token class-name">EquiDepthHistogram</span> <span class="token punctuation">&#123;</span>                num_buckets<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>                total_rows<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>                buckets<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">EquiDepthHistogram</span><span class="token punctuation">::</span><span class="token function">build</span><span 
class="token punctuation">(</span><span class="token operator">&amp;</span>non_mcv_values<span class="token punctuation">,</span> num_buckets<span class="token punctuation">)</span><span class="token operator">?</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>            mcv_list<span class="token punctuation">,</span>            histogram<span class="token punctuation">,</span>            null_fraction<span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">build_mcv_list</span><span class="token punctuation">(</span>values<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Value</span><span class="token punctuation">]</span><span class="token punctuation">,</span> num_mcv<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">MCVList</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> value_counts<span class="token punctuation">:</span> <span class="token namespace">std<span class="token punctuation">::</span>collections<span class="token punctuation">::</span></span><span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token operator">&amp;</span><span class="token class-name">Value</span><span class="token punctuation">,</span> <span class="token 
keyword">usize</span><span class="token operator">></span> <span class="token operator">=</span>             <span class="token namespace">std<span class="token punctuation">::</span>collections<span class="token punctuation">::</span></span><span class="token class-name">HashMap</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> value <span class="token keyword">in</span> values <span class="token punctuation">&#123;</span>            <span class="token operator">*</span>value_counts<span class="token punctuation">.</span><span class="token function">entry</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">or_insert</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span> <span class="token operator">+=</span> <span class="token number">1</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> mcv<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span>_<span class="token operator">></span> <span class="token operator">=</span> value_counts<span class="token punctuation">.</span><span class="token function">into_iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        mcv<span class="token punctuation">.</span><span class="token function">sort_by</span><span 
class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>a<span class="token punctuation">,</span> b<span class="token closure-punctuation punctuation">|</span></span> b<span class="token number">.1</span><span class="token punctuation">.</span><span class="token function">cmp</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>a<span class="token number">.1</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>  <span class="token comment">// Sort by count descending</span>        mcv<span class="token punctuation">.</span><span class="token function">truncate</span><span class="token punctuation">(</span>num_mcv<span class="token punctuation">)</span><span class="token punctuation">;</span>  <span class="token comment">// Keep only the top num_mcv entries so the total covers just the MCVs</span>        <span class="token keyword">let</span> total_mcv_frequency <span class="token operator">=</span> mcv<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">map</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span><span class="token punctuation">(</span>_<span class="token punctuation">,</span> c<span class="token punctuation">)</span><span class="token closure-punctuation punctuation">|</span></span> <span class="token operator">*</span>c <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">sum</span><span class="token punctuation">::</span><span class="token operator">&lt;</span><span class="token keyword">f64</span><span class="token operator">></span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">/</span> values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token 
punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> mcv_values <span class="token operator">=</span> mcv            <span class="token punctuation">.</span><span class="token function">into_iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span>            <span class="token punctuation">.</span><span class="token function">take</span><span class="token punctuation">(</span>num_mcv<span class="token punctuation">)</span>            <span class="token punctuation">.</span><span class="token function">map</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span><span class="token punctuation">(</span>v<span class="token punctuation">,</span> c<span class="token punctuation">)</span><span class="token closure-punctuation punctuation">|</span></span> <span class="token punctuation">(</span>v<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> c <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">/</span> values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">)</span>            <span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token class-name">MCVList</span> <span class="token 
punctuation">&#123;</span>            values<span class="token punctuation">:</span> mcv_values<span class="token punctuation">,</span>            total_mcv_frequency<span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">/// Estimate selectivity with MCV awareness</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">estimate_selectivity</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> op<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">BinaryOperator</span><span class="token punctuation">,</span> value<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Value</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">f64</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Check MCV first</span>        <span class="token keyword">for</span> <span class="token punctuation">(</span>mcv_val<span class="token punctuation">,</span> frequency<span class="token punctuation">)</span> <span class="token keyword">in</span> <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">.</span>mcv_list<span class="token punctuation">.</span>values <span class="token punctuation">&#123;</span>            <span class="token keyword">if</span> mcv_val <span class="token operator">==</span> value <span class="token punctuation">&#123;</span>                <span class="token keyword">match</span> op <span class="token punctuation">&#123;</span>                   
 <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Eq</span> <span class="token operator">=></span> <span class="token keyword">return</span> <span class="token operator">*</span>frequency<span class="token punctuation">,</span>                    <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Neq</span> <span class="token operator">=></span> <span class="token keyword">return</span> <span class="token number">1.0</span> <span class="token operator">-</span> frequency<span class="token punctuation">,</span>                    <span class="token comment">// For range ops, MCV exact match doesn't help much</span>                    <span class="token comment">// Stop searching and fall through to histogram estimation</span>                    _ <span class="token operator">=></span> <span class="token keyword">break</span><span class="token punctuation">,</span>                <span class="token punctuation">&#125;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>        <span class="token comment">// No Eq/Neq hit in the MCV list, use histogram</span>        <span class="token keyword">let</span> histogram_selectivity <span class="token operator">=</span> <span class="token keyword">match</span> op <span class="token punctuation">&#123;</span>            <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Eq</span> <span class="token operator">=></span> <span class="token keyword">self</span><span class="token punctuation">.</span>histogram<span class="token punctuation">.</span><span class="token function">estimate_eq</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">,</span>            <span 
class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Gt</span> <span class="token operator">=></span> <span class="token keyword">self</span><span class="token punctuation">.</span>histogram<span class="token punctuation">.</span><span class="token function">estimate_gt</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Lt</span> <span class="token operator">=></span> <span class="token keyword">self</span><span class="token punctuation">.</span>histogram<span class="token punctuation">.</span><span class="token function">estimate_lt</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Gte</span> <span class="token operator">=></span> <span class="token keyword">self</span><span class="token punctuation">.</span>histogram<span class="token punctuation">.</span><span class="token function">estimate_gte</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Lte</span> <span class="token operator">=></span> <span class="token number">1.0</span> <span class="token operator">-</span> <span class="token keyword">self</span><span class="token punctuation">.</span>histogram<span class="token punctuation">.</span><span class="token function">estimate_gt</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token 
punctuation">,</span>            _ <span class="token operator">=></span> <span class="token number">0.5</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token comment">// Scale to the non-MCV share of rows (the histogram covers only non-MCV values)</span>        <span class="token keyword">let</span> non_mcv_fraction <span class="token operator">=</span> <span class="token number">1.0</span> <span class="token operator">-</span> <span class="token keyword">self</span><span class="token punctuation">.</span>mcv_list<span class="token punctuation">.</span>total_mcv_frequency<span class="token punctuation">;</span>        histogram_selectivity <span class="token operator">*</span> non_mcv_fraction    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Pros:</strong> Best of both worlds: accurate for common values, with a good picture of the distribution for the rest.</p><p><strong>Cons:</strong> More complex to build and query, and the MCV list takes extra memory.</p><hr /><h2 id="3-Sampling-for-Large-Tables">3 Sampling for Large Tables</h2><h3 id="The-Problem-Full-Table-Scans-Are-Expensive">The Problem: Full Table Scans Are Expensive</h3><pre class="language-sql" data-language="sql"><code class="language-sql"><span class="token comment">-- Table: users (100 million rows)</span><span class="token comment">-- Building a histogram requires sorting all values</span><span class="token comment">-- Full scan approach:</span><span class="token keyword">SELECT</span> balance <span class="token keyword">FROM</span> users <span class="token keyword">ORDER</span> <span class="token keyword">BY</span> balance<span class="token punctuation">;</span><span class="token comment">-- Time: 30 minutes (!)</span><span class="token comment">-- I/O: Read entire table</span><span class="token comment">-- Not practical for routine ANALYZE</span></code></pre><h3 id="Solution-Statistical-Sampling">Solution: Statistical 
Sampling</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/optimizer/sampling.rs</span><span class="token keyword">use</span> <span class="token namespace">rand<span class="token punctuation">::</span>seq<span class="token punctuation">::</span></span><span class="token class-name">SliceRandom</span><span class="token punctuation">;</span><span class="token keyword">use</span> <span class="token namespace">rand<span class="token punctuation">::</span></span>thread_rng<span class="token punctuation">;</span><span class="token keyword">use</span> <span class="token namespace">rand<span class="token punctuation">::</span></span><span class="token class-name">Rng</span><span class="token punctuation">;</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Sampler</span> <span class="token punctuation">&#123;</span>    sample_size<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>    confidence<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">Sampler</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">new</span><span class="token punctuation">(</span>sample_size<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span> confidence<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span> sample_size<span class="token punctuation">,</span> confidence <span class="token punctuation">&#125;</span>    <span 
class="token punctuation">&#125;</span>    <span class="token comment">/// Reservoir sampling for streaming data</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">reservoir_sample</span><span class="token operator">&lt;</span><span class="token class-name">T</span><span class="token punctuation">:</span> <span class="token class-name">Clone</span><span class="token operator">></span><span class="token punctuation">(</span>        <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span>        stream<span class="token punctuation">:</span> <span class="token keyword">impl</span> <span class="token class-name">Iterator</span><span class="token operator">&lt;</span><span class="token class-name">Item</span> <span class="token operator">=</span> <span class="token class-name">T</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">T</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> reservoir <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">with_capacity</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>sample_size<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> count <span class="token operator">=</span> <span class="token number">0</span><span class="token punctuation">;</span>        
<span class="token keyword">for</span> item <span class="token keyword">in</span> stream <span class="token punctuation">&#123;</span>            count <span class="token operator">+=</span> <span class="token number">1</span><span class="token punctuation">;</span>            <span class="token keyword">if</span> reservoir<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">&lt;</span> <span class="token keyword">self</span><span class="token punctuation">.</span>sample_size <span class="token punctuation">&#123;</span>                reservoir<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span>item<span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>                <span class="token comment">// Replace with probability sample_size / count</span>                <span class="token keyword">let</span> j <span class="token operator">=</span> <span class="token function">thread_rng</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">gen_range</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">..</span>count<span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">if</span> j <span class="token operator">&lt;</span> <span class="token keyword">self</span><span class="token punctuation">.</span>sample_size <span class="token punctuation">&#123;</span>                    reservoir<span class="token punctuation">[</span>j<span class="token punctuation">]</span> <span 
class="token operator">=</span> item<span class="token punctuation">;</span>                <span class="token punctuation">&#125;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>        reservoir    <span class="token punctuation">&#125;</span>    <span class="token comment">/// Simple random sampling for in-memory data</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">simple_random_sample</span><span class="token operator">&lt;</span><span class="token class-name">T</span><span class="token punctuation">:</span> <span class="token class-name">Clone</span><span class="token operator">></span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> values<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">T</span><span class="token punctuation">]</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">T</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> rng <span class="token operator">=</span> <span class="token function">thread_rng</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> sample<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">T</span><span class="token operator">></span> <span 
class="token operator">=</span> values<span class="token punctuation">.</span><span class="token function">to_vec</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        sample<span class="token punctuation">.</span><span class="token function">shuffle</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> rng<span class="token punctuation">)</span><span class="token punctuation">;</span>        sample<span class="token punctuation">.</span><span class="token function">truncate</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>sample_size<span class="token punctuation">)</span><span class="token punctuation">;</span>        sample    <span class="token punctuation">&#125;</span>    <span class="token comment">/// Normal-approximation (Wald) confidence interval for an estimated proportion</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">confidence_interval</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> sample_proportion<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span> sample_size<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token punctuation">(</span><span class="token keyword">f64</span><span class="token punctuation">,</span> <span class="token keyword">f64</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// z-score for the confidence level; unlisted levels fall back to 1.96 (95%)</span>        <span class="token 
keyword">let</span> z <span class="token operator">=</span> <span class="token keyword">match</span> <span class="token keyword">self</span><span class="token punctuation">.</span>confidence <span class="token punctuation">&#123;</span>            <span class="token number">0.99</span> <span class="token operator">=></span> <span class="token number">2.576</span><span class="token punctuation">,</span>            <span class="token number">0.95</span> <span class="token operator">=></span> <span class="token number">1.96</span><span class="token punctuation">,</span>            <span class="token number">0.90</span> <span class="token operator">=></span> <span class="token number">1.645</span><span class="token punctuation">,</span>            _ <span class="token operator">=></span> <span class="token number">1.96</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> standard_error <span class="token operator">=</span> <span class="token punctuation">(</span>sample_proportion <span class="token operator">*</span> <span class="token punctuation">(</span><span class="token number">1.0</span> <span class="token operator">-</span> sample_proportion<span class="token punctuation">)</span> <span class="token operator">/</span> sample_size <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">sqrt</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> margin <span class="token operator">=</span> z <span class="token operator">*</span> standard_error<span class="token punctuation">;</span>        <span class="token punctuation">(</span>sample_proportion <span class="token operator">-</span> 
margin<span class="token punctuation">,</span> sample_proportion <span class="token operator">+</span> margin<span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span><span class="token comment">// Usage in histogram building</span><span class="token keyword">impl</span> <span class="token class-name">EquiDepthHistogram</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">build_from_sample</span><span class="token punctuation">(</span>        values<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Value</span><span class="token punctuation">]</span><span class="token punctuation">,</span>        num_buckets<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>        sample_size<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>    <span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token keyword">Self</span><span class="token punctuation">,</span> <span class="token class-name">HistogramError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> sampler <span class="token operator">=</span> <span class="token class-name">Sampler</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>sample_size<span class="token punctuation">,</span> <span class="token number">0.95</span><span class="token punctuation">)</span><span class="token punctuation">;</span>   
     <span class="token keyword">let</span> sample <span class="token operator">=</span> sampler<span class="token punctuation">.</span><span class="token function">simple_random_sample</span><span class="token punctuation">(</span>values<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token comment">// Build histogram from sample</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> histogram <span class="token operator">=</span> <span class="token keyword">Self</span><span class="token punctuation">::</span><span class="token function">build</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>sample<span class="token punctuation">,</span> num_buckets<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token comment">// Scale row counts to match full table</span>        <span class="token keyword">let</span> scale_factor <span class="token operator">=</span> values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">/</span> sample<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> bucket <span class="token keyword">in</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> histogram<span class="token punctuation">.</span>buckets <span class="token punctuation">&#123;</span>            bucket<span class="token punctuation">.</span>row_count <span 
class="token operator">=</span> <span class="token punctuation">(</span>bucket<span class="token punctuation">.</span>row_count <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> scale_factor<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">round</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">;</span>            bucket<span class="token punctuation">.</span>cumulative_count <span class="token operator">=</span> <span class="token punctuation">(</span>bucket<span class="token punctuation">.</span>cumulative_count <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> scale_factor<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">round</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        histogram<span class="token punctuation">.</span>total_rows <span class="token operator">=</span> values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span>histogram<span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><h3 id="Sample-Size-Guidelines">Sample Size 
Guidelines</h3><table><thead><tr><th>Table Size</th><th>Recommended Sample</th><th>Approx. Margin of Error (95% CI, worst case p = 0.5)</th></tr></thead><tbody><tr><td>&lt; 10K rows</td><td>100% (full scan)</td><td>Exact</td></tr><tr><td>10K-1M rows</td><td>10%</td><td>±3.1% to ±0.3%</td></tr><tr><td>1M-100M rows</td><td>1%</td><td>±1% to ±0.1%</td></tr><tr><td>&gt; 100M rows</td><td>0.1% or 100K rows</td><td>±0.3% or better</td></tr></tbody></table><hr /><h2 id="4-Multi-Column-Statistics">4 Multi-Column Statistics</h2><h3 id="The-Problem-Column-Correlation">The Problem: Column Correlation</h3><pre class="language-sql" data-language="sql"><code class="language-sql"><span class="token comment">-- Table: users (1,000,000 rows)</span><span class="token comment">-- Columns: country, city</span><span class="token comment">-- Individual column statistics:</span><span class="token comment">-- country: 'US' = 50%, 'UK' = 30%, 'DE' = 20%</span><span class="token comment">-- city: 'London' = 5%, 'Berlin' = 10%, 'Munich' = 5%</span><span class="token comment">-- Query:</span><span class="token keyword">SELECT</span> <span class="token operator">*</span> <span class="token keyword">FROM</span> users <span class="token keyword">WHERE</span> country <span class="token operator">=</span> <span class="token string">'UK'</span> <span class="token operator">AND</span> city <span class="token operator">=</span> <span class="token string">'London'</span><span class="token punctuation">;</span><span class="token comment">-- Optimizer (assuming independence):</span><span class="token comment">-- Selectivity = 0.30 × 0.05 = 0.015 = 1.5%</span><span class="token comment">-- Estimated rows: 15,000</span><span class="token comment">-- Reality (London only exists in UK):</span><span class="token comment">-- Actual selectivity = 5% (all London users are in UK)</span><span class="token comment">-- Actual rows: 50,000</span><span class="token comment">-- Error: 3.3x underestimate!</span></code></pre><h3 id="Multi-Column-Histograms">Multi-Column Histograms</h3><pre class="language-rust" 
data-language="rust"><code class="language-rust"><span class="token attribute attr-name">#[derive(Debug, Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">MultiColumnHistogram</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> columns<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> num_buckets<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> total_rows<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> buckets<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">MultiColumnBucket</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">MultiColumnBucket</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> bounds<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token class-name">Value</span><span class="token punctuation">,</span> <span class="token class-name">Value</span><span class="token punctuation">)</span><span class="token operator">></span><span 
class="token punctuation">,</span>  <span class="token comment">// (min, max) for each column</span>    <span class="token keyword">pub</span> row_count<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> distinct_count<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">MultiColumnHistogram</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">build</span><span class="token punctuation">(</span>        data<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Value</span><span class="token operator">></span><span class="token punctuation">]</span><span class="token punctuation">,</span>  <span class="token comment">// Each row is [col1_value, col2_value, ...]</span>        columns<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token operator">></span><span class="token punctuation">,</span>        num_buckets<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>    <span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token keyword">Self</span><span class="token punctuation">,</span> <span class="token class-name">HistogramError</span><span class="token 
operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">if</span> data<span class="token punctuation">.</span><span class="token function">is_empty</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">return</span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">HistogramError</span><span class="token punctuation">::</span><span class="token class-name">EmptyData</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token comment">// Use k-d tree style partitioning for multi-dimensional buckets</span>        <span class="token keyword">let</span> buckets <span class="token operator">=</span> <span class="token keyword">Self</span><span class="token punctuation">::</span><span class="token function">build_kd_histogram</span><span class="token punctuation">(</span>data<span class="token punctuation">,</span> num_buckets<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>            columns<span class="token punctuation">,</span>            num_buckets<span class="token punctuation">,</span>            total_rows<span class="token punctuation">:</span> data<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>            buckets<span class="token punctuation">,</span>        <span 
class="token punctuation">&#125;</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">build_kd_histogram</span><span class="token punctuation">(</span>        data<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Value</span><span class="token operator">></span><span class="token punctuation">]</span><span class="token punctuation">,</span>        num_buckets<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>    <span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">MultiColumnBucket</span><span class="token operator">></span><span class="token punctuation">,</span> <span class="token class-name">HistogramError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Simplified: cycle through dimensions by depth and split at the median (k-d tree style)</span>        <span class="token keyword">let</span> target_bucket_size <span class="token operator">=</span> data<span class="token punctuation">.</span><span class="token 
function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">/</span> num_buckets<span class="token punctuation">;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> buckets <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">Self</span><span class="token punctuation">::</span><span class="token function">partition_data</span><span class="token punctuation">(</span>data<span class="token punctuation">,</span> <span class="token number">0</span><span class="token punctuation">,</span> num_buckets<span class="token punctuation">,</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> buckets<span class="token punctuation">,</span> target_bucket_size<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span>buckets<span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">partition_data</span><span class="token punctuation">(</span>        data<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Value</span><span class="token operator">></span><span class="token punctuation">]</span><span class="token punctuation">,</span>        depth<span class="token punctuation">:</span> <span class="token keyword">usize</span><span 
class="token punctuation">,</span>        remaining_buckets<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>        buckets<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">MultiColumnBucket</span><span class="token operator">></span><span class="token punctuation">,</span>        target_size<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>    <span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">HistogramError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">if</span> remaining_buckets <span class="token operator">==</span> <span class="token number">1</span> <span class="token operator">||</span> data<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">&lt;=</span> target_size <span class="token punctuation">&#123;</span>            <span class="token comment">// Create leaf bucket</span>            <span class="token keyword">let</span> bucket <span class="token operator">=</span> <span class="token keyword">Self</span><span class="token punctuation">::</span><span class="token function">create_bucket</span><span class="token punctuation">(</span>data<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>         
   buckets<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span>bucket<span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token keyword">return</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token comment">// Pick the split dimension round-robin by depth, then split at the median</span>        <span class="token keyword">let</span> dim <span class="token operator">=</span> depth <span class="token operator">%</span> data<span class="token punctuation">[</span><span class="token number">0</span><span class="token punctuation">]</span><span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> sorted <span class="token operator">=</span> data<span class="token punctuation">.</span><span class="token function">to_vec</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        sorted<span class="token punctuation">.</span><span class="token function">sort_by</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>a<span class="token punctuation">,</span> b<span class="token closure-punctuation punctuation">|</span></span> a<span class="token punctuation">[</span>dim<span class="token punctuation">]</span><span class="token punctuation">.</span><span class="token function">cmp</span><span class="token punctuation">(</span><span class="token 
operator">&amp;</span>b<span class="token punctuation">[</span>dim<span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> mid <span class="token operator">=</span> sorted<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">/</span> <span class="token number">2</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> <span class="token punctuation">(</span>left<span class="token punctuation">,</span> right<span class="token punctuation">)</span> <span class="token operator">=</span> sorted<span class="token punctuation">.</span><span class="token function">split_at</span><span class="token punctuation">(</span>mid<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">Self</span><span class="token punctuation">::</span><span class="token function">partition_data</span><span class="token punctuation">(</span>left<span class="token punctuation">,</span> depth <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">,</span> remaining_buckets <span class="token operator">/</span> <span class="token number">2</span><span class="token punctuation">,</span> buckets<span class="token punctuation">,</span> target_size<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token keyword">Self</span><span class="token punctuation">::</span><span class="token function">partition_data</span><span class="token punctuation">(</span>right<span class="token punctuation">,</span> depth <span class="token operator">+</span> <span class="token number">1</span><span class="token 
punctuation">,</span> remaining_buckets <span class="token operator">-</span> remaining_buckets <span class="token operator">/</span> <span class="token number">2</span><span class="token punctuation">,</span> buckets<span class="token punctuation">,</span> target_size<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">create_bucket</span><span class="token punctuation">(</span>data<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Value</span><span class="token operator">></span><span class="token punctuation">]</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">MultiColumnBucket</span><span class="token punctuation">,</span> <span class="token class-name">HistogramError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> num_dims <span class="token operator">=</span> data<span class="token punctuation">[</span><span class="token number">0</span><span class="token punctuation">]</span><span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> <span class="token 
keyword">mut</span> bounds <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> dim <span class="token keyword">in</span> <span class="token number">0</span><span class="token punctuation">..</span>num_dims <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> <span class="token keyword">mut</span> values<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span>_<span class="token operator">></span> <span class="token operator">=</span> data<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">map</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>row<span class="token closure-punctuation punctuation">|</span></span> <span class="token operator">&amp;</span>row<span class="token punctuation">[</span>dim<span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            values<span class="token punctuation">.</span><span class="token function">sort</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            bounds<span class="token punctuation">.</span><span class="token function">push</span><span class="token 
punctuation">(</span><span class="token punctuation">(</span>values<span class="token punctuation">.</span><span class="token function">first</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">unwrap</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> values<span class="token punctuation">.</span><span class="token function">last</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">unwrap</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token keyword">let</span> distinct_count <span class="token operator">=</span> data<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">::</span><span class="token operator">&lt;</span><span class="token namespace">std<span class="token punctuation">::</span>collections<span class="token punctuation">::</span></span><span class="token class-name">HashSet</span><span class="token operator">&lt;</span>_<span class="token operator">>></span><span class="token punctuation">(</span><span class="token 
punctuation">)</span><span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">MultiColumnBucket</span> <span class="token punctuation">&#123;</span>            bounds<span class="token punctuation">,</span>            row_count<span class="token punctuation">:</span> data<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>            distinct_count<span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">estimate_selectivity</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> predicates<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token punctuation">(</span><span class="token keyword">usize</span><span class="token punctuation">,</span> <span class="token class-name">BinaryOperator</span><span class="token punctuation">,</span> <span class="token class-name">Value</span><span class="token punctuation">)</span><span class="token punctuation">]</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token 
keyword">f64</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> selectivity <span class="token operator">=</span> <span class="token number">0.0</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> bucket <span class="token keyword">in</span> <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">.</span>buckets <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> bucket_selectivity <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">estimate_bucket_selectivity</span><span class="token punctuation">(</span>bucket<span class="token punctuation">,</span> predicates<span class="token punctuation">)</span><span class="token punctuation">;</span>            selectivity <span class="token operator">+=</span> bucket_selectivity <span class="token operator">*</span> <span class="token punctuation">(</span>bucket<span class="token punctuation">.</span>row_count <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">/</span> <span class="token keyword">self</span><span class="token punctuation">.</span>total_rows <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        selectivity    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">estimate_bucket_selectivity</span><span class="token punctuation">(</span>        <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span>        bucket<span class="token 
punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">MultiColumnBucket</span><span class="token punctuation">,</span>        predicates<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token punctuation">(</span><span class="token keyword">usize</span><span class="token punctuation">,</span> <span class="token class-name">BinaryOperator</span><span class="token punctuation">,</span> <span class="token class-name">Value</span><span class="token punctuation">)</span><span class="token punctuation">]</span><span class="token punctuation">,</span>    <span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">f64</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> bucket_selectivity <span class="token operator">=</span> <span class="token number">1.0</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> <span class="token punctuation">(</span>col_idx<span class="token punctuation">,</span> op<span class="token punctuation">,</span> value<span class="token punctuation">)</span> <span class="token keyword">in</span> predicates <span class="token punctuation">&#123;</span>            <span class="token keyword">if</span> <span class="token operator">*</span>col_idx <span class="token operator">>=</span> bucket<span class="token punctuation">.</span>bounds<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">continue</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>            <span class="token keyword">let</span> <span 
class="token punctuation">(</span>min<span class="token punctuation">,</span> max<span class="token punctuation">)</span> <span class="token operator">=</span> <span class="token operator">&amp;</span>bucket<span class="token punctuation">.</span>bounds<span class="token punctuation">[</span><span class="token operator">*</span>col_idx<span class="token punctuation">]</span><span class="token punctuation">;</span>            <span class="token keyword">let</span> col_selectivity <span class="token operator">=</span> <span class="token keyword">match</span> op <span class="token punctuation">&#123;</span>                <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Eq</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                    <span class="token keyword">if</span> value <span class="token operator">>=</span> min <span class="token operator">&amp;&amp;</span> value <span class="token operator">&lt;=</span> max <span class="token punctuation">&#123;</span>                        <span class="token number">1.0</span> <span class="token operator">/</span> bucket<span class="token punctuation">.</span>distinct_count<span class="token punctuation">.</span><span class="token function">max</span><span class="token punctuation">(</span><span class="token number">1</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span>                    <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>                        <span class="token number">0.0</span>                    <span class="token punctuation">&#125;</span>                <span class="token punctuation">&#125;</span>                <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token 
class-name">Gt</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                    <span class="token keyword">if</span> value <span class="token operator">&lt;</span> min <span class="token punctuation">&#123;</span>                        <span class="token number">1.0</span>                    <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token keyword">if</span> value <span class="token operator">>=</span> max <span class="token punctuation">&#123;</span>                        <span class="token number">0.0</span>                    <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>                        <span class="token keyword">let</span> range <span class="token operator">=</span> max<span class="token punctuation">.</span><span class="token function">subtract</span><span class="token punctuation">(</span>min<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_f64</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                        <span class="token keyword">let</span> matching <span class="token operator">=</span> max<span class="token punctuation">.</span><span class="token function">subtract</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_f64</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                        <span class="token punctuation">(</span>matching <span class="token operator">/</span> range<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">clamp</span><span class="token 
punctuation">(</span><span class="token number">0.0</span><span class="token punctuation">,</span> <span class="token number">1.0</span><span class="token punctuation">)</span>                    <span class="token punctuation">&#125;</span>                <span class="token punctuation">&#125;</span>                <span class="token comment">// Other operators are not modeled here; use a default selectivity</span>                _ <span class="token operator">=></span> <span class="token number">0.5</span><span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>            bucket_selectivity <span class="token operator">*=</span> col_selectivity<span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        bucket_selectivity    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><p><strong>When to use multi-column stats:</strong></p><table><thead><tr><th>Scenario</th><th>Recommendation</th></tr></thead><tbody><tr><td>Columns are independent</td><td>Single-column histograms</td></tr><tr><td>Strong correlation (city → country)</td><td>Multi-column histogram</td></tr><tr><td>Columns frequently filtered together in WHERE</td><td>Multi-column histogram</td></tr><tr><td>High-cardinality combination</td><td>May not be worth it</td></tr></tbody></table><hr /><h2 id="5-Using-Histograms-in-the-Optimizer">5 Using Histograms in the Optimizer</h2><h3 id="Integrating-with-Cost-Model">Integrating with the Cost Model</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/optimizer/histogram_selectivity.rs</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">HistogramSelectivityEstimator</span> <span class="token punctuation">&#123;</span>    statistics<span class="token punctuation">:</span> <span class="token 
class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">StatisticsCatalog</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">HistogramSelectivityEstimator</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">new</span><span class="token punctuation">(</span>statistics<span class="token punctuation">:</span> <span class="token class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">StatisticsCatalog</span><span class="token operator">></span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span> statistics <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">estimate</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> table<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">,</span> column<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">,</span> expr<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Expression</span><span class="token punctuation">)</span> <span class="token 
punctuation">-></span> <span class="token keyword">f64</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> table_stats <span class="token operator">=</span> <span class="token keyword">match</span> <span class="token keyword">self</span><span class="token punctuation">.</span>statistics<span class="token punctuation">.</span><span class="token function">get_table</span><span class="token punctuation">(</span>table<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Some</span><span class="token punctuation">(</span>stats<span class="token punctuation">)</span> <span class="token operator">=></span> stats<span class="token punctuation">,</span>            <span class="token class-name">None</span> <span class="token operator">=></span> <span class="token keyword">return</span> <span class="token number">0.5</span><span class="token punctuation">,</span>  <span class="token comment">// No stats, use default</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> col_stats <span class="token operator">=</span> <span class="token keyword">match</span> table_stats<span class="token punctuation">.</span>columns<span class="token punctuation">.</span><span class="token function">get</span><span class="token punctuation">(</span>column<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Some</span><span class="token punctuation">(</span>stats<span class="token punctuation">)</span> <span class="token operator">=></span> stats<span class="token punctuation">,</span>            <span class="token class-name">None</span> <span class="token operator">=></span> <span class="token keyword">return</span> <span class="token number">0.5</span><span class="token punctuation">,</span>        <span 
class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token keyword">match</span> <span class="token operator">&amp;</span>col_stats<span class="token punctuation">.</span>histogram <span class="token punctuation">&#123;</span>            <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Histogram</span><span class="token punctuation">::</span><span class="token class-name">Compressed</span><span class="token punctuation">(</span>comp<span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">match</span> expr <span class="token punctuation">&#123;</span>                    <span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span> op<span class="token punctuation">,</span> right<span class="token punctuation">,</span> <span class="token punctuation">..</span> <span class="token punctuation">&#125;</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                        comp<span class="token punctuation">.</span><span class="token function">estimate_selectivity</span><span class="token punctuation">(</span>op<span class="token punctuation">,</span> right<span class="token punctuation">)</span>                    <span class="token punctuation">&#125;</span>                    <span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">InList</span> <span class="token punctuation">&#123;</span> list<span class="token punctuation">,</span> <span class="token punctuation">..</span> <span class="token punctuation">&#125;</span> <span class="token operator">=></span> <span class="token 
punctuation">&#123;</span>                        <span class="token comment">// Sum the selectivity of each value, capped at 1.0</span>                        list<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span>                            <span class="token punctuation">.</span><span class="token function">map</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>v<span class="token closure-punctuation punctuation">|</span></span> comp<span class="token punctuation">.</span><span class="token function">estimate_selectivity</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Eq</span><span class="token punctuation">,</span> v<span class="token punctuation">)</span><span class="token punctuation">)</span>                            <span class="token punctuation">.</span><span class="token function">sum</span><span class="token punctuation">::</span><span class="token operator">&lt;</span><span class="token keyword">f64</span><span class="token operator">></span><span class="token punctuation">(</span><span class="token punctuation">)</span>                            <span class="token punctuation">.</span><span class="token function">min</span><span class="token punctuation">(</span><span class="token number">1.0</span><span class="token punctuation">)</span>                    <span class="token punctuation">&#125;</span>                    <span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">Between</span> <span class="token punctuation">&#123;</span> low<span class="token punctuation">,</span> high<span class="token punctuation">,</span> <span class="token punctuation">..</span> <span class="token punctuation">&#125;</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                        <span class="token keyword">let</span> low_sel <span class="token operator">=</span> comp<span class="token punctuation">.</span><span class="token function">estimate_selectivity</span><span class="token 
punctuation">(</span><span class="token operator">&amp;</span><span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Gte</span><span class="token punctuation">,</span> low<span class="token punctuation">)</span><span class="token punctuation">;</span>                        <span class="token keyword">let</span> high_sel <span class="token operator">=</span> comp<span class="token punctuation">.</span><span class="token function">estimate_selectivity</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Lte</span><span class="token punctuation">,</span> high<span class="token punctuation">)</span><span class="token punctuation">;</span>                        <span class="token punctuation">(</span>low_sel <span class="token operator">+</span> high_sel <span class="token operator">-</span> <span class="token number">1.0</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">max</span><span class="token punctuation">(</span><span class="token number">0.0</span><span class="token punctuation">)</span>                    <span class="token punctuation">&#125;</span>                    _ <span class="token operator">=></span> <span class="token number">0.5</span><span class="token punctuation">,</span>                <span class="token punctuation">&#125;</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Histogram</span><span class="token punctuation">::</span><span class="token class-name">EquiDepth</span><span class="token punctuation">(</span>equi<span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token 
operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">match</span> expr <span class="token punctuation">&#123;</span>                    <span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span> op<span class="token punctuation">,</span> right<span class="token punctuation">,</span> <span class="token punctuation">..</span> <span class="token punctuation">&#125;</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                        <span class="token keyword">match</span> op <span class="token punctuation">&#123;</span>                            <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Eq</span> <span class="token operator">=></span> equi<span class="token punctuation">.</span><span class="token function">estimate_eq</span><span class="token punctuation">(</span>right<span class="token punctuation">)</span><span class="token punctuation">,</span>                            <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Gt</span> <span class="token operator">=></span> equi<span class="token punctuation">.</span><span class="token function">estimate_gt</span><span class="token punctuation">(</span>right<span class="token punctuation">)</span><span class="token punctuation">,</span>                            <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Lt</span> <span class="token operator">=></span> equi<span class="token punctuation">.</span><span class="token function">estimate_lt</span><span class="token punctuation">(</span>right<span class="token punctuation">)</span><span class="token punctuation">,</span> 
                           <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Gte</span> <span class="token operator">=></span> equi<span class="token punctuation">.</span><span class="token function">estimate_gte</span><span class="token punctuation">(</span>right<span class="token punctuation">)</span><span class="token punctuation">,</span>                            <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Lte</span> <span class="token operator">=></span> <span class="token number">1.0</span> <span class="token operator">-</span> equi<span class="token punctuation">.</span><span class="token function">estimate_gt</span><span class="token punctuation">(</span>right<span class="token punctuation">)</span><span class="token punctuation">,</span>                            _ <span class="token operator">=></span> <span class="token number">0.5</span><span class="token punctuation">,</span>                        <span class="token punctuation">&#125;</span>                    <span class="token punctuation">&#125;</span>                    _ <span class="token operator">=></span> <span class="token number">0.5</span><span class="token punctuation">,</span>                <span class="token punctuation">&#125;</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">None</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token comment">// No histogram, fall back to MCV or default</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">estimate_from_mcv</span><span class="token punctuation">(</span>col_stats<span class="token punctuation">,</span> expr<span class="token punctuation">)</span>            <span class="token 
punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">estimate_from_mcv</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> col_stats<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">ColumnStatistics</span><span class="token punctuation">,</span> expr<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Expression</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">f64</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">match</span> expr <span class="token punctuation">&#123;</span>            <span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span> op<span class="token punctuation">:</span> <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Eq</span><span class="token punctuation">,</span> right<span class="token punctuation">,</span> <span class="token punctuation">..</span> <span class="token punctuation">&#125;</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token comment">// Check if value is in MCV list</span>                <span class="token keyword">for</span> <span class="token punctuation">(</span>mcv_val<span class="token punctuation">,</span> frequency<span class="token punctuation">)</span> <span class="token keyword">in</span> <span class="token operator">&amp;</span>col_stats<span 
class="token punctuation">.</span>most_common_values <span class="token punctuation">&#123;</span>                    <span class="token keyword">if</span> mcv_val <span class="token operator">==</span> right <span class="token punctuation">&#123;</span>                        <span class="token keyword">return</span> <span class="token operator">*</span>frequency<span class="token punctuation">;</span>                    <span class="token punctuation">&#125;</span>                <span class="token punctuation">&#125;</span>                <span class="token comment">// Not in MCV, estimate based on distinct count</span>                <span class="token number">1.0</span> <span class="token operator">/</span> col_stats<span class="token punctuation">.</span>distinct_count<span class="token punctuation">.</span><span class="token function">max</span><span class="token punctuation">(</span><span class="token number">1</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span>            <span class="token punctuation">&#125;</span>            _ <span class="token operator">=></span> <span class="token number">0.5</span><span class="token punctuation">,</span>  <span class="token comment">// Default for other operators</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span><span class="token comment">// Updated cost model using histograms</span><span class="token keyword">impl</span> <span class="token class-name">CostModel</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">estimate_selectivity_with_histograms</span><span class="token punctuation">(</span>        <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token 
punctuation">,</span>        estimator<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">HistogramSelectivityEstimator</span><span class="token punctuation">,</span>        table<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">,</span>        expr<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Expression</span><span class="token punctuation">,</span>    <span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">f64</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">match</span> expr <span class="token punctuation">&#123;</span>            <span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span> op<span class="token punctuation">,</span> left<span class="token punctuation">,</span> right<span class="token punctuation">,</span> <span class="token punctuation">..</span> <span class="token punctuation">&#125;</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token comment">// Find which side is a column reference</span>                <span class="token keyword">if</span> <span class="token keyword">let</span> <span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">Identifier</span><span class="token punctuation">(</span>ident<span class="token punctuation">)</span> <span class="token operator">=</span> left<span class="token punctuation">.</span><span class="token function">as_ref</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token 
punctuation">&#123;</span>                    <span class="token keyword">return</span> estimator<span class="token punctuation">.</span><span class="token function">estimate</span><span class="token punctuation">(</span>table<span class="token punctuation">,</span> <span class="token operator">&amp;</span>ident<span class="token punctuation">.</span>value<span class="token punctuation">,</span> <span class="token operator">&amp;</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span>                        op<span class="token punctuation">:</span> op<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                        left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">Identifier</span><span class="token punctuation">(</span>ident<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                        right<span class="token punctuation">:</span> right<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                    <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token 
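;</span>                <span class="token punctuation">&#125;</span>                <span class="token comment">// Default estimate">
punctuation">;</span>                <span class="token punctuation">&#125;</span>                <span class="token comment">// Default estimate</span>                <span class="token number">0.5</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">InList</span> <span class="token punctuation">&#123;</span> expr<span class="token punctuation">,</span> list<span class="token punctuation">,</span> negated <span class="token punctuation">&#125;</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token comment">// Resolve the column from the left-hand side; otherwise use the default estimate</span>                <span class="token keyword">let</span> column <span class="token operator">=</span> <span class="token keyword">match</span> expr<span class="token punctuation">.</span><span class="token function">as_ref</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                    <span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">Identifier</span><span class="token punctuation">(</span>ident<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token operator">&amp;</span>ident<span class="token punctuation">.</span>value<span class="token punctuation">,</span>                    _ <span class="token operator">=></span> <span class="token keyword">return</span> <span class="token number">0.5</span><span class="token punctuation">,</span>                <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>                <span class="token comment">// Estimate each list item as an equality predicate on that column</span>                <span class="token keyword">let</span> base_selectivity <span class="token operator">=</span> list<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span>                    <span class="token punctuation">.</span><span class="token function">map</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>v<span class="token closure-punctuation punctuation">|</span></span> estimator<span class="token punctuation">.</span><span class="token function">estimate</span><span class="token punctuation">(</span>table<span class="token punctuation">,</span> column<span class="token punctuation">,</span> <span class="token operator">&amp;</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span>                        op<span class="token punctuation">:</span> <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Eq</span><span class="token punctuation">,</span>                        left<span class="token punctuation">:</span> expr<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                        right<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>v<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                    <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">)</span>                    <span class="token punctuation">.</span><span class="token function">sum</span><span class="token punctuation">::</span><span class="token operator">&lt;</span><span class="token keyword">f64</span><span class="token operator">></span><span class="token punctuation">(</span><span class="token punctuation">)</span>                    <span class="token 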
punctuation">.</span><span class="token function">min</span><span class="token punctuation">(</span><span class="token number">1.0</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">if</span> <span class="token operator">*</span>negated <span class="token punctuation">&#123;</span> <span class="token number">1.0</span> <span class="token operator">-</span> base_selectivity <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span> base_selectivity <span class="token punctuation">&#125;</span>            <span class="token punctuation">&#125;</span>            _ <span class="token operator">=></span> <span class="token number">0.5</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="Example-Real-Query-Optimization">Example: Real Query Optimization</h3><pre class="language-sql" data-language="sql"><code class="language-sql"><span class="token comment">-- Table statistics (from ANALYZE):</span><span class="token comment">-- users: 1,000,000 rows</span><span class="token comment">-- users.balance: histogram shows 80% have balance &lt; $100</span><span class="token comment">-- Query:</span><span class="token keyword">SELECT</span> <span class="token operator">*</span> <span class="token keyword">FROM</span> users <span class="token keyword">WHERE</span> balance <span class="token operator">></span> <span class="token number">100</span><span class="token comment">-- Without histogram:</span><span class="token comment">-- Selectivity estimate: 0.33 (fixed guess)</span><span class="token comment">-- Estimated rows: 333,333</span><span class="token comment">-- Chosen plan: SeqScan (cost: 5000)</span><span class="token comment">-- With histogram:</span><span 
class="token comment">-- Selectivity estimate: 0.20 (from histogram)</span><span class="token comment">-- Estimated rows: 200,000</span><span class="token comment">-- Chosen plan: IndexScan (cost: 2000) ← Better!</span></code></pre><hr /><h2 id="6-Histogram-Maintenance">6 Histogram Maintenance</h2><h3 id="When-to-Update-Statistics">When to Update Statistics</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/optimizer/stats_maintenance.rs</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">StatsMaintenancePolicy</span> <span class="token punctuation">&#123;</span>    auto_analyze_threshold<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>  <span class="token comment">// Rows modified before auto-analyze</span>    auto_analyze_scale_factor<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span>  <span class="token comment">// Fraction of table</span>    last_analyze_threshold<span class="token punctuation">:</span> <span class="token namespace">chrono<span class="token punctuation">::</span></span><span class="token class-name">Duration</span><span class="token punctuation">,</span>  <span class="token comment">// Max time since last analyze</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">Default</span> <span class="token keyword">for</span> <span class="token class-name">StatsMaintenancePolicy</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">default</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">Self</span> <span 
class="token punctuation">&#123;</span>        <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>            auto_analyze_threshold<span class="token punctuation">:</span> <span class="token number">50</span><span class="token punctuation">,</span>  <span class="token comment">// PostgreSQL default</span>            auto_analyze_scale_factor<span class="token punctuation">:</span> <span class="token number">0.2</span><span class="token punctuation">,</span>  <span class="token comment">// 20% of table</span>            last_analyze_threshold<span class="token punctuation">:</span> <span class="token namespace">chrono<span class="token punctuation">::</span></span><span class="token class-name">Duration</span><span class="token punctuation">::</span><span class="token function">days</span><span class="token punctuation">(</span><span class="token number">7</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">StatsMaintenanceManager</span> <span class="token punctuation">&#123;</span>    catalog<span class="token punctuation">:</span> <span class="token class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">Catalog</span><span class="token operator">></span><span class="token punctuation">,</span>    policy<span class="token punctuation">:</span> <span class="token class-name">StatsMaintenancePolicy</span><span class="token punctuation">,</span>    modification_counts<span class="token punctuation">:</span> <span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token punctuation">,</span> <span class="token 
keyword">u64</span><span class="token operator">></span><span class="token punctuation">,</span>  <span class="token comment">// table → rows modified</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">StatsMaintenanceManager</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">check_and_analyze</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">,</span> table<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token keyword">bool</span><span class="token punctuation">,</span> <span class="token class-name">AnalyzerError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> table_stats <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>catalog<span class="token punctuation">.</span><span class="token function">get_table_statistics</span><span class="token punctuation">(</span>table<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> modified_rows <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>modification_counts<span class="token punctuation">.</span><span class="token function">get</span><span class="token punctuation">(</span>table<span class="token punctuation">)</span><span class="token 
punctuation">.</span><span class="token function">copied</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">unwrap_or</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> needs_analyze <span class="token operator">=</span> <span class="token keyword">if</span> <span class="token keyword">let</span> <span class="token class-name">Some</span><span class="token punctuation">(</span>stats<span class="token punctuation">)</span> <span class="token operator">=</span> table_stats <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> threshold <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>policy<span class="token punctuation">.</span>auto_analyze_threshold                <span class="token operator">+</span> <span class="token punctuation">(</span>stats<span class="token punctuation">.</span>row_count <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>policy<span class="token punctuation">.</span>auto_analyze_scale_factor<span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">;</span>            modified_rows <span class="token operator">>=</span> threshold        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token boolean">true</span>  <span class="token comment">// No stats, need initial analyze</span>        <span class="token punctuation">&#125;</span><span 
class="token punctuation">;</span>        <span class="token keyword">if</span> needs_analyze <span class="token punctuation">&#123;</span>            <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">analyze_table</span><span class="token punctuation">(</span>table<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>            <span class="token keyword">self</span><span class="token punctuation">.</span>modification_counts<span class="token punctuation">.</span><span class="token function">insert</span><span class="token punctuation">(</span>table<span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token number">0</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token boolean">true</span><span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token boolean">false</span><span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">record_modification</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">,</span> table<span class="token punctuation">:</span> <span 
class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">,</span> rows_affected<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> count <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>modification_counts<span class="token punctuation">.</span><span class="token function">entry</span><span class="token punctuation">(</span>table<span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">or_insert</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token operator">*</span>count <span class="token operator">+=</span> rows_affected<span class="token punctuation">;</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">analyze_table</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> table<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">AnalyzerError</span><span class="token 
operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Run ANALYZE on the table</span>        <span class="token keyword">let</span> analyzer <span class="token operator">=</span> <span class="token class-name">StatisticsAnalyzer</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>catalog<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> stats <span class="token operator">=</span> analyzer<span class="token punctuation">.</span><span class="token function">analyze_table</span><span class="token punctuation">(</span>table<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>catalog<span class="token punctuation">.</span><span class="token function">store_statistics</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>stats<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="Incremental-Histogram-Updates">Incremental Histogram Updates</h3><p><strong>Problem:</strong> Full rebuild is expensive for large tables.</p><p><strong>Solution:</strong> Incremental updates for small 
changes.</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token keyword">impl</span> <span class="token class-name">EquiDepthHistogram</span> <span class="token punctuation">&#123;</span>    <span class="token comment">/// Update histogram with new values (for small changes)</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">update</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">,</span> new_values<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Value</span><span class="token punctuation">]</span><span class="token punctuation">,</span> deleted_values<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Value</span><span class="token punctuation">]</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Update total row count</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>total_rows <span class="token operator">+=</span> new_values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>total_rows <span class="token operator">-=</span> deleted_values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span 
class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">;</span>        <span class="token comment">// For each new value, find its bucket and increment count</span>        <span class="token keyword">for</span> value <span class="token keyword">in</span> new_values <span class="token punctuation">&#123;</span>            <span class="token keyword">for</span> bucket <span class="token keyword">in</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">.</span>buckets <span class="token punctuation">&#123;</span>                <span class="token keyword">if</span> value <span class="token operator">>=</span> <span class="token operator">&amp;</span>bucket<span class="token punctuation">.</span>lower_bound <span class="token operator">&amp;&amp;</span> value <span class="token operator">&lt;=</span> <span class="token operator">&amp;</span>bucket<span class="token punctuation">.</span>upper_bound <span class="token punctuation">&#123;</span>                    bucket<span class="token punctuation">.</span>row_count <span class="token operator">+=</span> <span class="token number">1</span><span class="token punctuation">;</span>                    bucket<span class="token punctuation">.</span>cumulative_count <span class="token operator">+=</span> <span class="token number">1</span><span class="token punctuation">;</span>                    <span class="token keyword">break</span><span class="token punctuation">;</span>                <span class="token punctuation">&#125;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>        <span class="token comment">// For deleted values, decrement counts</span>        <span class="token keyword">for</span> value <span class="token keyword">in</span> deleted_values <span 
class="token punctuation">&#123;</span>            <span class="token keyword">for</span> bucket <span class="token keyword">in</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">.</span>buckets <span class="token punctuation">&#123;</span>                <span class="token keyword">if</span> value <span class="token operator">>=</span> <span class="token operator">&amp;</span>bucket<span class="token punctuation">.</span>lower_bound <span class="token operator">&amp;&amp;</span> value <span class="token operator">&lt;=</span> <span class="token operator">&amp;</span>bucket<span class="token punctuation">.</span>upper_bound <span class="token punctuation">&#123;</span>                    bucket<span class="token punctuation">.</span>row_count <span class="token operator">=</span> bucket<span class="token punctuation">.</span>row_count<span class="token punctuation">.</span><span class="token function">saturating_sub</span><span class="token punctuation">(</span><span class="token number">1</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                    bucket<span class="token punctuation">.</span>cumulative_count <span class="token operator">=</span> bucket<span class="token punctuation">.</span>cumulative_count<span class="token punctuation">.</span><span class="token function">saturating_sub</span><span class="token punctuation">(</span><span class="token number">1</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                    <span class="token keyword">break</span><span class="token punctuation">;</span>                <span class="token punctuation">&#125;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>        <span class="token comment">// If too many changes, trigger full rebuild</span>        <span 
class="token keyword">let</span> total_changes <span class="token operator">=</span> new_values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">+</span> deleted_values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">if</span> total_changes <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">/</span> <span class="token keyword">self</span><span class="token punctuation">.</span>total_rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">></span> <span class="token number">0.1</span> <span class="token punctuation">&#123;</span>            <span class="token comment">// Mark for full rebuild</span>            <span class="token keyword">self</span><span class="token punctuation">.</span>needs_rebuild <span class="token operator">=</span> <span class="token boolean">true</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="7-Challenges-Building-in-Rust">7 Challenges Building in Rust</h2><h3 id="Challenge-1-Value-Comparison">Challenge 1: Value Comparison</h3><p><strong>Problem:</strong> SQL values can be different types (int, float, string, date).</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ Doesn't work - can't compare different types</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">Value</span> <span class="token 
punctuation">&#123;</span>    <span class="token class-name">Integer</span><span class="token punctuation">(</span><span class="token keyword">i64</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">Float</span><span class="token punctuation">(</span><span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">String</span><span class="token punctuation">(</span><span class="token class-name">String</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">Date</span><span class="token punctuation">(</span><span class="token namespace">chrono<span class="token punctuation">::</span></span><span class="token class-name">NaiveDate</span><span class="token punctuation">)</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">Ord</span> <span class="token keyword">for</span> <span class="token class-name">Value</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">cmp</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> other<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">Self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Ordering</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Can't compare Integer with String!</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Solution: Type-aware 
comparison</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ✅ Works - handle type mismatches</span>
<span class="token keyword">impl</span> <span class="token class-name">Value</span> <span class="token punctuation">&#123;</span>
    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">compare</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> other<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">Self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Ordering</span><span class="token punctuation">,</span> <span class="token class-name">ValueError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>
        <span class="token keyword">match</span> <span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">,</span> other<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>
            <span class="token punctuation">(</span><span class="token class-name">Value</span><span class="token punctuation">::</span><span class="token class-name">Integer</span><span class="token punctuation">(</span>a<span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Value</span><span class="token punctuation">::</span><span class="token class-name">Integer</span><span class="token punctuation">(</span>b<span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token class-name">Ok</span><span class="token punctuation">(</span>a<span class="token punctuation">.</span><span class="token function">cmp</span><span class="token punctuation">(</span>b<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
            <span class="token punctuation">(</span><span class="token class-name">Value</span><span class="token punctuation">::</span><span class="token class-name">Float</span><span class="token punctuation">(</span>a<span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Value</span><span class="token punctuation">::</span><span class="token class-name">Float</span><span class="token punctuation">(</span>b<span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token operator">=></span> a<span class="token punctuation">.</span><span class="token function">partial_cmp</span><span class="token punctuation">(</span>b<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">ok_or</span><span class="token punctuation">(</span><span class="token class-name">ValueError</span><span class="token punctuation">::</span><span class="token class-name">NaN</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
            <span class="token punctuation">(</span><span class="token class-name">Value</span><span class="token punctuation">::</span><span class="token class-name">String</span><span class="token punctuation">(</span>a<span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Value</span><span class="token punctuation">::</span><span class="token class-name">String</span><span class="token punctuation">(</span>b<span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token class-name">Ok</span><span class="token punctuation">(</span>a<span class="token punctuation">.</span><span class="token function">cmp</span><span class="token punctuation">(</span>b<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
            <span class="token punctuation">(</span><span class="token class-name">Value</span><span class="token punctuation">::</span><span class="token class-name">Integer</span><span class="token punctuation">(</span>a<span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Value</span><span class="token punctuation">::</span><span class="token class-name">Float</span><span class="token punctuation">(</span>b<span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">(</span><span class="token operator">*</span>a <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">partial_cmp</span><span class="token punctuation">(</span>b<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">ok_or</span><span class="token punctuation">(</span><span class="token class-name">ValueError</span><span class="token punctuation">::</span><span class="token class-name">NaN</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
            <span class="token punctuation">(</span><span class="token class-name">Value</span><span class="token punctuation">::</span><span class="token class-name">Float</span><span class="token punctuation">(</span>a<span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Value</span><span class="token punctuation">::</span><span class="token class-name">Integer</span><span class="token punctuation">(</span>b<span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token operator">=></span> a<span class="token punctuation">.</span><span class="token function">partial_cmp</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token punctuation">(</span><span class="token operator">*</span>b <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">ok_or</span><span class="token punctuation">(</span><span class="token class-name">ValueError</span><span class="token punctuation">::</span><span class="token class-name">NaN</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
            _ <span class="token operator">=></span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">ValueError</span><span class="token punctuation">::</span><span class="token class-name">TypeMismatch</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
        <span class="token punctuation">&#125;</span>
    <span class="token punctuation">&#125;</span>
<span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="Challenge-2-Memory-Usage">Challenge 2: Memory Usage</h3><p><strong>Problem:</strong> Histograms for many columns can use significant memory.</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ Memory explosion</span>
<span class="token comment">// 1000 tables × 20 columns × 100 buckets × 32 bytes = 64 MB</span></code></pre><p><strong>Solution: Compress bucket bounds</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ✅ Compressed representation</span>
<span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">CompressedBucket</span> <span class="token punctuation">&#123;</span>
    <span class="token keyword">pub</span> lower_bound_idx<span class="token punctuation">:</span> <span class="token keyword">u32</span><span class="token punctuation">,</span>  <span class="token comment">// Index into shared value pool</span>
    <span class="token keyword">pub</span> upper_bound_idx<span class="token punctuation">:</span> <span class="token keyword">u32</span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> row_count<span class="token punctuation">:</span> <span class="token keyword">u32</span><span class="token punctuation">,</span>
<span class="token punctuation">&#125;</span>

<span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">CompressedHistogram</span> <span class="token punctuation">&#123;</span>
    shared_values<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Value</span><span class="token operator">></span><span class="token punctuation">,</span>  <span class="token comment">// Deduplicated values</span>
    buckets<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">CompressedBucket</span><span class="token operator">></span><span class="token punctuation">,</span>
<span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="Challenge-3-Thread-Safe-Statistics">Challenge 3: Thread-Safe Statistics</h3><p><strong>Problem:</strong> Statistics need to be read during query planning while being updated by ANALYZE.</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ Race condition</span>
<span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">StatisticsCatalog</span> <span class="token punctuation">&#123;</span>
    tables<span class="token punctuation">:</span> <span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token punctuation">,</span> <span class="token class-name">TableStatistics</span><span class="token operator">></span><span class="token punctuation">,</span>
<span class="token punctuation">&#125;</span></code></pre><p><strong>Solution: Arc&lt;RwLock&lt;&gt;&gt; for concurrent access</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ✅ Thread-safe</span>
<span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">StatisticsCatalog</span> <span class="token punctuation">&#123;</span>
    tables<span class="token punctuation">:</span> <span class="token class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">RwLock</span><span class="token operator">&lt;</span><span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token punctuation">,</span> <span class="token class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">TableStatistics</span><span class="token operator">>></span><span class="token operator">>></span><span class="token punctuation">,</span>
<span class="token punctuation">&#125;</span>

<span class="token keyword">impl</span> <span class="token class-name">StatisticsCatalog</span> <span class="token punctuation">&#123;</span>
    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">get_table</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> name<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">TableStatistics</span><span class="token operator">>></span> <span class="token punctuation">&#123;</span>
        <span class="token keyword">let</span> tables <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>tables<span class="token punctuation">.</span><span class="token function">read</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">unwrap</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
        tables<span class="token punctuation">.</span><span class="token function">get</span><span class="token punctuation">(</span>name<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">cloned</span><span class="token punctuation">(</span><span class="token punctuation">)</span>
    <span class="token punctuation">&#125;</span>

    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">update_table</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> stats<span class="token punctuation">:</span> <span class="token class-name">TableStatistics</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>
        <span class="token keyword">let</span> <span class="token keyword">mut</span> tables <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>tables<span class="token punctuation">.</span><span class="token function">write</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">unwrap</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
        tables<span class="token punctuation">.</span><span class="token function">insert</span><span class="token punctuation">(</span>stats<span class="token punctuation">.</span>table_name<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Arc</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>stats<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token punctuation">&#125;</span>
<span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="8-How-AI-Accelerated-This">8 How AI Accelerated This</h2><h3 id="What-AI-Got-Right">What AI Got Right</h3><table><thead><tr><th>Task</th><th>AI Contribution</th></tr></thead><tbody><tr><td><strong>Histogram types</strong></td><td>Explained equi-width vs. 
equi-depth trade-offs</td></tr><tr><td><strong>MCV handling</strong></td><td>Suggested separate MCV list for common values</td></tr><tr><td><strong>Sampling</strong></td><td>Reservoir sampling algorithm for streaming</td></tr><tr><td><strong>Selectivity formulas</strong></td><td>Correct range estimation within buckets</td></tr></tbody></table><hr /><h3 id="What-AI-Got-Wrong">What AI Got Wrong</h3><table><thead><tr><th>Issue</th><th>What Happened</th></tr></thead><tbody><tr><td><strong>Bucket boundaries</strong></td><td>First draft had off-by-one errors in boundary calculation</td></tr><tr><td><strong>Multi-column correlation</strong></td><td>Initially assumed independence (wrong for correlated columns)</td></tr><tr><td><strong>NULL handling</strong></td><td>Forgot to track NULL fraction separately</td></tr><tr><td><strong>Incremental updates</strong></td><td>Suggested full rebuild for any change (too expensive)</td></tr></tbody></table><p><strong>Pattern:</strong> AI handles textbook cases well. Edge cases (boundaries, NULLs, incremental) need manual refinement.</p><hr /><h3 id="Example-Debugging-Selectivity">Example: Debugging Selectivity</h3><p><strong>My question to AI:</strong></p><blockquote><p>“Histogram shows 20% of rows have balance &gt; 100, but optimizer estimates 5%. 
Why?”</p></blockquote><p><strong>What I learned:</strong></p><ol><li>MCV list wasn’t being checked first</li><li>Non-MCV fraction adjustment was missing</li><li>Range estimation within bucket was inverted</li></ol><p><strong>Result:</strong> Fixed selectivity estimation:</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">estimate_selectivity</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> op<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">BinaryOperator</span><span class="token punctuation">,</span> value<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Value</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">f64</span> <span class="token punctuation">&#123;</span>
    <span class="token comment">// Check MCV first</span>
    <span class="token keyword">for</span> <span class="token punctuation">(</span>mcv_val<span class="token punctuation">,</span> frequency<span class="token punctuation">)</span> <span class="token keyword">in</span> <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">.</span>mcv_list<span class="token punctuation">.</span>values <span class="token punctuation">&#123;</span>
        <span class="token keyword">if</span> mcv_val <span class="token operator">==</span> value <span class="token punctuation">&#123;</span>
            <span class="token keyword">return</span> <span class="token keyword">match</span> op <span class="token punctuation">&#123;</span>
                <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Eq</span> <span class="token operator">=></span> <span class="token operator">*</span>frequency<span class="token punctuation">,</span>
                <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Neq</span> <span class="token operator">=></span> <span class="token number">1.0</span> <span class="token operator">-</span> frequency<span class="token punctuation">,</span>
                _ <span class="token operator">=></span> <span class="token keyword">break</span><span class="token punctuation">,</span>  <span class="token comment">// Range ops: leave the loop and fall through to the histogram</span>
            <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>
        <span class="token punctuation">&#125;</span>
    <span class="token punctuation">&#125;</span>

    <span class="token comment">// Use histogram for non-MCV values</span>
    <span class="token keyword">let</span> histogram_sel <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>histogram<span class="token punctuation">.</span><span class="token function">estimate_gt</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token keyword">let</span> non_mcv_fraction <span class="token operator">=</span> <span class="token number">1.0</span> <span class="token operator">-</span> <span class="token keyword">self</span><span class="token punctuation">.</span>mcv_list<span class="token punctuation">.</span>total_mcv_frequency<span class="token punctuation">;</span>
    histogram_sel <span class="token operator">*</span> non_mcv_fraction  <span class="token comment">// ← Was missing this!</span>
<span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="Summary-Histogram-Statistics-in-One-Diagram">Summary: Histogram Statistics in One 
Diagram</h2><pre class="language-MERMAID_BASE64_596" data-language="MERMAID_BASE64_596"><code class="language-MERMAID_BASE64_596">Zmxvd2NoYXJ0IFRECiAgICBzdWJncmFwaCAiRGF0YSBDb2xsZWN0aW9uIgogICAgICAgIEFbVGFibGUgRGF0YV0gLS0+IEJ7U2FtcGxlIG9yIEZ1bGw&#x2F;fQogICAgICAgIEIgLS0+fFNtYWxsIHRhYmxlfCBDW0Z1bGwgU2Nhbl0KICAgICAgICBCIC0tPnxMYXJnZSB0YWJsZXwgRFtSZXNlcnZvaXIgU2FtcGxpbmddCiAgICBlbmQKICAgIAogICAgc3ViZ3JhcGggIkhpc3RvZ3JhbSBCdWlsZGluZyIKICAgICAgICBDIC0tPiBFW1NvcnQgVmFsdWVzXQogICAgICAgIEQgLS0+IEUKICAgICAgICBFIC0tPiBGe0J1aWxkIFR5cGU&#x2F;fQogICAgICAgIEYgLS0+fFVuaWZvcm0gZGF0YXwgR1tFcXVpLVdpZHRoXQogICAgICAgIEYgLS0+fFNrZXdlZCBkYXRhfCBIW0VxdWktRGVwdGhdCiAgICAgICAgRiAtLT58V2l0aCBkdXBsaWNhdGVzfCBJW0NvbXByZXNzZWQgKyBNQ1ZdCiAgICBlbmQKICAgIAogICAgc3ViZ3JhcGggIk11bHRpLUNvbHVtbiIKICAgICAgICBKW0NvcnJlbGF0ZWQgQ29sdW1uc10gLS0+IEtbTXVsdGktRGltZW5zaW9uYWwgQnVja2V0c10KICAgIGVuZAogICAgCiAgICBzdWJncmFwaCAiVXNhZ2UgaW4gT3B0aW1pemVyIgogICAgICAgIEggLS0+IExbU2VsZWN0aXZpdHkgRXN0aW1hdGlvbl0KICAgICAgICBJIC0tPiBMCiAgICAgICAgSyAtLT4gTAogICAgICAgIEwgLS0+IE1bQ29zdCBDYWxjdWxhdGlvbl0KICAgICAgICBNIC0tPiBOW1BsYW4gU2VsZWN0aW9uXQogICAgZW5kCiAgICAKICAgIHN1YmdyYXBoICJNYWludGVuYW5jZSIKICAgICAgICBPW0lOU0VSVC9VUERBVEUvREVMRVRFXSAtLT4gUFtUcmFjayBNb2RpZmljYXRpb25zXQogICAgICAgIFAgLS0+IFF7VGhyZXNob2xkIFJlYWNoZWQ&#x2F;fQogICAgICAgIFEgLS0+fFllc3wgUltBdXRvLUFOQUxZWkVdCiAgICAgICAgUSAtLT58Tm98IFNbSW5jcmVtZW50YWwgVXBkYXRlXQogICAgICAgIFIgLS0+IEUKICAgIGVuZAogICAgCiAgICBzdHlsZSBBIGZpbGw6I2UzZjJmZCxzdHJva2U6IzE5NzZkMgogICAgc3R5bGUgTiBmaWxsOiNlOGY1ZTksc3Ryb2tlOiMzODhlM2MKICAgIHN0eWxlIEwgZmlsbDojZmZmM2UwLHN0cm9rZTojZjU3YzAw</code></pre><p><strong>Key Takeaways:</strong></p><table><thead><tr><th>Concept</th><th>Why It Matters</th></tr></thead><tbody><tr><td><strong>Equi-depth histograms</strong></td><td>Better for skewed data than equi-width</td></tr><tr><td><strong>MCV lists</strong></td><td>Accurate for common values</td></tr><tr><td><strong>Sampling</strong></td><td>Makes ANALYZE practical for large 
tables</td></tr><tr><td><strong>Multi-column stats</strong></td><td>Captures column correlation</td></tr><tr><td><strong>Incremental updates</strong></td><td>Avoids full rebuild for small changes</td></tr><tr><td><strong>Thread-safe access</strong></td><td>Allows concurrent reads during updates</td></tr></tbody></table><hr /><p><strong>Further Reading:</strong></p><ul><li>PostgreSQL Source: <a href="https://github.com/postgres/postgres/blob/master/src/backend/commands/analyze.c"><code>src/backend/commands/analyze.c</code></a></li><li>“Random Sampling for Histogram Construction” by Vitter et al.</li><li>“Selectivity Estimation for Range Queries” by Ioannidis et al.</li><li>PostgreSQL Documentation: “Statistics Used by the Planner”</li><li>Vaultgres Repository: <a href="https://github.com/neoalienson/Vaultgres">github.com/neoalienson/Vaultgres</a></li></ul>]]></content>
    
    
    <summary type="html">Part 8 of the Vaultgres journey: deep dive into histogram statistics. Building equi-depth histograms, handling skewed data, multi-column statistics, and using histograms for accurate selectivity estimation in cost-based optimization.</summary>
    
    
    
    <category term="Development" scheme="https://neo01.com/categories/Development/"/>
    
    
    <category term="PostgreSQL" scheme="https://neo01.com/tags/PostgreSQL/"/>
    
    <category term="Database Internals" scheme="https://neo01.com/tags/Database-Internals/"/>
    
    <category term="Rust" scheme="https://neo01.com/tags/Rust/"/>
    
  </entry>
  
  <entry>
    <title>Building a PostgreSQL-Compatible Database in Rust: Histogram Statistics for Accurate Selectivity Estimation</title>
    <link href="https://neo01.com/zh-TW/2026/03/Database-Rust-Histogram-Statistics/"/>
    <id>https://neo01.com/zh-TW/2026/03/Database-Rust-Histogram-Statistics/</id>
    <published>2026-03-07T16:00:00.000Z</published>
    <updated>2026-03-14T03:49:31.707Z</updated>
    
    <content type="html"><![CDATA[<p>In <a href="/zh-TW/2026/03/Database-Rust-Query-Optimizer/">Part 7</a>, we built a cost-based optimizer. But there is a problem.</p><p><strong>Our selectivity estimates are guesses:</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// From Part 7 - simplified (wrong!) estimates</span>
<span class="token keyword">fn</span> <span class="token function-definition function">estimate_selectivity</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> expr<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Expression</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">f64</span> <span class="token punctuation">&#123;</span>
    <span class="token keyword">match</span> expr <span class="token punctuation">&#123;</span>
        <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Eq</span> <span class="token operator">=></span> <span class="token number">0.01</span><span class="token punctuation">,</span>   <span class="token comment">// Always 1%? Wrong!</span>
        <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Gt</span> <span class="token operator">=></span> <span class="token number">0.33</span><span class="token punctuation">,</span>   <span class="token comment">// Always 33%? Wrong!</span>
        _ <span class="token operator">=></span> <span class="token number">0.5</span><span class="token punctuation">,</span>                      <span class="token comment">// Always 50%? Very wrong!</span>
    <span class="token punctuation">&#125;</span>
<span class="token punctuation">&#125;</span></code></pre><p><strong>Real data is not uniform:</strong></p><pre class="language-none"><code class="language-none">User balances in our database:

Balance Distribution:
$0-100:     ████████████████████████████████████████  80% of users
$100-1K:    ████████                                   15% of users
$1K-10K:    ██                                          4% of users
$10K-100K:  █                                           1% of users

Query: SELECT * FROM users WHERE balance &gt; 100
Our estimate: 33% of rows (using fixed 0.33)
Actual: 20% of rows
→ Wrong plan chosen! Index scan would be better than seq scan.</code></pre><p><strong>Solution:</strong> Histograms that capture the actual data distribution.</p><p>Today: implementing histogram statistics in Rust for accurate selectivity estimation.</p><hr /><h2 id="1-為什麼直方圖很重要">1 Why Histograms Matter</h2><h3 id="均勻分佈謬誤">The Uniform Distribution Fallacy</h3><p><strong>Without histograms, the optimizer assumes a uniform distribution:</strong></p><pre class="language-sql" data-language="sql"><code class="language-sql"><span class="token comment">-- Table: users (1,000,000 rows)</span>
<span class="token comment">-- Column: account_type ('free', 'premium', 'enterprise')</span>

<span class="token comment">-- Reality:</span>
<span class="token string">'free'</span>:       <span class="token number">950</span><span class="token punctuation">,</span><span class="token number">000</span> <span class="token keyword">rows</span> <span class="token punctuation">(</span><span class="token number">95</span><span class="token operator">%</span><span class="token punctuation">)</span>
<span class="token string">'premium'</span>:     <span class="token number">45</span><span class="token punctuation">,</span><span class="token number">000</span> <span class="token keyword">rows</span> <span class="token punctuation">(</span><span class="token number">4.5</span><span class="token operator">%</span><span class="token punctuation">)</span>
<span class="token string">'enterprise'</span>:   <span class="token number">5</span><span class="token punctuation">,</span><span class="token number">000</span> <span class="token keyword">rows</span> <span class="token punctuation">(</span><span class="token number">0.5</span><span class="token operator">%</span><span class="token punctuation">)</span>

<span class="token comment">-- Query 1:</span>
<span class="token keyword">SELECT</span> <span class="token operator">*</span> <span class="token keyword">FROM</span> users <span class="token keyword">WHERE</span> account_type <span class="token operator">=</span> <span class="token string">'enterprise'</span>
<span class="token comment">-- Optimizer estimate (uniform): 1,000,000 / 3 = 333,333 rows</span>
<span class="token comment">-- Actual: 5,000 rows</span>
<span class="token comment">-- Error: 67x overestimate!</span>

<span class="token comment">-- Query 2:</span>
<span class="token keyword">SELECT</span> <span class="token operator">*</span> <span class="token keyword">FROM</span> users <span class="token keyword">WHERE</span> account_type <span class="token operator">=</span> <span class="token string">'free'</span>
<span class="token comment">-- Optimizer estimate (uniform): 333,333 rows</span>
<span class="token comment">-- Actual: 950,000 rows</span>
<span class="token comment">-- Error: 3x underestimate!</span></code></pre><p><strong>With histograms:</strong> we know the actual distribution.</p><hr /><h3 id="對連線順序的影響">Impact on Join Order</h3><pre class="language-sql" data-language="sql"><code class="language-sql"><span class="token keyword">SELECT</span> u<span class="token punctuation">.</span>name<span class="token punctuation">,</span> o<span class="token punctuation">.</span>total
<span class="token keyword">FROM</span> users u
<span class="token keyword">JOIN</span> orders o <span class="token keyword">ON</span> u<span class="token punctuation">.</span>id <span class="token operator">=</span> o<span class="token punctuation">.</span>user_id
<span class="token keyword">WHERE</span> u<span class="token punctuation">.</span>account_type <span class="token operator">=</span> <span class="token string">'enterprise'</span></code></pre><p><strong>Without histograms:</strong></p><pre class="language-none"><code class="language-none">Optimizer thinks &#39;enterprise&#39; &#x3D; 333,333 rows
Plan: SeqScan(users) → Hash Join → SeqScan(orders)
Cost: 5000 (wrong!)</code></pre><p><strong>With histograms:</strong></p><pre class="language-none"><code class="language-none">Histogram shows &#39;enterprise&#39; &#x3D; 5,000 rows
Plan: IndexScan(users) → Nested Loop → IndexScan(orders)
Cost: 100 (correct!)</code></pre><p><strong>Result:</strong> The query is 50x faster.</p><hr /><h2 id="2-直方圖類型">2 Histogram Types</h2><h3 id="等寬直方圖">Equi-Width Histograms</h3><p><strong>Equal bucket ranges, varying row counts:</strong></p><pre class="language-none"><code class="language-none">Data: [1, 2, 3, 4, 5, 10, 11, 12, 13, 14]

Equi-Width (3 buckets, range 1-14):
┌─────────────────────────────────────────────────────────────┐
│ Bucket 1: [1-5]     → 5 rows  │████████████████████████████│
│ Bucket 2: [6-10]    → 1 row   │██████                      │
│ Bucket 3: [11-14]   → 4 rows  │██████████████████████      │
└─────────────────────────────────────────────────────────────┘</code></pre><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token attribute attr-name">#[derive(Debug, Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">EquiWidthHistogram</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> min_value<span class="token punctuation">:</span> <span class="token class-name">Value</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> max_value<span class="token punctuation">:</span> <span class="token class-name">Value</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> num_buckets<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token 
punctuation">,</span>    <span class="token keyword">pub</span> buckets<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">WidthBucket</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">WidthBucket</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> lower_bound<span class="token punctuation">:</span> <span class="token class-name">Value</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> upper_bound<span class="token punctuation">:</span> <span class="token class-name">Value</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> row_count<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> distinct_count<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> null_count<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">EquiWidthHistogram</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">build</span><span class="token punctuation">(</span>values<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token 
punctuation">[</span><span class="token class-name">Value</span><span class="token punctuation">]</span><span class="token punctuation">,</span> num_buckets<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token keyword">Self</span><span class="token punctuation">,</span> <span class="token class-name">HistogramError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">if</span> values<span class="token punctuation">.</span><span class="token function">is_empty</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">return</span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">HistogramError</span><span class="token punctuation">::</span><span class="token class-name">EmptyData</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token keyword">let</span> <span class="token punctuation">(</span>min<span class="token punctuation">,</span> max<span class="token punctuation">)</span> <span class="token operator">=</span> <span class="token keyword">Self</span><span class="token punctuation">::</span><span class="token function">find_min_max</span><span class="token punctuation">(</span>values<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> bucket_width <span class="token operator">=</span> max<span class="token punctuation">.</span><span class="token function">subtract</span><span class="token 
punctuation">(</span><span class="token operator">&amp;</span>min<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">divide</span><span class="token punctuation">(</span>num_buckets <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> buckets <span class="token operator">=</span> <span class="token macro property">vec!</span><span class="token punctuation">[</span><span class="token class-name">WidthBucket</span> <span class="token punctuation">&#123;</span>            lower_bound<span class="token punctuation">:</span> min<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            upper_bound<span class="token punctuation">:</span> min<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            row_count<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>            distinct_count<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>            null_count<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span> num_buckets<span class="token punctuation">]</span><span class="token punctuation">;</span>        <span class="token comment">// Initialize bucket bounds</span>        <span class="token keyword">for</span> i <span class="token 
keyword">in</span> <span class="token number">0</span><span class="token punctuation">..</span>num_buckets <span class="token punctuation">&#123;</span>            buckets<span class="token punctuation">[</span>i<span class="token punctuation">]</span><span class="token punctuation">.</span>lower_bound <span class="token operator">=</span> min<span class="token punctuation">.</span><span class="token function">add</span><span class="token punctuation">(</span>bucket_width<span class="token punctuation">.</span><span class="token function">multiply</span><span class="token punctuation">(</span>i <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            buckets<span class="token punctuation">[</span>i<span class="token punctuation">]</span><span class="token punctuation">.</span>upper_bound <span class="token operator">=</span> min<span class="token punctuation">.</span><span class="token function">add</span><span class="token punctuation">(</span>bucket_width<span class="token punctuation">.</span><span class="token function">multiply</span><span class="token punctuation">(</span><span class="token punctuation">(</span>i <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token comment">// Count rows in each bucket</span>        <span class="token keyword">for</span> value <span class="token keyword">in</span> values <span class="token punctuation">&#123;</span>            <span class="token keyword">if</span> value<span class="token punctuation">.</span><span class="token 
function">is_null</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                buckets<span class="token punctuation">[</span><span class="token number">0</span><span class="token punctuation">]</span><span class="token punctuation">.</span>null_count <span class="token operator">+=</span> <span class="token number">1</span><span class="token punctuation">;</span>                <span class="token keyword">continue</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>            <span class="token keyword">let</span> bucket_idx <span class="token operator">=</span> <span class="token keyword">Self</span><span class="token punctuation">::</span><span class="token function">find_bucket_index</span><span class="token punctuation">(</span>value<span class="token punctuation">,</span> <span class="token operator">&amp;</span>min<span class="token punctuation">,</span> <span class="token operator">&amp;</span>bucket_width<span class="token punctuation">,</span> num_buckets<span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token keyword">if</span> bucket_idx <span class="token operator">&lt;</span> num_buckets <span class="token punctuation">&#123;</span>                buckets<span class="token punctuation">[</span>bucket_idx<span class="token punctuation">]</span><span class="token punctuation">.</span>row_count <span class="token operator">+=</span> <span class="token number">1</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>            min_value<span class="token punctuation">:</span> min<span 
class="token punctuation">,</span>            max_value<span class="token punctuation">:</span> max<span class="token punctuation">,</span>            num_buckets<span class="token punctuation">,</span>            buckets<span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">find_bucket_index</span><span class="token punctuation">(</span>value<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Value</span><span class="token punctuation">,</span> min<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Value</span><span class="token punctuation">,</span> width<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Value</span><span class="token punctuation">,</span> num_buckets<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">usize</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> offset <span class="token operator">=</span> value<span class="token punctuation">.</span><span class="token function">subtract</span><span class="token punctuation">(</span>min<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> bucket <span class="token operator">=</span> offset<span class="token punctuation">.</span><span class="token function">divide</span><span class="token punctuation">(</span>width<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_f64</span><span class="token 
punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">floor</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">usize</span><span class="token punctuation">;</span>        bucket<span class="token punctuation">.</span><span class="token function">min</span><span class="token punctuation">(</span>num_buckets <span class="token operator">-</span> <span class="token number">1</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Pros:</strong> Simple; works well for uniformly distributed data.</p><p><strong>Cons:</strong> Performs poorly on skewed data (most buckets end up empty while one bucket is huge).</p><hr /><h3 id="等深直方圖（PostgreSQL-風格）">Equi-Depth Histogram (PostgreSQL Style)</h3><p><strong>Roughly equal row counts per bucket, varying ranges:</strong></p><pre class="language-none"><code class="language-none">Data: [1, 2, 3, 4, 5, 10, 11, 12, 13, 14]
Equi-Depth (3 buckets, ~3-4 rows each):
┌─────────────────────────────────────────────────────────────┐
│ Bucket 1: [1-4]     → 4 rows  │████████████████████████████│
│ Bucket 2: [5-11]    → 3 rows  │████████████████████████████│
│ Bucket 3: [12-14]   → 3 rows  │████████████████████████████│
└─────────────────────────────────────────────────────────────┘</code></pre><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token attribute attr-name">#[derive(Debug, Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">EquiDepthHistogram</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> num_buckets<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> total_rows<span class="token punctuation">:</span> <span class="token
keyword">u64</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> buckets<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">DepthBucket</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">DepthBucket</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> lower_bound<span class="token punctuation">:</span> <span class="token class-name">Value</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> upper_bound<span class="token punctuation">:</span> <span class="token class-name">Value</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> row_count<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> distinct_count<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> cumulative_count<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>  <span class="token comment">// Rows &lt;= upper_bound</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">EquiDepthHistogram</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">build</span><span class="token punctuation">(</span>values<span 
class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Value</span><span class="token punctuation">]</span><span class="token punctuation">,</span> num_buckets<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token keyword">Self</span><span class="token punctuation">,</span> <span class="token class-name">HistogramError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">if</span> values<span class="token punctuation">.</span><span class="token function">is_empty</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">return</span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">HistogramError</span><span class="token punctuation">::</span><span class="token class-name">EmptyData</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token comment">// Sort values</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> sorted_values <span class="token operator">=</span> values<span class="token punctuation">.</span><span class="token function">to_vec</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        sorted_values<span class="token punctuation">.</span><span class="token function">sort</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token 
punctuation">;</span>        <span class="token keyword">let</span> total_rows <span class="token operator">=</span> sorted_values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> target_bucket_size <span class="token operator">=</span> <span class="token punctuation">(</span>total_rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">/</span> num_buckets <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">ceil</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">usize</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> buckets <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> cumulative <span class="token operator">=</span> <span class="token number">0u64</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> i <span class="token keyword">in</span> <span class="token number">0</span><span class="token punctuation">..</span>num_buckets <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> start <span class="token operator">=</span> i <span 
class="token operator">*</span> target_bucket_size<span class="token punctuation">;</span>            <span class="token keyword">let</span> end <span class="token operator">=</span> <span class="token punctuation">(</span><span class="token punctuation">(</span>i <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">)</span> <span class="token operator">*</span> target_bucket_size<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">min</span><span class="token punctuation">(</span>sorted_values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token keyword">if</span> start <span class="token operator">>=</span> sorted_values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">break</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>            <span class="token keyword">let</span> bucket_values <span class="token operator">=</span> <span class="token operator">&amp;</span>sorted_values<span class="token punctuation">[</span>start<span class="token punctuation">..</span>end<span class="token punctuation">]</span><span class="token punctuation">;</span>            <span class="token keyword">let</span> row_count <span class="token operator">=</span> bucket_values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span 
class="token punctuation">;</span>            <span class="token keyword">let</span> distinct_count <span class="token operator">=</span> bucket_values<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">::</span><span class="token operator">&lt;</span><span class="token namespace">std<span class="token punctuation">::</span>collections<span class="token punctuation">::</span></span><span class="token class-name">HashSet</span><span class="token operator">&lt;</span>_<span class="token operator">>></span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">;</span>            cumulative <span class="token operator">+=</span> row_count<span class="token punctuation">;</span>            buckets<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token class-name">DepthBucket</span> <span class="token punctuation">&#123;</span>                lower_bound<span class="token punctuation">:</span> bucket_values<span class="token punctuation">.</span><span class="token function">first</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">unwrap</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token 
punctuation">)</span><span class="token punctuation">,</span>                upper_bound<span class="token punctuation">:</span> bucket_values<span class="token punctuation">.</span><span class="token function">last</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">unwrap</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                row_count<span class="token punctuation">,</span>                distinct_count<span class="token punctuation">,</span>                cumulative_count<span class="token punctuation">:</span> cumulative<span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>            num_buckets<span class="token punctuation">,</span>            total_rows<span class="token punctuation">,</span>            buckets<span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">/// Estimate selectivity for equality predicate</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">estimate_eq</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token 
punctuation">,</span> value<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Value</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">f64</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">for</span> bucket <span class="token keyword">in</span> <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">.</span>buckets <span class="token punctuation">&#123;</span>            <span class="token keyword">if</span> value <span class="token operator">>=</span> <span class="token operator">&amp;</span>bucket<span class="token punctuation">.</span>lower_bound <span class="token operator">&amp;&amp;</span> value <span class="token operator">&lt;=</span> <span class="token operator">&amp;</span>bucket<span class="token punctuation">.</span>upper_bound <span class="token punctuation">&#123;</span>                <span class="token comment">// Assume uniform distribution within bucket</span>                <span class="token keyword">let</span> bucket_selectivity <span class="token operator">=</span> <span class="token number">1.0</span> <span class="token operator">/</span> bucket<span class="token punctuation">.</span>distinct_count<span class="token punctuation">.</span><span class="token function">max</span><span class="token punctuation">(</span><span class="token number">1</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">;</span>                <span class="token keyword">return</span> bucket_selectivity <span class="token operator">*</span> <span class="token punctuation">(</span>bucket<span class="token punctuation">.</span>row_count <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">/</span> 
<span class="token keyword">self</span><span class="token punctuation">.</span>total_rows <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>        <span class="token number">0.0</span>  <span class="token comment">// Value outside histogram range</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">/// Estimate selectivity for range predicate (value > X)</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">estimate_gt</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> value<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Value</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">f64</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> selectivity <span class="token operator">=</span> <span class="token number">0.0</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> bucket <span class="token keyword">in</span> <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">.</span>buckets <span class="token punctuation">&#123;</span>            <span class="token keyword">if</span> value <span class="token operator">&lt;</span> <span class="token operator">&amp;</span>bucket<span class="token punctuation">.</span>lower_bound <span class="token punctuation">&#123;</span>                <span class="token comment">// Entire bucket 
matches</span>                selectivity <span class="token operator">+=</span> bucket<span class="token punctuation">.</span>row_count <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">/</span> <span class="token keyword">self</span><span class="token punctuation">.</span>total_rows <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token keyword">if</span> value <span class="token operator">>=</span> <span class="token operator">&amp;</span>bucket<span class="token punctuation">.</span>lower_bound <span class="token operator">&amp;&amp;</span> value <span class="token operator">&lt;</span> <span class="token operator">&amp;</span>bucket<span class="token punctuation">.</span>upper_bound <span class="token punctuation">&#123;</span>                <span class="token comment">// Partial bucket - assume uniform within bucket</span>                <span class="token keyword">let</span> bucket_range <span class="token operator">=</span> bucket<span class="token punctuation">.</span>upper_bound<span class="token punctuation">.</span><span class="token function">subtract</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>bucket<span class="token punctuation">.</span>lower_bound<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_f64</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">let</span> matching_range <span class="token operator">=</span> bucket<span class="token punctuation">.</span>upper_bound<span class="token punctuation">.</span><span class="token function">subtract</span><span class="token 
punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_f64</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">let</span> fraction <span class="token operator">=</span> <span class="token punctuation">(</span>matching_range <span class="token operator">/</span> bucket_range<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">clamp</span><span class="token punctuation">(</span><span class="token number">0.0</span><span class="token punctuation">,</span> <span class="token number">1.0</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                selectivity <span class="token operator">+=</span> fraction <span class="token operator">*</span> <span class="token punctuation">(</span>bucket<span class="token punctuation">.</span>row_count <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">/</span> <span class="token keyword">self</span><span class="token punctuation">.</span>total_rows <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>            <span class="token comment">// else: value >= bucket.upper_bound, no match</span>        <span class="token punctuation">&#125;</span>        selectivity    <span class="token punctuation">&#125;</span>    <span class="token comment">/// Estimate selectivity for range predicate (value &lt; X)</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">estimate_lt</span><span class="token punctuation">(</span><span class="token 
operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> value<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Value</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">f64</span> <span class="token punctuation">&#123;</span>        <span class="token number">1.0</span> <span class="token operator">-</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">estimate_gte</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">/// Estimate selectivity for range predicate (value >= X)</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">estimate_gte</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> value<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Value</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">f64</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> selectivity <span class="token operator">=</span> <span class="token number">0.0</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> bucket <span class="token keyword">in</span> <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">.</span>buckets <span class="token punctuation">&#123;</span>            <span class="token keyword">if</span> value <span class="token 
operator">&lt;=</span> <span class="token operator">&amp;</span>bucket<span class="token punctuation">.</span>lower_bound <span class="token punctuation">&#123;</span>                <span class="token comment">// Entire bucket matches</span>                selectivity <span class="token operator">+=</span> bucket<span class="token punctuation">.</span>row_count <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">/</span> <span class="token keyword">self</span><span class="token punctuation">.</span>total_rows <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token keyword">if</span> value <span class="token operator">></span> <span class="token operator">&amp;</span>bucket<span class="token punctuation">.</span>lower_bound <span class="token operator">&amp;&amp;</span> value <span class="token operator">&lt;=</span> <span class="token operator">&amp;</span>bucket<span class="token punctuation">.</span>upper_bound <span class="token punctuation">&#123;</span>                <span class="token comment">// Partial bucket</span>                <span class="token keyword">let</span> bucket_range <span class="token operator">=</span> bucket<span class="token punctuation">.</span>upper_bound<span class="token punctuation">.</span><span class="token function">subtract</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>bucket<span class="token punctuation">.</span>lower_bound<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_f64</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">let</span> matching_range <span class="token 
operator">=</span> bucket<span class="token punctuation">.</span>upper_bound<span class="token punctuation">.</span><span class="token function">subtract</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_f64</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">let</span> fraction <span class="token operator">=</span> <span class="token punctuation">(</span>matching_range <span class="token operator">/</span> bucket_range<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">clamp</span><span class="token punctuation">(</span><span class="token number">0.0</span><span class="token punctuation">,</span> <span class="token number">1.0</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                selectivity <span class="token operator">+=</span> fraction <span class="token operator">*</span> <span class="token punctuation">(</span>bucket<span class="token punctuation">.</span>row_count <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">/</span> <span class="token keyword">self</span><span class="token punctuation">.</span>total_rows <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>        selectivity    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Pros:</strong> Handles skewed data better, because every bucket carries equal weight.</p><p><strong>Cons:</strong> When the data contains many duplicates, bucket boundaries can end up arbitrary.</p><hr /><h3 id="壓縮直方圖（處理重複）">Compressed Histogram (Handling Duplicates)</h3><p><strong>Handle the most common values (MCVs) separately:</strong></p><pre 
class="language-none"><code class="language-none">Data: [1, 1, 1, 1, 1, 2, 3, 4, 5, 10, 11, 12, 13, 14]Compressed Histogram:┌─────────────────────────────────────────────────────────────┐│ Most Common Values (MCV):                                   ││   Value 1: frequency &#x3D; 5&#x2F;14 &#x3D; 35.7%                        │├─────────────────────────────────────────────────────────────┤│ Histogram (for remaining values):                           ││   Bucket 1: [2-5]   → 4 rows                                ││   Bucket 2: [6-10]  → 1 row                                 ││   Bucket 3: [11-14] → 4 rows                                │└─────────────────────────────────────────────────────────────┘</code></pre><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token attribute attr-name">#[derive(Debug, Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">CompressedHistogram</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> mcv_list<span class="token punctuation">:</span> <span class="token class-name">MCVList</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> histogram<span class="token punctuation">:</span> <span class="token class-name">EquiDepthHistogram</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> null_fraction<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">MCVList</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> values<span class="token punctuation">:</span> 
<span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token class-name">Value</span><span class="token punctuation">,</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token operator">></span><span class="token punctuation">,</span>  <span class="token comment">// (value, frequency)</span>    <span class="token keyword">pub</span> total_mcv_frequency<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">CompressedHistogram</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">build</span><span class="token punctuation">(</span>values<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Value</span><span class="token punctuation">]</span><span class="token punctuation">,</span> num_buckets<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span> num_mcv<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token keyword">Self</span><span class="token punctuation">,</span> <span class="token class-name">HistogramError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Handle NULLs</span>        <span class="token keyword">let</span> null_count <span class="token operator">=</span> values<span 
class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">filter</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>v<span class="token closure-punctuation punctuation">|</span></span> v<span class="token punctuation">.</span><span class="token function">is_null</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">count</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> null_fraction <span class="token operator">=</span> null_count <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">/</span> values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> non_null_values<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span>_<span class="token operator">></span> <span class="token operator">=</span> values<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">filter</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation 
punctuation">|</span>v<span class="token closure-punctuation punctuation">|</span></span> <span class="token operator">!</span>v<span class="token punctuation">.</span><span class="token function">is_null</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">cloned</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>  <span class="token comment">// Cloned so the Vec holds owned Values and borrows as &amp;[Value]</span>        <span class="token comment">// Build MCV list</span>        <span class="token keyword">let</span> mcv_list <span class="token operator">=</span> <span class="token keyword">Self</span><span class="token punctuation">::</span><span class="token function">build_mcv_list</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>non_null_values<span class="token punctuation">,</span> num_mcv<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token comment">// Build histogram for non-MCV values</span>        <span class="token keyword">let</span> mcv_values<span class="token punctuation">:</span> <span class="token namespace">std<span class="token punctuation">::</span>collections<span class="token punctuation">::</span></span><span class="token class-name">HashSet</span><span class="token operator">&lt;</span>_<span class="token operator">></span> <span class="token operator">=</span>             mcv_list<span class="token punctuation">.</span>values<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">map</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span><span class="token 
punctuation">(</span>v<span class="token punctuation">,</span> _<span class="token punctuation">)</span><span class="token closure-punctuation punctuation">|</span></span> v<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> non_mcv_values<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span>_<span class="token operator">></span> <span class="token operator">=</span> non_null_values            <span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span>            <span class="token punctuation">.</span><span class="token function">filter</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>v<span class="token closure-punctuation punctuation">|</span></span> <span class="token operator">!</span>mcv_values<span class="token punctuation">.</span><span class="token function">contains</span><span class="token punctuation">(</span>v<span class="token punctuation">)</span><span class="token punctuation">)</span>            <span class="token punctuation">.</span><span class="token function">cloned</span><span class="token punctuation">(</span><span class="token punctuation">)</span>            <span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> histogram <span class="token operator">=</span> <span class="token keyword">if</span> non_mcv_values<span class="token punctuation">.</span><span 
class="token function">is_empty</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token comment">// All values are in MCV, create empty histogram</span>            <span class="token class-name">EquiDepthHistogram</span> <span class="token punctuation">&#123;</span>                num_buckets<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>                total_rows<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>                buckets<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">EquiDepthHistogram</span><span class="token punctuation">::</span><span class="token function">build</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>non_mcv_values<span class="token punctuation">,</span> num_buckets<span class="token punctuation">)</span><span class="token operator">?</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>            mcv_list<span class="token punctuation">,</span>            histogram<span class="token punctuation">,</span>            null_fraction<span class="token punctuation">,</span>        <span 
class="token punctuation">&#125;</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">build_mcv_list</span><span class="token punctuation">(</span>values<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Value</span><span class="token punctuation">]</span><span class="token punctuation">,</span> num_mcv<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">MCVList</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> value_counts<span class="token punctuation">:</span> <span class="token namespace">std<span class="token punctuation">::</span>collections<span class="token punctuation">::</span></span><span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token operator">&amp;</span><span class="token class-name">Value</span><span class="token punctuation">,</span> <span class="token keyword">usize</span><span class="token operator">></span> <span class="token operator">=</span>             <span class="token namespace">std<span class="token punctuation">::</span>collections<span class="token punctuation">::</span></span><span class="token class-name">HashMap</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> value <span class="token keyword">in</span> values <span class="token punctuation">&#123;</span>            <span class="token operator">*</span>value_counts<span 
class="token punctuation">.</span><span class="token function">entry</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">or_insert</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span> <span class="token operator">+=</span> <span class="token number">1</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> mcv<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span>_<span class="token operator">></span> <span class="token operator">=</span> value_counts<span class="token punctuation">.</span><span class="token function">into_iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        mcv<span class="token punctuation">.</span><span class="token function">sort_by</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>a<span class="token punctuation">,</span> b<span class="token closure-punctuation punctuation">|</span></span> b<span class="token number">.1</span><span class="token punctuation">.</span><span class="token function">cmp</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>a<span class="token number">.1</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>  <span class="token comment">// Sort by count descending</span>        mcv<span class="token punctuation">.</span><span class="token function">truncate</span><span class="token punctuation">(</span>num_mcv<span class="token punctuation">)</span><span class="token punctuation">;</span>  <span class="token comment">// Keep only the top num_mcv entries so the total below covers MCVs only</span>        <span class="token 
keyword">let</span> total_mcv_frequency <span class="token operator">=</span> mcv<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">map</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span><span class="token punctuation">(</span>_<span class="token punctuation">,</span> c<span class="token punctuation">)</span><span class="token closure-punctuation punctuation">|</span></span> <span class="token operator">*</span>c <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">sum</span><span class="token punctuation">::</span><span class="token operator">&lt;</span><span class="token keyword">f64</span><span class="token operator">></span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">/</span> values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> mcv_values <span class="token operator">=</span> mcv            <span class="token punctuation">.</span><span class="token function">into_iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span>            <span class="token punctuation">.</span><span class="token function">take</span><span class="token punctuation">(</span>num_mcv<span class="token punctuation">)</span>            <span class="token punctuation">.</span><span class="token function">map</span><span 
class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span><span class="token punctuation">(</span>v<span class="token punctuation">,</span> c<span class="token punctuation">)</span><span class="token closure-punctuation punctuation">|</span></span> <span class="token punctuation">(</span>v<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> c <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">/</span> values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">)</span>            <span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token class-name">MCVList</span> <span class="token punctuation">&#123;</span>            values<span class="token punctuation">:</span> mcv_values<span class="token punctuation">,</span>            total_mcv_frequency<span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">/// Estimate selectivity with MCV awareness</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">estimate_selectivity</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> 
op<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">BinaryOperator</span><span class="token punctuation">,</span> value<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Value</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">f64</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Check MCV first</span>        <span class="token keyword">for</span> <span class="token punctuation">(</span>mcv_val<span class="token punctuation">,</span> frequency<span class="token punctuation">)</span> <span class="token keyword">in</span> <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">.</span>mcv_list<span class="token punctuation">.</span>values <span class="token punctuation">&#123;</span>            <span class="token keyword">if</span> mcv_val <span class="token operator">==</span> value <span class="token punctuation">&#123;</span>                <span class="token keyword">match</span> op <span class="token punctuation">&#123;</span>                    <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Eq</span> <span class="token operator">=></span> <span class="token keyword">return</span> <span class="token operator">*</span>frequency<span class="token punctuation">,</span>                    <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Neq</span> <span class="token operator">=></span> <span class="token keyword">return</span> <span class="token number">1.0</span> <span class="token operator">-</span> frequency<span class="token punctuation">,</span>                    _ <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                       
 <span class="token comment">// For range ops, an exact MCV match doesn't help much,</span>                        <span class="token comment">// so stop scanning and fall through to histogram estimation</span>                        <span class="token keyword">break</span><span class="token punctuation">;</span>                    <span class="token punctuation">&#125;</span>                <span class="token punctuation">&#125;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>        <span class="token comment">// No exact MCV answer (or a range operator), use histogram</span>        <span class="token keyword">let</span> histogram_selectivity <span class="token operator">=</span> <span class="token keyword">match</span> op <span class="token punctuation">&#123;</span>            <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Eq</span> <span class="token operator">=></span> <span class="token keyword">self</span><span class="token punctuation">.</span>histogram<span class="token punctuation">.</span><span class="token function">estimate_eq</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Gt</span> <span class="token operator">=></span> <span class="token keyword">self</span><span class="token punctuation">.</span>histogram<span class="token punctuation">.</span><span class="token function">estimate_gt</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Lt</span> <span class="token operator">=></span> <span class="token keyword">self</span><span class="token punctuation">.</span>histogram<span class="token punctuation">.</span><span 
class="token punctuation">.</span>histogram<span class="token punctuation">.</span><span class="token function">estimate_lt</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Gte</span> <span class="token operator">=></span> <span class="token keyword">self</span><span class="token punctuation">.</span>histogram<span class="token punctuation">.</span><span class="token function">estimate_gte</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Lte</span> <span class="token operator">=></span> <span class="token number">1.0</span> <span class="token operator">-</span> <span class="token keyword">self</span><span class="token punctuation">.</span>histogram<span class="token punctuation">.</span><span class="token function">estimate_gt</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">,</span>            _ <span class="token operator">=></span> <span class="token number">0.5</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token comment">// Adjust for non-MCV portion</span>        <span class="token keyword">let</span> non_mcv_fraction <span class="token operator">=</span> <span class="token number">1.0</span> <span class="token operator">-</span> <span class="token keyword">self</span><span class="token punctuation">.</span>mcv_list<span class="token punctuation">.</span>total_mcv_frequency<span class="token punctuation">;</span>        histogram_selectivity <span 
class="token operator">*</span> non_mcv_fraction    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Pros:</strong> Best of both worlds, with exact frequencies for common values and a well-distributed histogram for the rest.</p><p><strong>Cons:</strong> More complex, and requires more memory.</p><hr /><h2 id="3-大型表的取樣">3 Sampling for Large Tables</h2><h3 id="問題：完整表掃描很昂貴">The Problem: Full Table Scans Are Expensive</h3><pre class="language-sql" data-language="sql"><code class="language-sql"><span class="token comment">-- Table: users (100 million rows)</span><span class="token comment">-- Building histogram requires sorting all values</span><span class="token comment">-- Full scan approach:</span><span class="token keyword">SELECT</span> balance <span class="token keyword">FROM</span> users <span class="token keyword">ORDER</span> <span class="token keyword">BY</span> balance<span class="token punctuation">;</span><span class="token comment">-- Time: 30 minutes (!)</span><span class="token comment">-- I/O: Read entire table</span><span class="token comment">-- Not practical for routine ANALYZE</span></code></pre><h3 id="解決方案：統計取樣">The Solution: Statistical Sampling</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/optimizer/sampling.rs</span><span class="token keyword">use</span> <span class="token namespace">rand<span class="token punctuation">::</span>seq<span class="token punctuation">::</span></span><span class="token class-name">SliceRandom</span><span class="token punctuation">;</span><span class="token keyword">use</span> <span class="token namespace">rand<span class="token punctuation">::</span></span>thread_rng<span class="token punctuation">;</span><span class="token keyword">use</span> <span class="token namespace">rand<span class="token punctuation">::</span></span><span class="token class-name">Rng</span><span class="token punctuation">;</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Sampler</span> <span class="token punctuation">&#123;</span>    sample_size<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>    confidence<span class="token 
punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">Sampler</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">new</span><span class="token punctuation">(</span>sample_size<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span> confidence<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span> sample_size<span class="token punctuation">,</span> confidence <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">/// Reservoir sampling for streaming data</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">reservoir_sample</span><span class="token operator">&lt;</span><span class="token class-name">T</span><span class="token punctuation">:</span> <span class="token class-name">Clone</span><span class="token operator">></span><span class="token punctuation">(</span>        <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span>        stream<span class="token punctuation">:</span> <span class="token keyword">impl</span> <span class="token class-name">Iterator</span><span class="token operator">&lt;</span><span class="token class-name">Item</span> <span class="token operator">=</span> <span class="token 
class-name">T</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">T</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> reservoir <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">with_capacity</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>sample_size<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> count <span class="token operator">=</span> <span class="token number">0</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> item <span class="token keyword">in</span> stream <span class="token punctuation">&#123;</span>            count <span class="token operator">+=</span> <span class="token number">1</span><span class="token punctuation">;</span>            <span class="token keyword">if</span> reservoir<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">&lt;</span> <span class="token keyword">self</span><span class="token punctuation">.</span>sample_size <span class="token punctuation">&#123;</span>                reservoir<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span>item<span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token 
punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>                <span class="token comment">// Replace with probability sample_size / count</span>                <span class="token keyword">let</span> j <span class="token operator">=</span> <span class="token function">thread_rng</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">gen_range</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">..</span>count<span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">if</span> j <span class="token operator">&lt;</span> <span class="token keyword">self</span><span class="token punctuation">.</span>sample_size <span class="token punctuation">&#123;</span>                    reservoir<span class="token punctuation">[</span>j<span class="token punctuation">]</span> <span class="token operator">=</span> item<span class="token punctuation">;</span>                <span class="token punctuation">&#125;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>        reservoir    <span class="token punctuation">&#125;</span>    <span class="token comment">/// Simple random sampling for in-memory data</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">simple_random_sample</span><span class="token operator">&lt;</span><span class="token class-name">T</span><span class="token punctuation">:</span> <span class="token class-name">Clone</span><span class="token operator">></span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> values<span 
class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">T</span><span class="token punctuation">]</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">T</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> rng <span class="token operator">=</span> <span class="token function">thread_rng</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> sample<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">T</span><span class="token operator">></span> <span class="token operator">=</span> values<span class="token punctuation">.</span><span class="token function">to_vec</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        sample<span class="token punctuation">.</span><span class="token function">shuffle</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> rng<span class="token punctuation">)</span><span class="token punctuation">;</span>        sample<span class="token punctuation">.</span><span class="token function">truncate</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>sample_size<span class="token punctuation">)</span><span class="token punctuation">;</span>        sample    <span class="token punctuation">&#125;</span>    <span 
class="token comment">/// Calculate confidence interval for estimate</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">confidence_interval</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> sample_proportion<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span> sample_size<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token punctuation">(</span><span class="token keyword">f64</span><span class="token punctuation">,</span> <span class="token keyword">f64</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// z-score for confidence level (1.96 for 95%)</span>        <span class="token keyword">let</span> z <span class="token operator">=</span> <span class="token keyword">match</span> <span class="token keyword">self</span><span class="token punctuation">.</span>confidence <span class="token punctuation">&#123;</span>            <span class="token number">0.99</span> <span class="token operator">=></span> <span class="token number">2.576</span><span class="token punctuation">,</span>            <span class="token number">0.95</span> <span class="token operator">=></span> <span class="token number">1.96</span><span class="token punctuation">,</span>            <span class="token number">0.90</span> <span class="token operator">=></span> <span class="token number">1.645</span><span class="token punctuation">,</span>            _ <span class="token operator">=></span> <span class="token number">1.96</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span 
class="token punctuation">;</span>        <span class="token keyword">let</span> standard_error <span class="token operator">=</span> <span class="token punctuation">(</span>sample_proportion <span class="token operator">*</span> <span class="token punctuation">(</span><span class="token number">1.0</span> <span class="token operator">-</span> sample_proportion<span class="token punctuation">)</span> <span class="token operator">/</span> sample_size <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">sqrt</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> margin <span class="token operator">=</span> z <span class="token operator">*</span> standard_error<span class="token punctuation">;</span>        <span class="token punctuation">(</span>sample_proportion <span class="token operator">-</span> margin<span class="token punctuation">,</span> sample_proportion <span class="token operator">+</span> margin<span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span><span class="token comment">// Usage in histogram building</span><span class="token keyword">impl</span> <span class="token class-name">EquiDepthHistogram</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">build_from_sample</span><span class="token punctuation">(</span>        values<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Value</span><span class="token punctuation">]</span><span class="token punctuation">,</span>        num_buckets<span 
class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>        sample_size<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>    <span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token keyword">Self</span><span class="token punctuation">,</span> <span class="token class-name">HistogramError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> sampler <span class="token operator">=</span> <span class="token class-name">Sampler</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>sample_size<span class="token punctuation">,</span> <span class="token number">0.95</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> sample <span class="token operator">=</span> sampler<span class="token punctuation">.</span><span class="token function">simple_random_sample</span><span class="token punctuation">(</span>values<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token comment">// Build histogram from sample</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> histogram <span class="token operator">=</span> <span class="token keyword">Self</span><span class="token punctuation">::</span><span class="token function">build</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>sample<span class="token punctuation">,</span> num_buckets<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span 
class="token comment">// Scale row counts to match full table</span>        <span class="token keyword">let</span> scale_factor <span class="token operator">=</span> values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">/</span> sample<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> bucket <span class="token keyword">in</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> histogram<span class="token punctuation">.</span>buckets <span class="token punctuation">&#123;</span>            bucket<span class="token punctuation">.</span>row_count <span class="token operator">=</span> <span class="token punctuation">(</span>bucket<span class="token punctuation">.</span>row_count <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> scale_factor<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">round</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">;</span>            bucket<span class="token punctuation">.</span>cumulative_count <span class="token operator">=</span> <span class="token punctuation">(</span>bucket<span class="token punctuation">.</span>cumulative_count <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> scale_factor<span 
class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">round</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        histogram<span class="token punctuation">.</span>total_rows <span class="token operator">=</span> values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span>histogram<span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><h3 id="取樣大小指南">Sample Size Guidelines</h3><table><thead><tr><th>Table size</th><th>Recommended sample</th><th>Accuracy (95% CI)</th></tr></thead><tbody><tr><td>&lt; 10K rows</td><td>100% (full scan)</td><td>Exact</td></tr><tr><td>10K-1M rows</td><td>10%</td><td>±1%</td></tr><tr><td>1M-100M rows</td><td>1%</td><td>±0.1%</td></tr><tr><td>&gt; 100M rows</td><td>0.1% or 100K rows</td><td>±0.03%</td></tr></tbody></table><hr /><h2 id="4-多欄位統計">4 Multi-Column Statistics</h2><h3 id="問題：欄位相關性">Problem: Column Correlation</h3><pre class="language-sql" data-language="sql"><code class="language-sql"><span class="token comment">-- Table: users (1,000,000 rows)</span><span class="token comment">-- Columns: country, city</span><span class="token comment">-- Individual column statistics:</span><span class="token comment">-- country: 'US' = 50%, 'UK' = 30%, 'DE' = 20%</span><span class="token comment">-- city: 'London' = 5%, 'Berlin' = 10%, 'Munich' = 5%</span><span class="token comment">-- Query:</span><span class="token keyword">SELECT</span> <span class="token 
operator">*</span> <span class="token keyword">FROM</span> users <span class="token keyword">WHERE</span> country <span class="token operator">=</span> <span class="token string">'UK'</span> <span class="token operator">AND</span> city <span class="token operator">=</span> <span class="token string">'London'</span><span class="token punctuation">;</span><span class="token comment">-- Optimizer (assuming independence):</span><span class="token comment">-- Selectivity = 0.30 × 0.05 = 0.015 = 1.5%</span><span class="token comment">-- Estimated rows: 15,000</span><span class="token comment">-- Reality (London only exists in UK):</span><span class="token comment">-- Actual selectivity = 5% (all London users are in UK)</span><span class="token comment">-- Actual rows: 50,000</span><span class="token comment">-- Error: 3.3x underestimate!</span></code></pre><h3 id="多欄位直方圖">Multi-Column Histograms</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token attribute attr-name">#[derive(Debug, Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">MultiColumnHistogram</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> columns<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> num_buckets<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> total_rows<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> buckets<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token 
operator">&lt;</span><span class="token class-name">MultiColumnBucket</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">MultiColumnBucket</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> bounds<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token class-name">Value</span><span class="token punctuation">,</span> <span class="token class-name">Value</span><span class="token punctuation">)</span><span class="token operator">></span><span class="token punctuation">,</span>  <span class="token comment">// (min, max) for each column</span>    <span class="token keyword">pub</span> row_count<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> distinct_count<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">MultiColumnHistogram</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">build</span><span class="token punctuation">(</span>        data<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Value</span><span class="token 
operator">></span><span class="token punctuation">]</span><span class="token punctuation">,</span>  <span class="token comment">// Each row is [col1_value, col2_value, ...]</span>        columns<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token operator">></span><span class="token punctuation">,</span>        num_buckets<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>    <span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token keyword">Self</span><span class="token punctuation">,</span> <span class="token class-name">HistogramError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">if</span> data<span class="token punctuation">.</span><span class="token function">is_empty</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">return</span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">HistogramError</span><span class="token punctuation">::</span><span class="token class-name">EmptyData</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token comment">// Use k-d tree style partitioning for multi-dimensional buckets</span>        <span class="token keyword">let</span> buckets <span class="token operator">=</span> <span class="token keyword">Self</span><span class="token punctuation">::</span><span class="token function">build_kd_histogram</span><span class="token 
punctuation">(</span>data<span class="token punctuation">,</span> num_buckets<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>            columns<span class="token punctuation">,</span>            num_buckets<span class="token punctuation">,</span>            total_rows<span class="token punctuation">:</span> data<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>            buckets<span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">build_kd_histogram</span><span class="token punctuation">(</span>        data<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Value</span><span class="token operator">></span><span class="token punctuation">]</span><span class="token punctuation">,</span>        num_buckets<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>    <span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token 
class-name">MultiColumnBucket</span><span class="token operator">></span><span class="token punctuation">,</span> <span class="token class-name">HistogramError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Simplified: cycle through dimensions by depth rather than picking the highest-variance one</span>        <span class="token keyword">let</span> target_bucket_size <span class="token operator">=</span> data<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">/</span> num_buckets<span class="token punctuation">;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> buckets <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">Self</span><span class="token punctuation">::</span><span class="token function">partition_data</span><span class="token punctuation">(</span>data<span class="token punctuation">,</span> <span class="token number">0</span><span class="token punctuation">,</span> num_buckets<span class="token punctuation">,</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> buckets<span class="token punctuation">,</span> target_bucket_size<span 
class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span>buckets<span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">partition_data</span><span class="token punctuation">(</span>        data<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Value</span><span class="token operator">></span><span class="token punctuation">]</span><span class="token punctuation">,</span>        depth<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>        remaining_buckets<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>        buckets<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">MultiColumnBucket</span><span class="token operator">></span><span class="token punctuation">,</span>        target_size<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>    <span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">HistogramError</span><span class="token operator">></span> <span class="token 
punctuation">&#123;</span>        <span class="token keyword">if</span> remaining_buckets <span class="token operator">==</span> <span class="token number">1</span> <span class="token operator">||</span> data<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">&lt;=</span> target_size <span class="token punctuation">&#123;</span>            <span class="token comment">// Create leaf bucket</span>            <span class="token keyword">let</span> bucket <span class="token operator">=</span> <span class="token keyword">Self</span><span class="token punctuation">::</span><span class="token function">create_bucket</span><span class="token punctuation">(</span>data<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>            buckets<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span>bucket<span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token keyword">return</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token comment">// Split on the next dimension in round-robin order (depth % number of columns)</span>        <span class="token keyword">let</span> dim <span class="token operator">=</span> depth <span class="token operator">%</span> data<span class="token punctuation">[</span><span class="token number">0</span><span class="token punctuation">]</span><span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span 
class="token punctuation">;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> sorted <span class="token operator">=</span> data<span class="token punctuation">.</span><span class="token function">to_vec</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        sorted<span class="token punctuation">.</span><span class="token function">sort_by</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>a<span class="token punctuation">,</span> b<span class="token closure-punctuation punctuation">|</span></span> a<span class="token punctuation">[</span>dim<span class="token punctuation">]</span><span class="token punctuation">.</span><span class="token function">cmp</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>b<span class="token punctuation">[</span>dim<span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> mid <span class="token operator">=</span> sorted<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">/</span> <span class="token number">2</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> <span class="token punctuation">(</span>left<span class="token punctuation">,</span> right<span class="token punctuation">)</span> <span class="token operator">=</span> sorted<span class="token punctuation">.</span><span class="token function">split_at</span><span class="token punctuation">(</span>mid<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token 
keyword">Self</span><span class="token punctuation">::</span><span class="token function">partition_data</span><span class="token punctuation">(</span>left<span class="token punctuation">,</span> depth <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">,</span> remaining_buckets <span class="token operator">/</span> <span class="token number">2</span><span class="token punctuation">,</span> buckets<span class="token punctuation">,</span> target_size<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token keyword">Self</span><span class="token punctuation">::</span><span class="token function">partition_data</span><span class="token punctuation">(</span>right<span class="token punctuation">,</span> depth <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">,</span> remaining_buckets <span class="token operator">-</span> remaining_buckets <span class="token operator">/</span> <span class="token number">2</span><span class="token punctuation">,</span> buckets<span class="token punctuation">,</span> target_size<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">create_bucket</span><span class="token punctuation">(</span>data<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Value</span><span 
class="token operator">></span><span class="token punctuation">]</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">MultiColumnBucket</span><span class="token punctuation">,</span> <span class="token class-name">HistogramError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> num_dims <span class="token operator">=</span> data<span class="token punctuation">[</span><span class="token number">0</span><span class="token punctuation">]</span><span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> bounds <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> dim <span class="token keyword">in</span> <span class="token number">0</span><span class="token punctuation">..</span>num_dims <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> <span class="token keyword">mut</span> values<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span>_<span class="token operator">></span> <span class="token operator">=</span> data<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token 
function">map</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>row<span class="token closure-punctuation punctuation">|</span></span> <span class="token operator">&amp;</span>row<span class="token punctuation">[</span>dim<span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            values<span class="token punctuation">.</span><span class="token function">sort</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            bounds<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token punctuation">(</span>values<span class="token punctuation">.</span><span class="token function">first</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">unwrap</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> values<span class="token punctuation">.</span><span class="token function">last</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">unwrap</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token 
punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token keyword">let</span> distinct_count <span class="token operator">=</span> data<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">::</span><span class="token operator">&lt;</span><span class="token namespace">std<span class="token punctuation">::</span>collections<span class="token punctuation">::</span></span><span class="token class-name">HashSet</span><span class="token operator">&lt;</span>_<span class="token operator">>></span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">MultiColumnBucket</span> <span class="token punctuation">&#123;</span>            bounds<span class="token punctuation">,</span>            row_count<span class="token punctuation">:</span> data<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>            distinct_count<span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>    <span 
class="token punctuation">&#125;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">estimate_selectivity</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> predicates<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token punctuation">(</span><span class="token keyword">usize</span><span class="token punctuation">,</span> <span class="token class-name">BinaryOperator</span><span class="token punctuation">,</span> <span class="token class-name">Value</span><span class="token punctuation">)</span><span class="token punctuation">]</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">f64</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> selectivity <span class="token operator">=</span> <span class="token number">0.0</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> bucket <span class="token keyword">in</span> <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">.</span>buckets <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> bucket_selectivity <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">estimate_bucket_selectivity</span><span class="token punctuation">(</span>bucket<span class="token punctuation">,</span> predicates<span class="token punctuation">)</span><span class="token punctuation">;</span>            selectivity <span class="token operator">+=</span> bucket_selectivity <span class="token 
operator">*</span> <span class="token punctuation">(</span>bucket<span class="token punctuation">.</span>row_count <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">/</span> <span class="token keyword">self</span><span class="token punctuation">.</span>total_rows <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        selectivity    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">estimate_bucket_selectivity</span><span class="token punctuation">(</span>        <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span>        bucket<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">MultiColumnBucket</span><span class="token punctuation">,</span>        predicates<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token punctuation">(</span><span class="token keyword">usize</span><span class="token punctuation">,</span> <span class="token class-name">BinaryOperator</span><span class="token punctuation">,</span> <span class="token class-name">Value</span><span class="token punctuation">)</span><span class="token punctuation">]</span><span class="token punctuation">,</span>    <span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">f64</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> bucket_selectivity <span class="token operator">=</span> <span class="token number">1.0</span><span class="token punctuation">;</span>        
<span class="token keyword">for</span> <span class="token punctuation">(</span>col_idx<span class="token punctuation">,</span> op<span class="token punctuation">,</span> value<span class="token punctuation">)</span> <span class="token keyword">in</span> predicates <span class="token punctuation">&#123;</span>            <span class="token keyword">if</span> <span class="token operator">*</span>col_idx <span class="token operator">>=</span> bucket<span class="token punctuation">.</span>bounds<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">continue</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>            <span class="token keyword">let</span> <span class="token punctuation">(</span>min<span class="token punctuation">,</span> max<span class="token punctuation">)</span> <span class="token operator">=</span> <span class="token operator">&amp;</span>bucket<span class="token punctuation">.</span>bounds<span class="token punctuation">[</span><span class="token operator">*</span>col_idx<span class="token punctuation">]</span><span class="token punctuation">;</span>            <span class="token keyword">let</span> col_selectivity <span class="token operator">=</span> <span class="token keyword">match</span> op <span class="token punctuation">&#123;</span>                <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Eq</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                    <span class="token keyword">if</span> value <span class="token operator">>=</span> min <span class="token operator">&amp;&amp;</span> value <span class="token operator">&lt;=</span> max <span class="token 
punctuation">&#123;</span>                        <span class="token number">1.0</span> <span class="token operator">/</span> bucket<span class="token punctuation">.</span>distinct_count<span class="token punctuation">.</span><span class="token function">max</span><span class="token punctuation">(</span><span class="token number">1</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span>                    <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>                        <span class="token number">0.0</span>                    <span class="token punctuation">&#125;</span>                <span class="token punctuation">&#125;</span>                <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Gt</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                    <span class="token keyword">if</span> value <span class="token operator">&lt;</span> min <span class="token punctuation">&#123;</span>                        <span class="token number">1.0</span>                    <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token keyword">if</span> value <span class="token operator">>=</span> max <span class="token punctuation">&#123;</span>                        <span class="token number">0.0</span>                    <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>                        <span class="token keyword">let</span> range <span class="token operator">=</span> max<span class="token punctuation">.</span><span class="token function">subtract</span><span class="token punctuation">(</span>min<span class="token punctuation">)</span><span class="token 
punctuation">.</span><span class="token function">to_f64</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                        <span class="token keyword">let</span> matching <span class="token operator">=</span> max<span class="token punctuation">.</span><span class="token function">subtract</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_f64</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                        <span class="token punctuation">(</span>matching <span class="token operator">/</span> range<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">clamp</span><span class="token punctuation">(</span><span class="token number">0.0</span><span class="token punctuation">,</span> <span class="token number">1.0</span><span class="token punctuation">)</span>                    <span class="token punctuation">&#125;</span>                <span class="token punctuation">&#125;</span>                <span class="token comment">// ... 
handle other operators</span>                _ <span class="token operator">=></span> <span class="token number">0.5</span><span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>            bucket_selectivity <span class="token operator">*=</span> col_selectivity<span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        bucket_selectivity    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><p><strong>When to use multi-column statistics:</strong></p><table><thead><tr><th>Scenario</th><th>Recommendation</th></tr></thead><tbody><tr><td>Columns are independent</td><td>Single-column histograms</td></tr><tr><td>Strongly correlated columns (city → country)</td><td>Multi-column histogram</td></tr><tr><td>Columns frequently filtered together in WHERE</td><td>Multi-column histogram</td></tr><tr><td>High-cardinality column combination</td><td>May not be worth the storage and maintenance cost</td></tr></tbody></table><hr /><h2 id="5-在最佳化器中使用直方圖">5 Using Histograms in the Optimizer</h2><h3 id="與成本模型整合">Integrating with the Cost Model</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/optimizer/histogram_selectivity.rs</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">HistogramSelectivityEstimator</span> <span class="token punctuation">&#123;</span>    statistics<span class="token punctuation">:</span> <span class="token class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">StatisticsCatalog</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">HistogramSelectivityEstimator</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token 
function-definition function">new</span><span class="token punctuation">(</span>statistics<span class="token punctuation">:</span> <span class="token class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">StatisticsCatalog</span><span class="token operator">></span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span> statistics <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">estimate</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> table<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">,</span> column<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">,</span> expr<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Expression</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">f64</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> table_stats <span class="token operator">=</span> <span class="token keyword">match</span> <span class="token keyword">self</span><span class="token punctuation">.</span>statistics<span class="token punctuation">.</span><span class="token function">get_table</span><span class="token punctuation">(</span>table<span class="token 
punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Some</span><span class="token punctuation">(</span>stats<span class="token punctuation">)</span> <span class="token operator">=></span> stats<span class="token punctuation">,</span>            <span class="token class-name">None</span> <span class="token operator">=></span> <span class="token keyword">return</span> <span class="token number">0.5</span><span class="token punctuation">,</span>  <span class="token comment">// No stats, use default</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> col_stats <span class="token operator">=</span> <span class="token keyword">match</span> table_stats<span class="token punctuation">.</span>columns<span class="token punctuation">.</span><span class="token function">get</span><span class="token punctuation">(</span>column<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Some</span><span class="token punctuation">(</span>stats<span class="token punctuation">)</span> <span class="token operator">=></span> stats<span class="token punctuation">,</span>            <span class="token class-name">None</span> <span class="token operator">=></span> <span class="token keyword">return</span> <span class="token number">0.5</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token keyword">match</span> <span class="token operator">&amp;</span>col_stats<span class="token punctuation">.</span>histogram <span class="token punctuation">&#123;</span>            <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Histogram</span><span class="token punctuation">::</span><span class="token 
class-name">Compressed</span><span class="token punctuation">(</span>comp<span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">match</span> expr <span class="token punctuation">&#123;</span>                    <span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span> op<span class="token punctuation">,</span> right<span class="token punctuation">,</span> <span class="token punctuation">..</span> <span class="token punctuation">&#125;</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                        comp<span class="token punctuation">.</span><span class="token function">estimate_selectivity</span><span class="token punctuation">(</span>op<span class="token punctuation">,</span> right<span class="token punctuation">)</span>                    <span class="token punctuation">&#125;</span>                    <span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">InList</span> <span class="token punctuation">&#123;</span> list<span class="token punctuation">,</span> <span class="token punctuation">..</span> <span class="token punctuation">&#125;</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                        <span class="token comment">// Sum selectivity for each value</span>                        list<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span>                            <span class="token punctuation">.</span><span class="token function">map</span><span class="token punctuation">(</span><span class="token 
closure-params"><span class="token closure-punctuation punctuation">|</span>v<span class="token closure-punctuation punctuation">|</span></span> comp<span class="token punctuation">.</span><span class="token function">estimate_selectivity</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Eq</span><span class="token punctuation">,</span> v<span class="token punctuation">)</span><span class="token punctuation">)</span>                            <span class="token punctuation">.</span><span class="token function">sum</span><span class="token punctuation">::</span><span class="token operator">&lt;</span><span class="token keyword">f64</span><span class="token operator">></span><span class="token punctuation">(</span><span class="token punctuation">)</span>                            <span class="token punctuation">.</span><span class="token function">min</span><span class="token punctuation">(</span><span class="token number">1.0</span><span class="token punctuation">)</span>  <span class="token comment">// Cap at 1.0: summed estimates can overshoot</span>                    <span class="token punctuation">&#125;</span>                    <span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">Between</span> <span class="token punctuation">&#123;</span> low<span class="token punctuation">,</span> high<span class="token punctuation">,</span> <span class="token punctuation">..</span> <span class="token punctuation">&#125;</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                        <span class="token keyword">let</span> low_sel <span class="token operator">=</span> comp<span class="token punctuation">.</span><span class="token function">estimate_selectivity</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Gte</span><span class="token punctuation">,</span> low<span class="token punctuation">)</span><span class="token punctuation">;</span>                        <span class="token keyword">let</span> high_sel <span class="token operator">=</span> comp<span class="token 
punctuation">.</span><span class="token function">estimate_selectivity</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Lte</span><span class="token punctuation">,</span> high<span class="token punctuation">)</span><span class="token punctuation">;</span>                        <span class="token punctuation">(</span>low_sel <span class="token operator">+</span> high_sel <span class="token operator">-</span> <span class="token number">1.0</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">max</span><span class="token punctuation">(</span><span class="token number">0.0</span><span class="token punctuation">)</span>                    <span class="token punctuation">&#125;</span>                    _ <span class="token operator">=></span> <span class="token number">0.5</span><span class="token punctuation">,</span>                <span class="token punctuation">&#125;</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Histogram</span><span class="token punctuation">::</span><span class="token class-name">EquiDepth</span><span class="token punctuation">(</span>equi<span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">match</span> expr <span class="token punctuation">&#123;</span>                    <span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span> op<span class="token punctuation">,</span> right<span class="token 
punctuation">,</span> <span class="token punctuation">..</span> <span class="token punctuation">&#125;</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                        <span class="token keyword">match</span> op <span class="token punctuation">&#123;</span>                            <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Eq</span> <span class="token operator">=></span> equi<span class="token punctuation">.</span><span class="token function">estimate_eq</span><span class="token punctuation">(</span>right<span class="token punctuation">)</span><span class="token punctuation">,</span>                            <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Gt</span> <span class="token operator">=></span> equi<span class="token punctuation">.</span><span class="token function">estimate_gt</span><span class="token punctuation">(</span>right<span class="token punctuation">)</span><span class="token punctuation">,</span>                            <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Lt</span> <span class="token operator">=></span> equi<span class="token punctuation">.</span><span class="token function">estimate_lt</span><span class="token punctuation">(</span>right<span class="token punctuation">)</span><span class="token punctuation">,</span>                            <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Gte</span> <span class="token operator">=></span> equi<span class="token punctuation">.</span><span class="token function">estimate_gte</span><span class="token punctuation">(</span>right<span class="token punctuation">)</span><span class="token punctuation">,</span>                       
     <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Lte</span> <span class="token operator">=></span> <span class="token number">1.0</span> <span class="token operator">-</span> equi<span class="token punctuation">.</span><span class="token function">estimate_gt</span><span class="token punctuation">(</span>right<span class="token punctuation">)</span><span class="token punctuation">,</span>                            _ <span class="token operator">=></span> <span class="token number">0.5</span><span class="token punctuation">,</span>                        <span class="token punctuation">&#125;</span>                    <span class="token punctuation">&#125;</span>                    _ <span class="token operator">=></span> <span class="token number">0.5</span><span class="token punctuation">,</span>                <span class="token punctuation">&#125;</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">None</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token comment">// No histogram, fall back to MCV or default</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">estimate_from_mcv</span><span class="token punctuation">(</span>col_stats<span class="token punctuation">,</span> expr<span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">estimate_from_mcv</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> col_stats<span class="token 
punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">ColumnStatistics</span><span class="token punctuation">,</span> expr<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Expression</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">f64</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">match</span> expr <span class="token punctuation">&#123;</span>            <span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span> op<span class="token punctuation">:</span> <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Eq</span><span class="token punctuation">,</span> right<span class="token punctuation">,</span> <span class="token punctuation">..</span> <span class="token punctuation">&#125;</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token comment">// Check if value is in MCV list</span>                <span class="token keyword">for</span> <span class="token punctuation">(</span>mcv_val<span class="token punctuation">,</span> frequency<span class="token punctuation">)</span> <span class="token keyword">in</span> <span class="token operator">&amp;</span>col_stats<span class="token punctuation">.</span>most_common_values <span class="token punctuation">&#123;</span>                    <span class="token keyword">if</span> mcv_val <span class="token operator">==</span> right <span class="token punctuation">&#123;</span>                        <span class="token keyword">return</span> <span class="token operator">*</span>frequency<span class="token punctuation">;</span>                    <span 
class="token punctuation">&#125;</span>                <span class="token punctuation">&#125;</span>                <span class="token comment">// Not in MCV, estimate based on distinct count</span>                <span class="token number">1.0</span> <span class="token operator">/</span> col_stats<span class="token punctuation">.</span>distinct_count<span class="token punctuation">.</span><span class="token function">max</span><span class="token punctuation">(</span><span class="token number">1</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span>            <span class="token punctuation">&#125;</span>            _ <span class="token operator">=></span> <span class="token number">0.5</span><span class="token punctuation">,</span>  <span class="token comment">// Default for other operators</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span><span class="token comment">// Updated cost model using histograms</span><span class="token keyword">impl</span> <span class="token class-name">CostModel</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">estimate_selectivity_with_histograms</span><span class="token punctuation">(</span>        <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span>        estimator<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">HistogramSelectivityEstimator</span><span class="token punctuation">,</span>        table<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">,</span>        expr<span class="token 
punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Expression</span><span class="token punctuation">,</span>    <span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">f64</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">match</span> expr <span class="token punctuation">&#123;</span>            <span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span> op<span class="token punctuation">,</span> left<span class="token punctuation">,</span> right<span class="token punctuation">,</span> <span class="token punctuation">..</span> <span class="token punctuation">&#125;</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token comment">// Find which side is a column reference</span>                <span class="token keyword">if</span> <span class="token keyword">let</span> <span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">Identifier</span><span class="token punctuation">(</span>ident<span class="token punctuation">)</span> <span class="token operator">=</span> left<span class="token punctuation">.</span><span class="token function">as_ref</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                    <span class="token keyword">return</span> estimator<span class="token punctuation">.</span><span class="token function">estimate</span><span class="token punctuation">(</span>table<span class="token punctuation">,</span> <span class="token operator">&amp;</span>ident<span class="token punctuation">.</span>value<span class="token punctuation">,</span> <span class="token operator">&amp;</span><span 
class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span>                        op<span class="token punctuation">:</span> op<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                        left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">Identifier</span><span class="token punctuation">(</span>ident<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                        right<span class="token punctuation">:</span> right<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                    <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token punctuation">&#125;</span>                <span class="token comment">// Default estimate</span>                <span class="token number">0.5</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">InList</span> <span class="token punctuation">&#123;</span> 
expr<span class="token punctuation">,</span> list<span class="token punctuation">,</span> negated <span class="token punctuation">&#125;</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">let</span> base_selectivity <span class="token operator">=</span> list<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span>                    <span class="token punctuation">.</span><span class="token function">map</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>v<span class="token closure-punctuation punctuation">|</span></span> estimator<span class="token punctuation">.</span><span class="token function">estimate</span><span class="token punctuation">(</span>table<span class="token punctuation">,</span> <span class="token string">"column"</span><span class="token punctuation">,</span> v<span class="token punctuation">)</span><span class="token punctuation">)</span>                    <span class="token punctuation">.</span><span class="token function">sum</span><span class="token punctuation">::</span><span class="token operator">&lt;</span><span class="token keyword">f64</span><span class="token operator">></span><span class="token punctuation">(</span><span class="token punctuation">)</span>                    <span class="token punctuation">.</span><span class="token function">min</span><span class="token punctuation">(</span><span class="token number">1.0</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">if</span> <span class="token operator">*</span>negated <span class="token punctuation">&#123;</span> <span class="token number">1.0</span> <span class="token operator">-</span> base_selectivity <span class="token 
punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span> base_selectivity <span class="token punctuation">&#125;</span>            <span class="token punctuation">&#125;</span>            _ <span class="token operator">=></span> <span class="token number">0.5</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="範例：真實查詢最佳化">Example: Real Query Optimization</h3><pre class="language-sql" data-language="sql"><code class="language-sql"><span class="token comment">-- Table statistics (from ANALYZE):</span>
<span class="token comment">-- users: 1,000,000 rows</span>
<span class="token comment">-- users.balance: histogram shows 80% have balance &lt; $100</span>
<span class="token comment">-- Query:</span>
<span class="token keyword">SELECT</span> <span class="token operator">*</span> <span class="token keyword">FROM</span> users <span class="token keyword">WHERE</span> balance <span class="token operator">></span> <span class="token number">100</span><span class="token punctuation">;</span>
<span class="token comment">-- Without histogram:</span>
<span class="token comment">-- Selectivity estimate: 0.33 (fixed guess)</span>
<span class="token comment">-- Estimated rows: 333,333</span>
<span class="token comment">-- Chosen plan: SeqScan (cost: 5000)</span>
<span class="token comment">-- With histogram:</span>
<span class="token comment">-- Selectivity estimate: 0.20 (from histogram)</span>
<span class="token comment">-- Estimated rows: 200,000</span>
<span class="token comment">-- Chosen plan: IndexScan (cost: 2000) ← Better!</span></code></pre><hr /><h2 id="6-直方圖維護">6 Histogram Maintenance</h2><h3 id="何時更新統計">When to Update Statistics</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/optimizer/stats_maintenance.rs</span><span class="token keyword">pub</span> <span class="token 
keyword">struct</span> <span class="token type-definition class-name">StatsMaintenancePolicy</span> <span class="token punctuation">&#123;</span>    auto_analyze_threshold<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>  <span class="token comment">// Rows modified before auto-analyze</span>    auto_analyze_scale_factor<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span>  <span class="token comment">// Fraction of table</span>    last_analyze_threshold<span class="token punctuation">:</span> <span class="token namespace">chrono<span class="token punctuation">::</span></span><span class="token class-name">Duration</span><span class="token punctuation">,</span>  <span class="token comment">// Max time since last analyze</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">Default</span> <span class="token keyword">for</span> <span class="token class-name">StatsMaintenancePolicy</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">default</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>            auto_analyze_threshold<span class="token punctuation">:</span> <span class="token number">50</span><span class="token punctuation">,</span>  <span class="token comment">// PostgreSQL default</span>            auto_analyze_scale_factor<span class="token punctuation">:</span> <span class="token number">0.2</span><span class="token punctuation">,</span>  <span class="token comment">// 20% of table</span>            
last_analyze_threshold<span class="token punctuation">:</span> <span class="token namespace">chrono<span class="token punctuation">::</span></span><span class="token class-name">Duration</span><span class="token punctuation">::</span><span class="token function">days</span><span class="token punctuation">(</span><span class="token number">7</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">StatsMaintenanceManager</span> <span class="token punctuation">&#123;</span>    catalog<span class="token punctuation">:</span> <span class="token class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">Catalog</span><span class="token operator">></span><span class="token punctuation">,</span>    policy<span class="token punctuation">:</span> <span class="token class-name">StatsMaintenancePolicy</span><span class="token punctuation">,</span>    modification_counts<span class="token punctuation">:</span> <span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token punctuation">,</span> <span class="token keyword">u64</span><span class="token operator">></span><span class="token punctuation">,</span>  <span class="token comment">// table → rows modified</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">StatsMaintenanceManager</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">check_and_analyze</span><span class="token punctuation">(</span><span 
class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">,</span> table<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token keyword">bool</span><span class="token punctuation">,</span> <span class="token class-name">AnalyzerError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> table_stats <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>catalog<span class="token punctuation">.</span><span class="token function">get_table_statistics</span><span class="token punctuation">(</span>table<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> modified_rows <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>modification_counts<span class="token punctuation">.</span><span class="token function">get</span><span class="token punctuation">(</span>table<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">copied</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">unwrap_or</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> needs_analyze <span class="token operator">=</span> <span class="token keyword">if</span> <span class="token 
keyword">let</span> <span class="token class-name">Some</span><span class="token punctuation">(</span>stats<span class="token punctuation">)</span> <span class="token operator">=</span> table_stats <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> threshold <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>policy<span class="token punctuation">.</span>auto_analyze_threshold                <span class="token operator">+</span> <span class="token punctuation">(</span>stats<span class="token punctuation">.</span>row_count <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>policy<span class="token punctuation">.</span>auto_analyze_scale_factor<span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">;</span>            modified_rows <span class="token operator">>=</span> threshold        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token boolean">true</span>  <span class="token comment">// No stats, need initial analyze</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token keyword">if</span> needs_analyze <span class="token punctuation">&#123;</span>            <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">analyze_table</span><span class="token punctuation">(</span>table<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>            <span class="token keyword">self</span><span class="token 
punctuation">.</span>modification_counts<span class="token punctuation">.</span><span class="token function">insert</span><span class="token punctuation">(</span>table<span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token number">0</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token boolean">true</span><span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token boolean">false</span><span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">record_modification</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">,</span> table<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">,</span> rows_affected<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> count <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>modification_counts<span class="token 
punctuation">.</span><span class="token function">entry</span><span class="token punctuation">(</span>table<span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">or_insert</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token operator">*</span>count <span class="token operator">+=</span> rows_affected<span class="token punctuation">;</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">analyze_table</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> table<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">AnalyzerError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Run ANALYZE on the table</span>        <span class="token keyword">let</span> analyzer <span class="token operator">=</span> <span class="token class-name">StatisticsAnalyzer</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>catalog<span 
class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> stats <span class="token operator">=</span> analyzer<span class="token punctuation">.</span><span class="token function">analyze_table</span><span class="token punctuation">(</span>table<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>catalog<span class="token punctuation">.</span><span class="token function">store_statistics</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>stats<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="增量直方圖更新">Incremental Histogram Updates</h3><p><strong>Problem:</strong> For large tables, a full rebuild is expensive.</p><p><strong>Solution:</strong> Apply incremental updates when the changes are small.</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token keyword">impl</span> <span class="token class-name">EquiDepthHistogram</span> <span class="token punctuation">&#123;</span>    <span class="token comment">/// Update histogram with new values (for small changes)</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">update</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span 
class="token keyword">self</span><span class="token punctuation">,</span> new_values<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Value</span><span class="token punctuation">]</span><span class="token punctuation">,</span> deleted_values<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Value</span><span class="token punctuation">]</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Update total row count (saturating, so excess deletes cannot underflow)</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>total_rows <span class="token operator">+=</span> new_values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>total_rows <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>total_rows<span class="token punctuation">.</span><span class="token function">saturating_sub</span><span class="token punctuation">(</span>deleted_values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token comment">// For each new value, find its bucket and increment count</span>        <span class="token keyword">for</span> value <span class="token keyword">in</span> new_values <span class="token punctuation">&#123;</span>            <span class="token keyword">for</span> bucket <span class="token keyword">in</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token 
keyword">self</span><span class="token punctuation">.</span>buckets <span class="token punctuation">&#123;</span>                <span class="token keyword">if</span> value <span class="token operator">>=</span> <span class="token operator">&amp;</span>bucket<span class="token punctuation">.</span>lower_bound <span class="token operator">&amp;&amp;</span> value <span class="token operator">&lt;=</span> <span class="token operator">&amp;</span>bucket<span class="token punctuation">.</span>upper_bound <span class="token punctuation">&#123;</span>                    bucket<span class="token punctuation">.</span>row_count <span class="token operator">+=</span> <span class="token number">1</span><span class="token punctuation">;</span>                    <span class="token keyword">break</span><span class="token punctuation">;</span>                <span class="token punctuation">&#125;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>        <span class="token comment">// For deleted values, decrement counts</span>        <span class="token keyword">for</span> value <span class="token keyword">in</span> deleted_values <span class="token punctuation">&#123;</span>            <span class="token keyword">for</span> bucket <span class="token keyword">in</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">.</span>buckets <span class="token punctuation">&#123;</span>                <span class="token keyword">if</span> value <span class="token operator">>=</span> <span class="token operator">&amp;</span>bucket<span class="token punctuation">.</span>lower_bound <span class="token operator">&amp;&amp;</span> value 
<span class="token operator">&lt;=</span> <span class="token operator">&amp;</span>bucket<span class="token punctuation">.</span>upper_bound <span class="token punctuation">&#123;</span>                    bucket<span class="token punctuation">.</span>row_count <span class="token operator">=</span> bucket<span class="token punctuation">.</span>row_count<span class="token punctuation">.</span><span class="token function">saturating_sub</span><span class="token punctuation">(</span><span class="token number">1</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                    <span class="token keyword">break</span><span class="token punctuation">;</span>                <span class="token punctuation">&#125;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>        <span class="token comment">// Recompute cumulative_count (assumed to be an inclusive running total of row_count): a change in one bucket shifts every later bucket</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> running <span class="token operator">=</span> <span class="token number">0</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> bucket <span class="token keyword">in</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">.</span>buckets <span class="token punctuation">&#123;</span>            running <span class="token operator">+=</span> bucket<span class="token punctuation">.</span>row_count<span class="token punctuation">;</span>            bucket<span class="token punctuation">.</span>cumulative_count <span class="token operator">=</span> running<span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token comment">// If too many changes, trigger full rebuild</span>        <span class="token keyword">let</span> total_changes <span class="token operator">=</span> new_values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">+</span> deleted_values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">if</span> total_changes <span class="token 
keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">/</span> <span class="token keyword">self</span><span class="token punctuation">.</span>total_rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">></span> <span class="token number">0.1</span> <span class="token punctuation">&#123;</span>            <span class="token comment">// Mark for full rebuild</span>            <span class="token keyword">self</span><span class="token punctuation">.</span>needs_rebuild <span class="token operator">=</span> <span class="token boolean">true</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="7-用-Rust-建構的挑戰">7 Challenges of Building in Rust</h2><h3 id="挑戰-1：值比較">Challenge 1: Value Comparison</h3><p><strong>Problem:</strong> SQL values can have different types (int, float, string, date), and values of different types have no natural ordering.</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ Doesn't work - can't compare different types</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">Value</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">Integer</span><span class="token punctuation">(</span><span class="token keyword">i64</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">Float</span><span class="token punctuation">(</span><span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">String</span><span class="token punctuation">(</span><span class="token class-name">String</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token 
class-name">Date</span><span class="token punctuation">(</span><span class="token namespace">chrono<span class="token punctuation">::</span></span><span class="token class-name">NaiveDate</span><span class="token punctuation">)</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">Ord</span> <span class="token keyword">for</span> <span class="token class-name">Value</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">cmp</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> other<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">Self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Ordering</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Can't compare Integer with String!</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Solution: Type-Aware Comparison</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ✅ Works - handle type mismatches</span><span class="token keyword">impl</span> <span class="token class-name">Value</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">compare</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> other<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token 
keyword">Self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Ordering</span><span class="token punctuation">,</span> <span class="token class-name">ValueError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">match</span> <span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">,</span> other<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token punctuation">(</span><span class="token class-name">Value</span><span class="token punctuation">::</span><span class="token class-name">Integer</span><span class="token punctuation">(</span>a<span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Value</span><span class="token punctuation">::</span><span class="token class-name">Integer</span><span class="token punctuation">(</span>b<span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token class-name">Ok</span><span class="token punctuation">(</span>a<span class="token punctuation">.</span><span class="token function">cmp</span><span class="token punctuation">(</span>b<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token punctuation">(</span><span class="token class-name">Value</span><span class="token punctuation">::</span><span class="token class-name">Float</span><span class="token punctuation">(</span>a<span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Value</span><span class="token punctuation">::</span><span class="token 
class-name">Float</span><span class="token punctuation">(</span>b<span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token operator">=></span> a<span class="token punctuation">.</span><span class="token function">partial_cmp</span><span class="token punctuation">(</span>b<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">ok_or</span><span class="token punctuation">(</span><span class="token class-name">ValueError</span><span class="token punctuation">::</span><span class="token class-name">NaN</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token punctuation">(</span><span class="token class-name">Value</span><span class="token punctuation">::</span><span class="token class-name">String</span><span class="token punctuation">(</span>a<span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Value</span><span class="token punctuation">::</span><span class="token class-name">String</span><span class="token punctuation">(</span>b<span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token class-name">Ok</span><span class="token punctuation">(</span>a<span class="token punctuation">.</span><span class="token function">cmp</span><span class="token punctuation">(</span>b<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token punctuation">(</span><span class="token class-name">Value</span><span class="token punctuation">::</span><span class="token class-name">Integer</span><span class="token punctuation">(</span>a<span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Value</span><span class="token punctuation">::</span><span 
class="token class-name">Float</span><span class="token punctuation">(</span>b<span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">(</span><span class="token operator">*</span>a <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">partial_cmp</span><span class="token punctuation">(</span>b<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">ok_or</span><span class="token punctuation">(</span><span class="token class-name">ValueError</span><span class="token punctuation">::</span><span class="token class-name">NaN</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token punctuation">(</span><span class="token class-name">Value</span><span class="token punctuation">::</span><span class="token class-name">Float</span><span class="token punctuation">(</span>a<span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Value</span><span class="token punctuation">::</span><span class="token class-name">Integer</span><span class="token punctuation">(</span>b<span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token operator">=></span> a<span class="token punctuation">.</span><span class="token function">partial_cmp</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token punctuation">(</span><span class="token operator">*</span>b <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">ok_or</span><span class="token 
punctuation">(</span><span class="token class-name">ValueError</span><span class="token punctuation">::</span><span class="token class-name">NaN</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            _ <span class="token operator">=></span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">ValueError</span><span class="token punctuation">::</span><span class="token class-name">TypeMismatch</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="挑戰-2：記憶體使用">挑戰 2：記憶體使用</h3><p><strong>問題：</strong> 許多欄位的直方圖可能使用大量記憶體。</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ Memory explosion</span><span class="token comment">// 1000 tables × 20 columns × 100 buckets × 32 bytes = 64 MB</span></code></pre><p><strong>解決方案：壓縮桶邊界</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ✅ Compressed representation</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">CompressedBucket</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> lower_bound_idx<span class="token punctuation">:</span> <span class="token keyword">u32</span><span class="token punctuation">,</span>  <span class="token comment">// Index into shared value pool</span>    <span class="token keyword">pub</span> upper_bound_idx<span class="token punctuation">:</span> <span class="token keyword">u32</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> row_count<span class="token punctuation">:</span> <span class="token 
keyword">u32</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">CompressedHistogram</span> <span class="token punctuation">&#123;</span>    shared_values<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Value</span><span class="token operator">></span><span class="token punctuation">,</span>  <span class="token comment">// Deduplicated values</span>    buckets<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">CompressedBucket</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="挑戰-3：執行緒安全統計">挑戰 3：執行緒安全統計</h3><p><strong>問題：</strong> 統計需要在 ANALYZE 更新期間在查詢計劃期間讀取。</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ Race condition</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">StatisticsCatalog</span> <span class="token punctuation">&#123;</span>    tables<span class="token punctuation">:</span> <span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token punctuation">,</span> <span class="token class-name">TableStatistics</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><p><strong>解決方案：Arc&lt;RwLock&lt;&gt;&gt; 用於併發存取</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ✅ Thread-safe</span><span class="token 
keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">StatisticsCatalog</span> <span class="token punctuation">&#123;</span>    tables<span class="token punctuation">:</span> <span class="token class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">RwLock</span><span class="token operator">&lt;</span><span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token punctuation">,</span> <span class="token class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">TableStatistics</span><span class="token operator">>></span><span class="token operator">>></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">StatisticsCatalog</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">get_table</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> name<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">TableStatistics</span><span class="token operator">>></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> tables <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>tables<span 
class="token punctuation">.</span><span class="token function">read</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">unwrap</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        tables<span class="token punctuation">.</span><span class="token function">get</span><span class="token punctuation">(</span>name<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">cloned</span><span class="token punctuation">(</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">update_table</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> stats<span class="token punctuation">:</span> <span class="token class-name">TableStatistics</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> tables <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>tables<span class="token punctuation">.</span><span class="token function">write</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">unwrap</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        tables<span class="token punctuation">.</span><span class="token function">insert</span><span class="token punctuation">(</span>stats<span class="token 
punctuation">.</span>table_name<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Arc</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>stats<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="8-AI-如何加速這項工作">8 AI 如何加速這項工作</h2><h3 id="AI-做對了什麼">AI 做對了什麼</h3><table><thead><tr><th>任務</th><th>AI 貢獻</th></tr></thead><tbody><tr><td><strong>直方圖類型</strong></td><td>解釋等寬 vs. 等深權衡</td></tr><tr><td><strong>MCV 處理</strong></td><td>建議單獨 MCV 串列用於常見值</td></tr><tr><td><strong>取樣</strong></td><td>用於串流的蓄水池取樣演算法</td></tr><tr><td><strong>選擇性公式</strong></td><td>桶內範圍估計的正確公式</td></tr></tbody></table><hr /><h3 id="AI-做錯了什麼">AI 做錯了什麼</h3><table><thead><tr><th>問題</th><th>發生什麼事</th></tr></thead><tbody><tr><td><strong>桶邊界</strong></td><td>初稿在邊界計算中有差一錯誤</td></tr><tr><td><strong>多欄位相關性</strong></td><td>最初假設獨立（對於相關欄位錯誤）</td></tr><tr><td><strong>NULL 處理</strong></td><td>忘記單獨追蹤 NULL 分數</td></tr><tr><td><strong>增量更新</strong></td><td>建議對任何變更進行完整重建（太昂貴）</td></tr></tbody></table><p><strong>模式：</strong> AI 處理教科書案例良好。邊界情況（邊界、NULL、增量）需要手動精煉。</p><hr /><h3 id="範例：除錯選擇性">範例：除錯選擇性</h3><p><strong>我問 AI 的問題：</strong></p><blockquote><p>“直方圖顯示 20% 的列有 balance &gt; 100，但最佳化器估計 5%。為什麼？”</p></blockquote><p><strong>我學到的：</strong></p><ol><li>MCV 串列沒有被首先檢查</li><li>缺少非 MCV 分數調整</li><li>桶內範圍估計顛倒了</li></ol><p><strong>結果：</strong> 修復選擇性估計：</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">estimate_selectivity</span><span 
class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> op<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">BinaryOperator</span><span class="token punctuation">,</span> value<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Value</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">f64</span> <span class="token punctuation">&#123;</span>    <span class="token comment">// Check MCV first</span>    <span class="token keyword">for</span> <span class="token punctuation">(</span>mcv_val<span class="token punctuation">,</span> frequency<span class="token punctuation">)</span> <span class="token keyword">in</span> <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">.</span>mcv_list<span class="token punctuation">.</span>values <span class="token punctuation">&#123;</span>        <span class="token keyword">if</span> mcv_val <span class="token operator">==</span> value <span class="token punctuation">&#123;</span>            <span class="token keyword">match</span> op <span class="token punctuation">&#123;</span>                <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Eq</span> <span class="token operator">=></span> <span class="token keyword">return</span> <span class="token operator">*</span>frequency<span class="token punctuation">,</span>                <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Neq</span> <span class="token operator">=></span> <span class="token keyword">return</span> <span class="token number">1.0</span> <span class="token operator">-</span> frequency<span class="token 
punctuation">,</span>                _ <span class="token operator">=></span> <span class="token keyword">break</span><span class="token punctuation">,</span>  <span class="token comment">// Range ops fall through to the histogram estimate below</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">// Use histogram for non-MCV values</span>    <span class="token keyword">let</span> histogram_sel <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>histogram<span class="token punctuation">.</span><span class="token function">estimate_gt</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token keyword">let</span> non_mcv_fraction <span class="token operator">=</span> <span class="token number">1.0</span> <span class="token operator">-</span> <span class="token keyword">self</span><span class="token punctuation">.</span>mcv_list<span class="token punctuation">.</span>total_mcv_frequency<span class="token punctuation">;</span>    histogram_sel <span class="token operator">*</span> non_mcv_fraction  <span class="token comment">// ← Was missing this!</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="總結：直方圖統計一張圖">總結：直方圖統計一張圖</h2><pre class="language-MERMAID_BASE64_629" data-language="MERMAID_BASE64_629"><code 
class="language-MERMAID_BASE64_629">Zmxvd2NoYXJ0IFRECiAgICBzdWJncmFwaCAiRGF0YSBDb2xsZWN0aW9uIgogICAgICAgIEFbVGFibGUgRGF0YV0gLS0+IEJ7U2FtcGxlIG9yIEZ1bGw&#x2F;fQogICAgICAgIEIgLS0+fFNtYWxsIHRhYmxlfCBDW0Z1bGwgU2Nhbl0KICAgICAgICBCIC0tPnxMYXJnZSB0YWJsZXwgRFtSZXNlcnZvaXIgU2FtcGxpbmddCiAgICBlbmQKICAgIAogICAgc3ViZ3JhcGggIkhpc3RvZ3JhbSBCdWlsZGluZyIKICAgICAgICBDIC0tPiBFW1NvcnQgVmFsdWVzXQogICAgICAgIEQgLS0+IEUKICAgICAgICBFIC0tPiBGe0J1aWxkIFR5cGU&#x2F;fQogICAgICAgIEYgLS0+fFVuaWZvcm0gZGF0YXwgR1tFcXVpLVdpZHRoXQogICAgICAgIEYgLS0+fFNrZXdlZCBkYXRhfCBIW0VxdWktRGVwdGhdCiAgICAgICAgRiAtLT58V2l0aCBkdXBsaWNhdGVzfCBJW0NvbXByZXNzZWQgKyBNQ1ZdCiAgICBlbmQKICAgIAogICAgc3ViZ3JhcGggIk11bHRpLUNvbHVtbiIKICAgICAgICBKW0NvcnJlbGF0ZWQgQ29sdW1uc10gLS0+IEtbTXVsdGktRGltZW5zaW9uYWwgQnVja2V0c10KICAgIGVuZAogICAgCiAgICBzdWJncmFwaCAiVXNhZ2UgaW4gT3B0aW1pemVyIgogICAgICAgIEggLS0+IExbU2VsZWN0aXZpdHkgRXN0aW1hdGlvbl0KICAgICAgICBJIC0tPiBMCiAgICAgICAgSyAtLT4gTAogICAgICAgIEwgLS0+IE1bQ29zdCBDYWxjdWxhdGlvbl0KICAgICAgICBNIC0tPiBOW1BsYW4gU2VsZWN0aW9uXQogICAgZW5kCiAgICAKICAgIHN1YmdyYXBoICJNYWludGVuYW5jZSIKICAgICAgICBPW0lOU0VSVC9VUERBVEUvREVMRVRFXSAtLT4gUFtUcmFjayBNb2RpZmljYXRpb25zXQogICAgICAgIFAgLS0+IFF7VGhyZXNob2xkIFJlYWNoZWQ&#x2F;fQogICAgICAgIFEgLS0+fFllc3wgUltBdXRvLUFOQUxZWkVdCiAgICAgICAgUSAtLT58Tm98IFNbSW5jcmVtZW50YWwgVXBkYXRlXQogICAgICAgIFIgLS0+IEUKICAgIGVuZAogICAgCiAgICBzdHlsZSBBIGZpbGw6I2UzZjJmZCxzdHJva2U6IzE5NzZkMgogICAgc3R5bGUgTiBmaWxsOiNlOGY1ZTksc3Ryb2tlOiMzODhlM2MKICAgIHN0eWxlIEwgZmlsbDojZmZmM2UwLHN0cm9rZTojZjU3YzAw</code></pre><p><strong>關鍵要點：</strong></p><table><thead><tr><th>概念</th><th>為什麼重要</th></tr></thead><tbody><tr><td><strong>等深直方圖</strong></td><td>對於偏斜資料比等寬更好</td></tr><tr><td><strong>MCV 串列</strong></td><td>對於常见值準確</td></tr><tr><td><strong>取樣</strong></td><td>使 ANALYZE 對於大型表實用</td></tr><tr><td><strong>多欄位統計</strong></td><td>捕捉欄位相關性</td></tr><tr><td><strong>增量更新</strong></td><td>避免對小變更進行完整重建</td></tr><tr><td><strong>執行緒安全存取</strong></td><td>允許更新期間併發讀取</td></tr></tbody></table><hr 
/><p><strong>進一步閱讀：</strong></p><ul><li>PostgreSQL Source: <a href="https://github.com/postgres/postgres/blob/master/src/backend/commands/analyze.c"><code>src/backend/commands/analyze.c</code></a></li><li>“Random Sampling for Histogram Construction” by Vitter et al.</li><li>“Selectivity Estimation for Range Queries” by Ioannidis et al.</li><li>PostgreSQL Documentation: “Statistics Used by the Planner”</li><li>Vaultgres Repository: <a href="https://github.com/neoalienson/Vaultgres">github.com/neoalienson/Vaultgres</a></li></ul>]]></content>
    
    
    <summary type="html">Vaultgres 旅程第八部分：深入探討直方圖統計。建構等深直方圖、處理偏斜資料、多欄位統計，以及在基於成本的最佳化中使用直方圖進行準確的選擇性估計。</summary>
    
    
    
    <category term="Development" scheme="https://neo01.com/categories/Development/"/>
    
    
    <category term="PostgreSQL" scheme="https://neo01.com/tags/PostgreSQL/"/>
    
    <category term="Database Internals" scheme="https://neo01.com/tags/Database-Internals/"/>
    
    <category term="Rust" scheme="https://neo01.com/tags/Rust/"/>
    
  </entry>
  
  <entry>
    <title>使用 Rust 构建 PostgreSQL 兼容数据库：带统计数据的基于成本的查询优化器</title>
    <link href="https://neo01.com/zh-CN/2026/03/Database-Rust-Query-Optimizer/"/>
    <id>https://neo01.com/zh-CN/2026/03/Database-Rust-Query-Optimizer/</id>
    <published>2026-03-06T16:00:00.000Z</published>
    <updated>2026-03-14T03:05:24.591Z</updated>
    
    <content type="html"><![CDATA[<p>在 <a href="/zh-CN/2026/03/Database-Rust-SQL-Parser/">第六部分</a> 中，我们构建了一个产生 AST 的 SQL 解析器。但有个问题。</p><p><strong>相同的查询可以用许多不同的方式执行：</strong></p><pre class="language-sql" data-language="sql"><code class="language-sql"><span class="token keyword">SELECT</span> u<span class="token punctuation">.</span>name<span class="token punctuation">,</span> o<span class="token punctuation">.</span>total<span class="token keyword">FROM</span> users u<span class="token keyword">JOIN</span> orders o <span class="token keyword">ON</span> u<span class="token punctuation">.</span>id <span class="token operator">=</span> o<span class="token punctuation">.</span>user_id<span class="token keyword">WHERE</span> u<span class="token punctuation">.</span>balance <span class="token operator">></span> <span class="token number">100</span></code></pre><p><strong>可能的执行计划：</strong></p><pre class="language-none"><code class="language-none">Plan A:                          Plan B:                          Plan C:1. Scan users                    1. Scan orders                   1. Scan users (balance &gt; 100)2. Filter (balance &gt; 100)        2. Filter (exists in users)      2. Index lookup on orders3. Scan orders                   3. Scan users                    3. Hash join4. Hash join                     4. Nested loop join              4. Sort by name5. Sort                          5. SortCost: 1500                       Cost: 800                        Cost: 200 ← 最佳！</code></pre><p><strong>我们如何自动找到计划 C？</strong></p><p>今天：在 Rust 中构建基于成本的查询优化器——带统计数据、成本模型和用于连接顺序的动态规划。</p><hr /><h2 id="1-逻辑-vs-物理计划">1 逻辑 vs. 
物理计划</h2><h3 id="两阶段方法">两阶段方法</h3><pre class="language-none"><code class="language-none">┌─────────────────────────────────────────────────────────────┐│              Query Optimization Pipeline                     │├─────────────────────────────────────────────────────────────┤│                                                              ││  SQL AST                                                     ││     │                                                        ││     ▼                                                        ││  ┌──────────────────────────────────────────────────────┐   ││  │  Logical Plan (WHAT to compute)                      │   ││  │  - Logical Scan: users                               │   ││  │  - Logical Filter: balance &gt; 100                     │   ││  │  - Logical Hash Join: u.id &#x3D; o.user_id               │   ││  └──────────────────────────────────────────────────────┘   ││     │                                                        ││     ▼ Optimization                                           ││                                                              ││  ┌──────────────────────────────────────────────────────┐   ││  │  Physical Plan (HOW to compute)                      │   ││  │  - Index Scan: users (balance &gt; 100)                 │   ││  │  - Index Scan: orders (user_id index)                │   ││  │  - Nested Loop Join                                  │   ││  └──────────────────────────────────────────────────────┘   ││                                                              │└─────────────────────────────────────────────────────────────┘</code></pre><hr /><h3 id="逻辑计划算子">逻辑计划算子</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/optimizer/logical_plan.rs</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token 
type-definition class-name">LogicalPlan</span> <span class="token punctuation">&#123;</span>    <span class="token comment">/// Scan a table</span>    <span class="token class-name">TableScan</span> <span class="token punctuation">&#123;</span>        table<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span>        alias<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token operator">></span><span class="token punctuation">,</span>        columns<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token operator">></span><span class="token punctuation">,</span>        projection<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">usize</span><span class="token operator">></span><span class="token punctuation">,</span>  <span class="token comment">// Column indices</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">/// Filter rows</span>    <span class="token class-name">Filter</span> <span class="token punctuation">&#123;</span>        input<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">LogicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>        predicate<span class="token punctuation">:</span> <span class="token class-name">Expression</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">/// Project 
columns/expressions</span>    <span class="token class-name">Projection</span> <span class="token punctuation">&#123;</span>        input<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">LogicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>        expressions<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">/// Join two relations</span>    <span class="token class-name">Join</span> <span class="token punctuation">&#123;</span>        left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">LogicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>        right<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">LogicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>        condition<span class="token punctuation">:</span> <span class="token class-name">JoinCondition</span><span class="token punctuation">,</span>        join_type<span class="token punctuation">:</span> <span class="token class-name">JoinType</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">/// Aggregate (GROUP BY)</span>    <span class="token class-name">Aggregate</span> <span class="token punctuation">&#123;</span>        input<span class="token 
punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">LogicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>        group_by<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>        aggregates<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">AggregateFunction</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">/// Sort (ORDER BY)</span>    <span class="token class-name">Sort</span> <span class="token punctuation">&#123;</span>        input<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">LogicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>        order_by<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">SortKey</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">/// Limit</span>    <span class="token class-name">Limit</span> <span class="token punctuation">&#123;</span>        input<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">LogicalPlan</span><span class="token operator">></span><span 
class="token punctuation">,</span>        limit<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>        offset<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">/// Distinct</span>    <span class="token class-name">Distinct</span> <span class="token punctuation">&#123;</span>        input<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">LogicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">JoinCondition</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">On</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">Using</span><span class="token punctuation">(</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token operator">></span><span class="token punctuation">)</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition 
class-name">JoinType</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">Inner</span><span class="token punctuation">,</span>    <span class="token class-name">LeftOuter</span><span class="token punctuation">,</span>    <span class="token class-name">RightOuter</span><span class="token punctuation">,</span>    <span class="token class-name">FullOuter</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">SortKey</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> expression<span class="token punctuation">:</span> <span class="token class-name">Expression</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> ascending<span class="token punctuation">:</span> <span class="token keyword">bool</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> nulls_first<span class="token punctuation">:</span> <span class="token keyword">bool</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">AggregateFunction</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> func<span class="token punctuation">:</span> <span class="token class-name">AggregateFunc</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> argument<span class="token punctuation">:</span> <span class="token class-name">Expression</span><span class="token punctuation">,</span>    <span class="token 
keyword">pub</span> alias<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">AggregateFunc</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">Count</span><span class="token punctuation">,</span>    <span class="token class-name">Sum</span><span class="token punctuation">,</span>    <span class="token class-name">Avg</span><span class="token punctuation">,</span>    <span class="token class-name">Min</span><span class="token punctuation">,</span>    <span class="token class-name">Max</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="物理计划算子">Physical Plan Operators</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/optimizer/physical_plan.rs</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">PhysicalPlan</span> <span class="token punctuation">&#123;</span>    <span class="token comment">/// Full table scan</span>    <span class="token class-name">SeqScan</span> <span class="token punctuation">&#123;</span>        table<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span>        alias<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token 
class-name">String</span><span class="token operator">></span><span class="token punctuation">,</span>        columns<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token operator">></span><span class="token punctuation">,</span>        filter<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">/// Index scan</span>    <span class="token class-name">IndexScan</span> <span class="token punctuation">&#123;</span>        table<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span>        alias<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token operator">></span><span class="token punctuation">,</span>        index<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span>        columns<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token operator">></span><span class="token punctuation">,</span>        condition<span class="token punctuation">:</span> <span class="token class-name">IndexCondition</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">/// Nested loop join</span>    <span class="token 
class-name">NestedLoopJoin</span> <span class="token punctuation">&#123;</span>        left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">PhysicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>        right<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">PhysicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>        condition<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>        join_type<span class="token punctuation">:</span> <span class="token class-name">JoinType</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">/// Hash join</span>    <span class="token class-name">HashJoin</span> <span class="token punctuation">&#123;</span>        left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">PhysicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>        right<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">PhysicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>        condition<span class="token punctuation">:</span> <span class="token class-name">Expression</span><span class="token punctuation">,</span>        join_type<span class="token punctuation">:</span> <span class="token 
class-name">JoinType</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">/// Merge join (requires sorted input)</span>    <span class="token class-name">MergeJoin</span> <span class="token punctuation">&#123;</span>        left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">PhysicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>        right<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">PhysicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>        condition<span class="token punctuation">:</span> <span class="token class-name">Expression</span><span class="token punctuation">,</span>        join_type<span class="token punctuation">:</span> <span class="token class-name">JoinType</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">/// Sort</span>    <span class="token class-name">Sort</span> <span class="token punctuation">&#123;</span>        input<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">PhysicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>        order_by<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">SortKey</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        
<span class="token comment">/// Hash aggregate</span>    <span class="token class-name">HashAggregate</span> <span class="token punctuation">&#123;</span>        input<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">PhysicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>        group_by<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>        aggregates<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">AggregateFunction</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">/// Stream aggregate (requires sorted input)</span>    <span class="token class-name">StreamAggregate</span> <span class="token punctuation">&#123;</span>        input<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">PhysicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>        group_by<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>        aggregates<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">AggregateFunction</span><span class="token operator">></span><span 
class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">/// Limit</span>    <span class="token class-name">Limit</span> <span class="token punctuation">&#123;</span>        input<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">PhysicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>        limit<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>        offset<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">IndexCondition</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">Eq</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">Range</span> <span class="token punctuation">&#123;</span> low<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span> high<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span> <span 
class="token punctuation">&#125;</span><span class="token punctuation">,</span>    <span class="token class-name">InList</span><span class="token punctuation">(</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">)</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="2-统计数据收集">2 Statistics Collection</h2><h3 id="为什么统计数据很重要">Why Statistics Matter</h3><p><strong>Without statistics:</strong> the optimizer cannot tell candidate plans apart; every plan looks equally expensive.</p><p><strong>With statistics:</strong> we can estimate how many rows each plan touches, and therefore estimate its cost with reasonable accuracy.</p><pre class="language-sql" data-language="sql"><code class="language-sql"><span class="token comment">-- Query</span><span class="token keyword">SELECT</span> <span class="token operator">*</span> <span class="token keyword">FROM</span> users <span class="token keyword">WHERE</span> balance <span class="token operator">></span> <span class="token number">100</span><span class="token punctuation">;</span><span class="token comment">-- Scenario A: balance is uniformly distributed 0-1000</span><span class="token comment">-- → ~90% of rows match → SeqScan is better</span><span class="token comment">-- Scenario B: balance is skewed, only 1% of rows have balance > 100</span><span class="token comment">-- → IndexScan is better</span></code></pre><hr /><h3 id="统计数据结构">Statistics Data Structures</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/optimizer/statistics.rs</span><span class="token keyword">use</span> <span class="token namespace">std<span class="token punctuation">::</span>collections<span class="token punctuation">::</span></span><span class="token class-name">HashMap</span><span class="token punctuation">;</span><span class="token attribute attr-name">#[derive(Debug, Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">TableStatistics</span> <span class="token 
punctuation">&#123;</span>    <span class="token keyword">pub</span> table_name<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> row_count<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> page_count<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> average_row_size<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> columns<span class="token punctuation">:</span> <span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token punctuation">,</span> <span class="token class-name">ColumnStatistics</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> indexes<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">IndexStatistics</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> last_analyzed<span class="token punctuation">:</span> <span class="token namespace">chrono<span class="token punctuation">::</span></span><span class="token class-name">DateTime</span><span class="token operator">&lt;</span><span class="token namespace">chrono<span class="token punctuation">::</span></span><span class="token class-name">Utc</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, 
Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">ColumnStatistics</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> column_name<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> null_fraction<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span>      <span class="token comment">// Fraction of NULL values (0.0 - 1.0)</span>    <span class="token keyword">pub</span> distinct_count<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>     <span class="token comment">// Number of distinct values</span>    <span class="token keyword">pub</span> most_common_values<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token class-name">Value</span><span class="token punctuation">,</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token operator">></span><span class="token punctuation">,</span>  <span class="token comment">// (value, frequency)</span>    <span class="token keyword">pub</span> histogram<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Histogram</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> min_value<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Value</span><span class="token operator">></span><span class="token 
punctuation">,</span>    <span class="token keyword">pub</span> max_value<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Value</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">Histogram</span> <span class="token punctuation">&#123;</span>    <span class="token comment">/// Equi-width histogram (equal bucket sizes)</span>    <span class="token class-name">EquiWidth</span> <span class="token punctuation">&#123;</span>        buckets<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Bucket</span><span class="token operator">></span><span class="token punctuation">,</span>        min<span class="token punctuation">:</span> <span class="token class-name">Value</span><span class="token punctuation">,</span>        max<span class="token punctuation">:</span> <span class="token class-name">Value</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>    <span class="token comment">/// Equi-depth histogram (equal rows per bucket)</span>    <span class="token class-name">EquiDepth</span> <span class="token punctuation">&#123;</span>        buckets<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Bucket</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span><span class="token 
punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Bucket</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> lower_bound<span class="token punctuation">:</span> <span class="token class-name">Value</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> upper_bound<span class="token punctuation">:</span> <span class="token class-name">Value</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> row_count<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> distinct_count<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">IndexStatistics</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> index_name<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> columns<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> is_unique<span class="token punctuation">:</span> <span class="token keyword">bool</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> is_primary<span 
class="token punctuation">:</span> <span class="token keyword">bool</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> leaf_pages<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> distinct_keys<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> average_leaf_per_key<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="收集统计数据">Collecting Statistics</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/optimizer/analyzer.rs</span><span class="token keyword">use</span> <span class="token namespace">std<span class="token punctuation">::</span>collections<span class="token punctuation">::</span></span><span class="token class-name">HashMap</span><span class="token punctuation">;</span><span class="token keyword">use</span> <span class="token namespace">std<span class="token punctuation">::</span>sync<span class="token punctuation">::</span></span><span class="token class-name">Arc</span><span class="token punctuation">;</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">StatisticsAnalyzer</span> <span class="token punctuation">&#123;</span>    buffer_pool<span class="token punctuation">:</span> <span class="token class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">BufferPool</span><span class="token operator">></span><span class="token punctuation">,</span>    storage<span class="token punctuation">:</span> <span class="token class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">StorageEngine</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">StatisticsAnalyzer</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">analyze_table</span><span class="token punctuation">(</span><span class="token 
operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> table_name<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">TableStatistics</span><span class="token punctuation">,</span> <span class="token class-name">AnalyzerError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> stats <span class="token operator">=</span> <span class="token class-name">TableStatistics</span> <span class="token punctuation">&#123;</span>            table_name<span class="token punctuation">:</span> table_name<span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            row_count<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>            page_count<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>            average_row_size<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>            columns<span class="token punctuation">:</span> <span class="token class-name">HashMap</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            indexes<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token 
punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            last_analyzed<span class="token punctuation">:</span> <span class="token namespace">chrono<span class="token punctuation">::</span></span><span class="token class-name">Utc</span><span class="token punctuation">::</span><span class="token function">now</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token comment">// Scan all pages to collect statistics</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> total_size <span class="token operator">=</span> <span class="token number">0</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> column_values<span class="token punctuation">:</span> <span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token punctuation">,</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Value</span><span class="token operator">>></span> <span class="token operator">=</span> <span class="token class-name">HashMap</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> page_id <span class="token keyword">in</span> <span class="token keyword">self</span><span class="token punctuation">.</span>storage<span class="token punctuation">.</span><span class="token function">get_table_pages</span><span 
class="token punctuation">(</span>table_name<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> page <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>buffer_pool<span class="token punctuation">.</span><span class="token function">get_page</span><span class="token punctuation">(</span>page_id<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>            stats<span class="token punctuation">.</span>page_count <span class="token operator">+=</span> <span class="token number">1</span><span class="token punctuation">;</span>            <span class="token keyword">for</span> row <span class="token keyword">in</span> page<span class="token punctuation">.</span><span class="token function">rows</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                stats<span class="token punctuation">.</span>row_count <span class="token operator">+=</span> <span class="token number">1</span><span class="token punctuation">;</span>                total_size <span class="token operator">+=</span> row<span class="token punctuation">.</span><span class="token function">size</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// Collect column values</span>                <span class="token keyword">for</span> <span class="token punctuation">(</span>col_name<span class="token punctuation">,</span> value<span class="token punctuation">)</span> <span class="token keyword">in</span> row<span class="token punctuation">.</span><span class="token function">columns</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token 
punctuation">&#123;</span>                    column_values                        <span class="token punctuation">.</span><span class="token function">entry</span><span class="token punctuation">(</span>col_name<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>                        <span class="token punctuation">.</span><span class="token function">or_insert_with</span><span class="token punctuation">(</span><span class="token class-name">Vec</span><span class="token punctuation">::</span>new<span class="token punctuation">)</span>                        <span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span>value<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token punctuation">&#125;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>        <span class="token keyword">if</span> stats<span class="token punctuation">.</span>row_count <span class="token operator">></span> <span class="token number">0</span> <span class="token punctuation">&#123;</span>            stats<span class="token punctuation">.</span>average_row_size <span class="token operator">=</span> total_size <span class="token operator">/</span> stats<span class="token punctuation">.</span>row_count <span class="token keyword">as</span> <span class="token keyword">usize</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token comment">// Compute column statistics</span>        <span class="token keyword">for</span> <span class="token 
punctuation">(</span>col_name<span class="token punctuation">,</span> values<span class="token punctuation">)</span> <span class="token keyword">in</span> column_values <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> col_stats <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">compute_column_statistics</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>col_name<span class="token punctuation">,</span> <span class="token operator">&amp;</span>values<span class="token punctuation">)</span><span class="token punctuation">;</span>            stats<span class="token punctuation">.</span>columns<span class="token punctuation">.</span><span class="token function">insert</span><span class="token punctuation">(</span>col_name<span class="token punctuation">,</span> col_stats<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token comment">// Collect index statistics</span>        <span class="token keyword">for</span> index <span class="token keyword">in</span> <span class="token keyword">self</span><span class="token punctuation">.</span>storage<span class="token punctuation">.</span><span class="token function">get_table_indexes</span><span class="token punctuation">(</span>table_name<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> index_stats <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">analyze_index</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>index<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>     
       stats<span class="token punctuation">.</span>indexes<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span>index_stats<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token comment">// Store statistics in system catalog</span>        <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">store_statistics</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>stats<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span>stats<span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">compute_column_statistics</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> col_name<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">,</span> values<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Value</span><span class="token punctuation">]</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">ColumnStatistics</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> null_count <span class="token operator">=</span> values<span class="token punctuation">.</span><span class="token function">iter</span><span class="token 
punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">filter</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>v<span class="token closure-punctuation punctuation">|</span></span> v<span class="token punctuation">.</span><span class="token function">is_null</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">count</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> null_fraction <span class="token operator">=</span> null_count <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">/</span> values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> non_null_values<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span>_<span class="token operator">></span> <span class="token operator">=</span> values<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">filter</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>v<span class="token closure-punctuation punctuation">|</span></span> <span class="token 
!</span>v<span class=">
operator">!</span>v<span class="token punctuation">.</span><span class="token function">is_null</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">cloned</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token keyword">let</span> distinct_count <span class="token operator">=</span> non_null_values<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">::</span><span class="token operator">&lt;</span><span class="token namespace">std<span class="token punctuation">::</span>collections<span class="token punctuation">::</span></span><span class="token class-name">HashSet</span><span class="token operator">&lt;</span>_<span class="token operator">>></span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">;</span>
        <span class="token comment">// Compute most common values</span>
        <span class="token keyword">let</span> <span class="token keyword">mut</span> value_counts<span class="token punctuation">:</span> <span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token operator">&amp;</span><span class="token class-name">Value</span><span class="token punctuation">,</span> <span class="token keyword">usize</span><span class="token operator">></span> <span class="token 
operator">=</span> <span class="token class-name">HashMap</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> value <span class="token keyword">in</span> <span class="token operator">&amp;</span>non_null_values <span class="token punctuation">&#123;</span>            <span class="token operator">*</span>value_counts<span class="token punctuation">.</span><span class="token function">entry</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">or_insert</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span> <span class="token operator">+=</span> <span class="token number">1</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> most_common<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span>_<span class="token operator">></span> <span class="token operator">=</span> value_counts<span class="token punctuation">.</span><span class="token function">into_iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        most_common<span class="token punctuation">.</span><span class="token function">sort_by</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>a<span class="token 
punctuation">,</span> b<span class="token closure-punctuation punctuation">|</span></span> b<span class="token number">.1</span><span class="token punctuation">.</span><span class="token function">cmp</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>a<span class="token number">.1</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> mcv<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token class-name">Value</span><span class="token punctuation">,</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token operator">></span> <span class="token operator">=</span> most_common            <span class="token punctuation">.</span><span class="token function">into_iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span>            <span class="token punctuation">.</span><span class="token function">take</span><span class="token punctuation">(</span><span class="token number">10</span><span class="token punctuation">)</span>  <span class="token comment">// Keep top 10</span>            <span class="token punctuation">.</span><span class="token function">map</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span><span class="token punctuation">(</span>v<span class="token punctuation">,</span> c<span class="token punctuation">)</span><span class="token closure-punctuation punctuation">|</span></span> <span class="token punctuation">(</span>v<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token 
punctuation">,</span> c <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">/</span> non_null_values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">)</span>            <span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token comment">// Compute histogram</span>        <span class="token keyword">let</span> histogram <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">compute_histogram</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>non_null_values<span class="token punctuation">,</span> distinct_count<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token comment">// Min/Max</span>        <span class="token keyword">let</span> <span class="token punctuation">(</span>min<span class="token punctuation">,</span> max<span class="token punctuation">)</span> <span class="token operator">=</span> <span class="token keyword">if</span> non_null_values<span class="token punctuation">.</span><span class="token function">is_empty</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token punctuation">(</span><span class="token class-name">None</span><span class="token punctuation">,</span> <span class="token class-name">None</span><span class="token punctuation">)</span>        <span class="token 
punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> sorted <span class="token operator">=</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> non_null_values<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            sorted<span class="token punctuation">.</span><span class="token function">sort</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span>sorted<span class="token punctuation">.</span><span class="token function">first</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">unwrap</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Some</span><span class="token punctuation">(</span>sorted<span class="token punctuation">.</span><span class="token function">last</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">unwrap</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token 
punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token class-name">ColumnStatistics</span> <span class="token punctuation">&#123;</span>            column_name<span class="token punctuation">:</span> col_name<span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            null_fraction<span class="token punctuation">,</span>            distinct_count<span class="token punctuation">,</span>            most_common_values<span class="token punctuation">:</span> mcv<span class="token punctuation">,</span>            histogram<span class="token punctuation">,</span>            min_value<span class="token punctuation">:</span> min<span class="token punctuation">,</span>            max_value<span class="token punctuation">:</span> max<span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">compute_histogram</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> values<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Value</span><span class="token punctuation">]</span><span class="token punctuation">,</span> distinct_count<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span 
class="token class-name">Histogram</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">const</span> <span class="token constant">NUM_BUCKETS</span><span class="token punctuation">:</span> <span class="token keyword">usize</span> <span class="token operator">=</span> <span class="token number">100</span><span class="token punctuation">;</span>        <span class="token keyword">if</span> values<span class="token punctuation">.</span><span class="token function">is_empty</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">return</span> <span class="token class-name">None</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token comment">// Use equi-depth histogram for better selectivity estimation</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> sorted <span class="token operator">=</span> values<span class="token punctuation">.</span><span class="token function">to_vec</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        sorted<span class="token punctuation">.</span><span class="token function">sort</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> bucket_size <span class="token operator">=</span> sorted<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">/</span> <span class="token constant">NUM_BUCKETS</span><span class="token punctuation">;</span>        <span class="token keyword">if</span> bucket_size <span class="token 
operator">==</span> <span class="token number">0</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">return</span> <span class="token class-name">None</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> buckets <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> i <span class="token keyword">in</span> <span class="token number">0</span><span class="token punctuation">..</span><span class="token constant">NUM_BUCKETS</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> start <span class="token operator">=</span> i <span class="token operator">*</span> bucket_size<span class="token punctuation">;</span>            <span class="token keyword">let</span> end <span class="token operator">=</span> <span class="token keyword">if</span> i <span class="token operator">==</span> <span class="token constant">NUM_BUCKETS</span> <span class="token operator">-</span> <span class="token number">1</span> <span class="token punctuation">&#123;</span> sorted<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span> <span class="token punctuation">(</span>i <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">)</span> <span class="token operator">*</span> bucket_size <span class="token punctuation">&#125;</span><span 
class="token punctuation">;</span>            buckets<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token class-name">Bucket</span> <span class="token punctuation">&#123;</span>                lower_bound<span class="token punctuation">:</span> sorted<span class="token punctuation">[</span>start<span class="token punctuation">]</span><span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                upper_bound<span class="token punctuation">:</span> sorted<span class="token punctuation">[</span>end <span class="token operator">-</span> <span class="token number">1</span><span class="token punctuation">]</span><span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                row_count<span class="token punctuation">:</span> <span class="token punctuation">(</span>end <span class="token operator">-</span> start<span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>                distinct_count<span class="token punctuation">:</span> distinct_count <span class="token operator">/</span> <span class="token constant">NUM_BUCKETS</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Histogram</span><span class="token 
::</span><span class="token class-name">EquiDepth">
punctuation">::</span><span class="token class-name">EquiDepth</span> <span class="token punctuation">&#123;</span> buckets <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>
    <span class="token punctuation">&#125;</span>
<span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="使用统计数据：ANALYZE-命令">Using Statistics: The ANALYZE Command</h3><pre class="language-sql" data-language="sql"><code class="language-sql"><span class="token comment">-- Analyze a single table</span>
<span class="token keyword">ANALYZE</span> users<span class="token punctuation">;</span>

<span class="token comment">-- Analyze specific columns</span>
<span class="token keyword">ANALYZE</span> users <span class="token punctuation">(</span>id<span class="token punctuation">,</span> balance<span class="token punctuation">,</span> created_at<span class="token punctuation">)</span><span class="token punctuation">;</span>

<span class="token comment">-- Analyze all tables</span>
<span class="token keyword">ANALYZE</span><span class="token punctuation">;</span>

<span class="token comment">-- Configure sampling (for large tables)</span>
<span class="token keyword">ANALYZE</span> users <span class="token keyword">WITH</span> SAMPLE <span class="token number">0.1</span><span class="token punctuation">;</span>  <span class="token comment">-- 10% sample</span></code></pre><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/sql_parser/ast.rs (extended)</span>
<span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span>
<span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">Statement</span> <span class="token punctuation">&#123;</span>
    <span class="token comment">// ... 
existing statements ...</span>    <span class="token class-name">Analyze</span><span class="token punctuation">(</span><span class="token class-name">AnalyzeStatement</span><span class="token punctuation">)</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">AnalyzeStatement</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> table<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">ObjectName</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> columns<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Ident</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> options<span class="token punctuation">:</span> <span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token punctuation">,</span> <span class="token class-name">Value</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token comment">// src/optimizer/analyzer.rs</span><span class="token keyword">impl</span> <span class="token class-name">Database</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">analyze</span><span class="token punctuation">(</span><span class="token 
operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> statement<span class="token punctuation">:</span> <span class="token class-name">AnalyzeStatement</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">AnalyzerError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">if</span> <span class="token keyword">let</span> <span class="token class-name">Some</span><span class="token punctuation">(</span>table<span class="token punctuation">)</span> <span class="token operator">=</span> statement<span class="token punctuation">.</span>table <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> stats <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>analyzer<span class="token punctuation">.</span><span class="token function">analyze_table</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>table<span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>            <span class="token macro property">println!</span><span class="token punctuation">(</span><span class="token string">"Analyzed table &#123;&#125;: &#123;&#125; rows, &#123;&#125; pages"</span><span class="token punctuation">,</span>                      table<span class="token punctuation">,</span> stats<span class="token punctuation">.</span>row_count<span 
class="token punctuation">,</span> stats<span class="token punctuation">.</span>page_count<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token comment">// Analyze all tables</span>            <span class="token keyword">for</span> table <span class="token keyword">in</span> <span class="token keyword">self</span><span class="token punctuation">.</span>catalog<span class="token punctuation">.</span><span class="token function">get_all_tables</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">let</span> stats <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>analyzer<span class="token punctuation">.</span><span class="token function">analyze_table</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>table<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token macro property">println!</span><span class="token punctuation">(</span><span class="token string">"Analyzed table &#123;&#125;: &#123;&#125; rows"</span><span class="token punctuation">,</span> table<span class="token punctuation">,</span> stats<span class="token punctuation">.</span>row_count<span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>    <span 
class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="3-成本模型">3 Cost Model</h2><h3 id="成本公式">Cost Formula</h3><pre class="language-none"><code class="language-none">Total Cost &#x3D; CPU Cost + I&#x2F;O Cost + Memory Cost

Where:
- CPU Cost: Operations per row × number of rows
- I&#x2F;O Cost: Pages read&#x2F;written × page cost
- Memory Cost: Sort&#x2F;hash memory × memory cost factor</code></pre><hr /><h3 id="算子成本模型">Operator Cost Model</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/optimizer/cost_model.rs</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">CostModel</span> <span class="token punctuation">&#123;</span>    <span class="token comment">// Cost constants (tunable)</span>    <span class="token keyword">pub</span> seq_page_cost<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span>      <span class="token comment">// Cost of sequential page read</span>    <span class="token keyword">pub</span> random_page_cost<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span>   <span class="token comment">// Cost of random page read</span>    <span class="token keyword">pub</span> cpu_tuple_cost<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span>     <span class="token comment">// CPU cost per tuple</span>    <span class="token keyword">pub</span> cpu_index_tuple_cost<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span>  <span class="token comment">// CPU cost per index tuple</span>    <span class="token keyword">pub</span> cpu_operator_cost<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token 
punctuation">,</span>  <span class="token comment">// CPU cost per operator evaluation</span>    <span class="token keyword">pub</span> memory_cost_per_kb<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span> <span class="token comment">// Memory cost per KB</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">Default</span> <span class="token keyword">for</span> <span class="token class-name">CostModel</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">default</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>            seq_page_cost<span class="token punctuation">:</span> <span class="token number">1.0</span><span class="token punctuation">,</span>            random_page_cost<span class="token punctuation">:</span> <span class="token number">4.0</span><span class="token punctuation">,</span>  <span class="token comment">// Random I/O is ~4x slower</span>            cpu_tuple_cost<span class="token punctuation">:</span> <span class="token number">0.01</span><span class="token punctuation">,</span>            cpu_index_tuple_cost<span class="token punctuation">:</span> <span class="token number">0.005</span><span class="token punctuation">,</span>            cpu_operator_cost<span class="token punctuation">:</span> <span class="token number">0.0025</span><span class="token punctuation">,</span>            memory_cost_per_kb<span class="token punctuation">:</span> <span class="token number">0.001</span><span class="token punctuation">,</span>        <span class="token 
punctuation">&#125;</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">CostModel</span> <span class="token punctuation">&#123;</span>    <span class="token comment">/// Cost of sequential scan</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">seq_scan_cost</span><span class="token punctuation">(</span>        <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span>        num_pages<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>        num_rows<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>        filter<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token operator">&amp;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Cost</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> io_cost <span class="token operator">=</span> num_pages <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>seq_page_cost<span class="token punctuation">;</span>        <span class="token keyword">let</span> cpu_cost <span class="token operator">=</span> num_rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> <span class="token 
keyword">self</span><span class="token punctuation">.</span>cpu_tuple_cost<span class="token punctuation">;</span>                <span class="token keyword">let</span> filter_cost <span class="token operator">=</span> <span class="token keyword">if</span> <span class="token keyword">let</span> <span class="token class-name">Some</span><span class="token punctuation">(</span>_filter<span class="token punctuation">)</span> <span class="token operator">=</span> filter <span class="token punctuation">&#123;</span>            num_rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>cpu_operator_cost        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token number">0.0</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token class-name">Cost</span> <span class="token punctuation">&#123;</span>            startup<span class="token punctuation">:</span> <span class="token number">0.0</span><span class="token punctuation">,</span>            total<span class="token punctuation">:</span> io_cost <span class="token operator">+</span> cpu_cost <span class="token operator">+</span> filter_cost<span class="token punctuation">,</span>            rows<span class="token punctuation">:</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">estimate_rows_after_filter</span><span class="token punctuation">(</span>num_rows<span class="token punctuation">,</span> filter<span class="token punctuation">)</span><span class="token punctuation">,</span>            width<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>  <span class="token 
comment">// Would be computed from schema</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">/// Cost of index scan</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">index_scan_cost</span><span class="token punctuation">(</span>        <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span>        index<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">IndexStatistics</span><span class="token punctuation">,</span>        table_pages<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>        condition<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">IndexCondition</span><span class="token punctuation">,</span>        num_rows<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>    <span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Cost</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Estimate how many index pages we need to read</span>        <span class="token keyword">let</span> index_selectivity <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">estimate_index_selectivity</span><span class="token punctuation">(</span>condition<span class="token punctuation">,</span> index<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> index_pages_to_read <span class="token operator">=</span> <span class="token 
punctuation">(</span>index<span class="token punctuation">.</span>leaf_pages <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> index_selectivity<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">ceil</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">;</span>                <span class="token comment">// Estimate how many table pages we need to read</span>        <span class="token keyword">let</span> table_pages_to_read <span class="token operator">=</span> <span class="token keyword">if</span> index_selectivity <span class="token operator">></span> <span class="token number">0.3</span> <span class="token punctuation">&#123;</span>            <span class="token comment">// More than 30% of rows match → assume every table page is read</span>            table_pages        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token comment">// Few rows match → about one random page fetch per matching row</span>            <span class="token punctuation">(</span>num_rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> index_selectivity<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">ceil</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> io_cost <span class="token operator">=</span> index_pages_to_read <span class="token keyword">as</span> <span class="token keyword">f64</span> 
<span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>random_page_cost            <span class="token operator">+</span> table_pages_to_read <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>random_page_cost<span class="token punctuation">;</span>                <span class="token keyword">let</span> cpu_cost <span class="token operator">=</span> <span class="token punctuation">(</span>num_rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> index_selectivity<span class="token punctuation">)</span> <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>cpu_index_tuple_cost<span class="token punctuation">;</span>        <span class="token class-name">Cost</span> <span class="token punctuation">&#123;</span>            startup<span class="token punctuation">:</span> <span class="token number">0.0</span><span class="token punctuation">,</span>            total<span class="token punctuation">:</span> io_cost <span class="token operator">+</span> cpu_cost<span class="token punctuation">,</span>            rows<span class="token punctuation">:</span> <span class="token punctuation">(</span>num_rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> index_selectivity<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">ceil</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>            width<span class="token punctuation">:</span> <span class="token number">0</span><span 
class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">/// Cost of nested loop join</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">nested_loop_join_cost</span><span class="token punctuation">(</span>        <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span>        outer_cost<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Cost</span><span class="token punctuation">,</span>        inner_cost<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Cost</span><span class="token punctuation">,</span>        join_selectivity<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span>    <span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Cost</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Outer is scanned once</span>        <span class="token keyword">let</span> outer_total <span class="token operator">=</span> outer_cost<span class="token punctuation">.</span>total<span class="token punctuation">;</span>                <span class="token comment">// Inner is scanned once per outer row</span>        <span class="token keyword">let</span> inner_total <span class="token operator">=</span> inner_cost<span class="token punctuation">.</span>total <span class="token operator">*</span> outer_cost<span class="token punctuation">.</span>rows <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">;</span>                <span class="token comment">// CPU cost for join condition 
evaluation</span>        <span class="token keyword">let</span> join_cpu_cost <span class="token operator">=</span> outer_cost<span class="token punctuation">.</span>rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> inner_cost<span class="token punctuation">.</span>rows <span class="token keyword">as</span> <span class="token keyword">f64</span>             <span class="token operator">*</span> join_selectivity <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>cpu_operator_cost<span class="token punctuation">;</span>        <span class="token class-name">Cost</span> <span class="token punctuation">&#123;</span>            startup<span class="token punctuation">:</span> outer_cost<span class="token punctuation">.</span>startup<span class="token punctuation">,</span>            total<span class="token punctuation">:</span> outer_total <span class="token operator">+</span> inner_total <span class="token operator">+</span> join_cpu_cost<span class="token punctuation">,</span>            rows<span class="token punctuation">:</span> <span class="token punctuation">(</span>outer_cost<span class="token punctuation">.</span>rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> inner_cost<span class="token punctuation">.</span>rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> join_selectivity<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">ceil</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>            width<span class="token punctuation">:</span> <span class="token number">0</span><span 
class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">/// Cost of hash join</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">hash_join_cost</span><span class="token punctuation">(</span>        <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span>        left_cost<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Cost</span><span class="token punctuation">,</span>        right_cost<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Cost</span><span class="token punctuation">,</span>        join_selectivity<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span>    <span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Cost</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Build phase: scan and hash the smaller relation</span>        <span class="token keyword">let</span> build_cost <span class="token operator">=</span> left_cost<span class="token punctuation">.</span>total<span class="token punctuation">;</span>        <span class="token keyword">let</span> build_memory <span class="token operator">=</span> left_cost<span class="token punctuation">.</span>rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> <span class="token number">32.0</span><span class="token punctuation">;</span>  <span class="token comment">// Estimate 32 bytes per row</span>                <span class="token comment">// Probe phase: scan the larger relation and probe hash 
table</span>        <span class="token keyword">let</span> probe_cost <span class="token operator">=</span> right_cost<span class="token punctuation">.</span>total<span class="token punctuation">;</span>        <span class="token keyword">let</span> probe_cpu <span class="token operator">=</span> right_cost<span class="token punctuation">.</span>rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>cpu_operator_cost<span class="token punctuation">;</span>                <span class="token comment">// Output cost</span>        <span class="token keyword">let</span> output_rows <span class="token operator">=</span> left_cost<span class="token punctuation">.</span>rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> right_cost<span class="token punctuation">.</span>rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> join_selectivity<span class="token punctuation">;</span>        <span class="token keyword">let</span> output_cpu <span class="token operator">=</span> output_rows <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>cpu_tuple_cost<span class="token punctuation">;</span>        <span class="token class-name">Cost</span> <span class="token punctuation">&#123;</span>            startup<span class="token punctuation">:</span> build_cost <span class="token operator">+</span> build_memory <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>memory_cost_per_kb<span class="token punctuation">,</span>            total<span class="token punctuation">:</span> build_cost <span class="token operator">+</span> probe_cost <span class="token operator">+</span> probe_cpu 
<span class="token operator">+</span> output_cpu<span class="token punctuation">,</span>            rows<span class="token punctuation">:</span> output_rows<span class="token punctuation">.</span><span class="token function">ceil</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>            width<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">/// Cost of sort</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">sort_cost</span><span class="token punctuation">(</span>        <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span>        input_cost<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Cost</span><span class="token punctuation">,</span>        num_rows<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>        sort_keys<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">SortKey</span><span class="token punctuation">]</span><span class="token punctuation">,</span>    <span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Cost</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> input_total <span class="token operator">=</span> input_cost<span class="token punctuation">.</span>total<span class="token punctuation">;</span>               
 <span class="token comment">// Check if sort fits in memory</span>        <span class="token keyword">let</span> sort_memory <span class="token operator">=</span> num_rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> <span class="token number">64.0</span><span class="token punctuation">;</span>  <span class="token comment">// Estimate 64 bytes per row</span>        <span class="token keyword">let</span> work_mem <span class="token operator">=</span> <span class="token number">4.0</span> <span class="token operator">*</span> <span class="token number">1024.0</span> <span class="token operator">*</span> <span class="token number">1024.0</span><span class="token punctuation">;</span>  <span class="token comment">// 4 MB of work memory, in bytes</span>                <span class="token keyword">let</span> sort_cpu <span class="token operator">=</span> <span class="token keyword">if</span> sort_memory <span class="token operator">&lt;=</span> work_mem <span class="token punctuation">&#123;</span>            <span class="token comment">// In-memory sort: O(n log n); max(1) avoids log2(0)</span>            num_rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> <span class="token punctuation">(</span>num_rows<span class="token punctuation">.</span><span class="token function">max</span><span class="token punctuation">(</span><span class="token number">1</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">log2</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>cpu_operator_cost        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token comment">// External sort: 2 passes</span>            <span class="token punctuation">(</span>num_rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> <span class="token punctuation">(</span>num_rows<span class="token punctuation">.</span><span class="token function">max</span><span class="token punctuation">(</span><span class="token number">1</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token 
punctuation">.</span><span class="token function">log2</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">*</span> <span class="token number">2.0</span><span class="token punctuation">)</span> <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>cpu_operator_cost                <span class="token operator">+</span> sort_memory <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>memory_cost_per_kb        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> cpu_per_key <span class="token operator">=</span> sort_keys<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>cpu_operator_cost<span class="token punctuation">;</span>        <span class="token class-name">Cost</span> <span class="token punctuation">&#123;</span>            startup<span class="token punctuation">:</span> input_total <span class="token operator">+</span> sort_cpu <span class="token operator">+</span> num_rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> cpu_per_key<span class="token punctuation">,</span>            total<span class="token punctuation">:</span> input_total <span class="token operator">+</span> sort_cpu <span class="token operator">+</span> num_rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> cpu_per_key<span class="token punctuation">,</span>            rows<span class="token 
punctuation">:</span> num_rows<span class="token punctuation">,</span>            width<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">/// Cost of hash aggregate</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">hash_aggregate_cost</span><span class="token punctuation">(</span>        <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span>        input_cost<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Cost</span><span class="token punctuation">,</span>        num_groups<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>        num_aggregates<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>    <span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Cost</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> input_total <span class="token operator">=</span> input_cost<span class="token punctuation">.</span>total<span class="token punctuation">;</span>                <span class="token comment">// Build hash table of groups</span>        <span class="token keyword">let</span> build_memory <span class="token operator">=</span> num_groups <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> <span class="token number">64.0</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> build_cpu <span class="token operator">=</span> 
input_cost<span class="token punctuation">.</span>rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>cpu_operator_cost<span class="token punctuation">;</span>                <span class="token comment">// Aggregate computation</span>        <span class="token keyword">let</span> aggregate_cpu <span class="token operator">=</span> num_groups <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> num_aggregates <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>cpu_operator_cost<span class="token punctuation">;</span>        <span class="token class-name">Cost</span> <span class="token punctuation">&#123;</span>            startup<span class="token punctuation">:</span> input_total <span class="token operator">+</span> build_memory <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>memory_cost_per_kb<span class="token punctuation">,</span>            total<span class="token punctuation">:</span> input_total <span class="token operator">+</span> build_cpu <span class="token operator">+</span> aggregate_cpu<span class="token punctuation">,</span>            rows<span class="token punctuation">:</span> num_groups<span class="token punctuation">,</span>            width<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">estimate_rows_after_filter</span><span class="token punctuation">(</span><span class="token 
operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> num_rows<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span> filter<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token operator">&amp;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">u64</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">match</span> filter <span class="token punctuation">&#123;</span>            <span class="token class-name">None</span> <span class="token operator">=></span> num_rows<span class="token punctuation">,</span>            <span class="token class-name">Some</span><span class="token punctuation">(</span>expr<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">let</span> selectivity <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">estimate_selectivity</span><span class="token punctuation">(</span>expr<span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token punctuation">(</span>num_rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> selectivity<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">ceil</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span>            <span class="token 
punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">estimate_selectivity</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> expr<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Expression</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">f64</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Simplified selectivity estimation</span>        <span class="token comment">// In practice, this would use statistics and histograms</span>        <span class="token keyword">match</span> expr <span class="token punctuation">&#123;</span>            <span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span> op<span class="token punctuation">,</span> <span class="token punctuation">..</span> <span class="token punctuation">&#125;</span> <span class="token operator">=></span> <span class="token keyword">match</span> op <span class="token punctuation">&#123;</span>                <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Eq</span> <span class="token operator">=></span> <span class="token number">0.01</span><span class="token punctuation">,</span>  <span class="token comment">// Assume 1% match</span>                <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Lt</span> <span class="token operator">|</span> <span class="token 
class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Gt</span> <span class="token operator">=></span> <span class="token number">0.33</span><span class="token punctuation">,</span>  <span class="token comment">// Assume 1/3 match</span>                <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Lte</span> <span class="token operator">|</span> <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Gte</span> <span class="token operator">=></span> <span class="token number">0.5</span><span class="token punctuation">,</span>  <span class="token comment">// Assume 1/2 match</span>                <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">And</span> <span class="token operator">=></span> <span class="token number">0.1</span><span class="token punctuation">,</span>                <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Or</span> <span class="token operator">=></span> <span class="token number">0.5</span><span class="token punctuation">,</span>                _ <span class="token operator">=></span> <span class="token number">0.5</span><span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>            _ <span class="token operator">=></span> <span class="token number">0.5</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">estimate_index_selectivity</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span 
class="token keyword">self</span><span class="token punctuation">,</span> condition<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">IndexCondition</span><span class="token punctuation">,</span> index<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">IndexStatistics</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">f64</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">match</span> condition <span class="token punctuation">&#123;</span>            <span class="token class-name">IndexCondition</span><span class="token punctuation">::</span><span class="token class-name">Eq</span><span class="token punctuation">(</span>_<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token comment">// Equality: 1 / distinct_keys</span>                <span class="token number">1.0</span> <span class="token operator">/</span> index<span class="token punctuation">.</span>distinct_keys<span class="token punctuation">.</span><span class="token function">max</span><span class="token punctuation">(</span><span class="token number">1</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">IndexCondition</span><span class="token punctuation">::</span><span class="token class-name">Range</span> <span class="token punctuation">&#123;</span> <span class="token punctuation">..</span> <span class="token punctuation">&#125;</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token comment">// Range: estimate 10% of index</span>            
                <span class="token number">0.1</span>
            <span class="token punctuation">&#125;</span>
            <span class="token class-name">IndexCondition</span><span class="token punctuation">::</span><span class="token class-name">InList</span><span class="token punctuation">(</span>values<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>
                <span class="token comment">// IN list: |values| / distinct_keys, capped at 1.0</span>
                <span class="token punctuation">(</span>values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">/</span> index<span class="token punctuation">.</span>distinct_keys<span class="token punctuation">.</span><span class="token function">max</span><span class="token punctuation">(</span><span class="token number">1</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">min</span><span class="token punctuation">(</span><span class="token number">1.0</span><span class="token punctuation">)</span>
            <span class="token punctuation">&#125;</span>
        <span class="token punctuation">&#125;</span>
    <span class="token punctuation">&#125;</span>
<span class="token punctuation">&#125;</span>

<span class="token attribute attr-name">#[derive(Debug, Clone)]</span>
<span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Cost</span> <span class="token punctuation">&#123;</span>
    <span class="token keyword">pub</span> startup<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span>    <span class="token comment">// Cost to return first row</span>
    <span class="token keyword">pub</span> total<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span>      
<span class="token comment">// Cost to return all rows</span>
    <span class="token keyword">pub</span> rows<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>       <span class="token comment">// Estimated output rows</span>
    <span class="token keyword">pub</span> width<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>    <span class="token comment">// Estimated row width in bytes</span>
<span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="4-带动态规划的连接顺序">4 Join Ordering with Dynamic Programming</h2><h3 id="连接顺序问题">The Join Ordering Problem</h3><p><strong>For n tables, there are n! possible left-deep join orders, and more once bushy trees are counted:</strong></p><pre class="language-none"><code class="language-none">3 tables: 3! &#x3D; 6 orders
5 tables: 5! &#x3D; 120 orders
10 tables: 10! &#x3D; 3,628,800 orders</code></pre><p><strong>Brute-force enumeration quickly becomes impractical.</strong> Dynamic programming avoids it by keeping only the cheapest plan for each subset of tables and reusing it when building larger subsets.</p><hr /><h3 id="DP-连接顺序算法">DP Join Ordering Algorithm</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/optimizer/join_order.rs</span>
<span class="token keyword">use</span> <span class="token namespace">std<span class="token punctuation">::</span>collections<span class="token punctuation">::</span></span><span class="token class-name">HashMap</span><span class="token punctuation">;</span>
<span class="token keyword">use</span> <span class="token namespace">std<span class="token punctuation">::</span>sync<span class="token punctuation">::</span></span><span class="token class-name">Arc</span><span class="token punctuation">;</span>

<span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">JoinOrderOptimizer</span> <span class="token punctuation">&#123;</span>
    cost_model<span class="token punctuation">:</span> <span class="token class-name">CostModel</span><span class="token punctuation">,</span>
    statistics<span class="token punctuation">:</span> <span class="token class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">StatisticsCatalog</span><span class="token operator">></span><span class="token punctuation">,</span>
<span class="token 
punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">JoinOrderOptimizer</span> <span class="token punctuation">&#123;</span>    <span class="token comment">/// Find the best join order using dynamic programming</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">optimize</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> tables<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">String</span><span class="token punctuation">]</span><span class="token punctuation">,</span> conditions<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">JoinCondition</span><span class="token punctuation">]</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">PhysicalPlan</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> n <span class="token operator">=</span> tables<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// dp[i] = best plan for subset represented by bitmask i</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> dp<span class="token punctuation">:</span> <span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token keyword">u64</span><span class="token punctuation">,</span> <span class="token class-name">PhysicalPlan</span><span class="token 
operator">></span> <span class="token operator">=</span> <span class="token class-name">HashMap</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> costs<span class="token punctuation">:</span> <span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token keyword">u64</span><span class="token punctuation">,</span> <span class="token keyword">f64</span><span class="token operator">></span> <span class="token operator">=</span> <span class="token class-name">HashMap</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// Base case: single table scans</span>        <span class="token keyword">for</span> <span class="token punctuation">(</span>i<span class="token punctuation">,</span> table<span class="token punctuation">)</span> <span class="token keyword">in</span> tables<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">enumerate</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> mask <span class="token operator">=</span> <span class="token number">1u64</span> <span class="token operator">&lt;&lt;</span> i<span class="token punctuation">;</span>            <span class="token keyword">let</span> plan <span class="token operator">=</span> <span class="token keyword">self</span><span 
class="token punctuation">.</span><span class="token function">create_scan_plan</span><span class="token punctuation">(</span>table<span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token keyword">let</span> cost <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">estimate_cost</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>plan<span class="token punctuation">)</span><span class="token punctuation">;</span>                        dp<span class="token punctuation">.</span><span class="token function">insert</span><span class="token punctuation">(</span>mask<span class="token punctuation">,</span> plan<span class="token punctuation">)</span><span class="token punctuation">;</span>            costs<span class="token punctuation">.</span><span class="token function">insert</span><span class="token punctuation">(</span>mask<span class="token punctuation">,</span> cost<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token comment">// Build up larger subsets</span>        <span class="token keyword">for</span> size <span class="token keyword">in</span> <span class="token number">2</span><span class="token punctuation">..=</span>n <span class="token punctuation">&#123;</span>            <span class="token keyword">for</span> subset <span class="token keyword">in</span> <span class="token keyword">Self</span><span class="token punctuation">::</span><span class="token function">subsets_of_size</span><span class="token punctuation">(</span>n<span class="token punctuation">,</span> size<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">let</span> subset_mask <span class="token operator">=</span> <span 
class="token keyword">Self</span><span class="token punctuation">::</span><span class="token function">subset_to_mask</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>subset<span class="token punctuation">)</span><span class="token punctuation">;</span>                                <span class="token comment">// Try all ways to split this subset</span>                <span class="token keyword">let</span> <span class="token keyword">mut</span> best_plan<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">PhysicalPlan</span><span class="token operator">></span> <span class="token operator">=</span> <span class="token class-name">None</span><span class="token punctuation">;</span>                <span class="token keyword">let</span> <span class="token keyword">mut</span> best_cost <span class="token operator">=</span> <span class="token keyword">f64</span><span class="token punctuation">::</span><span class="token constant">INFINITY</span><span class="token punctuation">;</span>                                <span class="token keyword">for</span> split <span class="token keyword">in</span> <span class="token keyword">Self</span><span class="token punctuation">::</span><span class="token function">split_subset</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>subset<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                    <span class="token keyword">let</span> left_mask <span class="token operator">=</span> <span class="token keyword">Self</span><span class="token punctuation">::</span><span class="token function">subset_to_mask</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>split<span class="token number">.0</span><span class="token punctuation">)</span><span class="token punctuation">;</span>          
          <span class="token keyword">let</span> right_mask <span class="token operator">=</span> <span class="token keyword">Self</span><span class="token punctuation">::</span><span class="token function">subset_to_mask</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>split<span class="token number">.1</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                                        <span class="token keyword">if</span> <span class="token keyword">let</span> <span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span>left_plan<span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Some</span><span class="token punctuation">(</span>right_plan<span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token operator">=</span>                         <span class="token punctuation">(</span>dp<span class="token punctuation">.</span><span class="token function">get</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>left_mask<span class="token punctuation">)</span><span class="token punctuation">,</span> dp<span class="token punctuation">.</span><span class="token function">get</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>right_mask<span class="token punctuation">)</span><span class="token punctuation">)</span>                     <span class="token punctuation">&#123;</span>                        <span class="token comment">// Try different join algorithms</span>                        <span class="token keyword">for</span> join_plan <span class="token keyword">in</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">create_join_plans</span><span class="token punctuation">(</span>                      
      left_plan<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                            right_plan<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                            conditions<span class="token punctuation">,</span>                        <span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                            <span class="token keyword">let</span> cost <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">estimate_cost</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>join_plan<span class="token punctuation">)</span><span class="token punctuation">;</span>                            <span class="token keyword">if</span> cost <span class="token operator">&lt;</span> best_cost <span class="token punctuation">&#123;</span>                                best_cost <span class="token operator">=</span> cost<span class="token punctuation">;</span>                                best_plan <span class="token operator">=</span> <span class="token class-name">Some</span><span class="token punctuation">(</span>join_plan<span class="token punctuation">)</span><span class="token punctuation">;</span>                            <span class="token punctuation">&#125;</span>                        <span class="token punctuation">&#125;</span>                    <span class="token punctuation">&#125;</span>                <span class="token punctuation">&#125;</span>                                <span class="token keyword">if</span> <span class="token keyword">let</span> <span class="token 
class-name">Some</span><span class="token punctuation">(</span>plan<span class="token punctuation">)</span> <span class="token operator">=</span> best_plan <span class="token punctuation">&#123;</span>                    dp<span class="token punctuation">.</span><span class="token function">insert</span><span class="token punctuation">(</span>subset_mask<span class="token punctuation">,</span> plan<span class="token punctuation">)</span><span class="token punctuation">;</span>                    costs<span class="token punctuation">.</span><span class="token function">insert</span><span class="token punctuation">(</span>subset_mask<span class="token punctuation">,</span> best_cost<span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token punctuation">&#125;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>                <span class="token comment">// Return the best plan for all tables</span>        <span class="token keyword">let</span> all_mask <span class="token operator">=</span> <span class="token punctuation">(</span><span class="token number">1u64</span> <span class="token operator">&lt;&lt;</span> n<span class="token punctuation">)</span> <span class="token operator">-</span> <span class="token number">1</span><span class="token punctuation">;</span>        dp<span class="token punctuation">.</span><span class="token function">remove</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>all_mask<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">unwrap</span><span class="token punctuation">(</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">create_scan_plan</span><span class="token punctuation">(</span><span 
class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> table<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">PhysicalPlan</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Check if we have useful indexes</span>        <span class="token keyword">let</span> stats <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>statistics<span class="token punctuation">.</span><span class="token function">get_table</span><span class="token punctuation">(</span>table<span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">if</span> <span class="token keyword">let</span> <span class="token class-name">Some</span><span class="token punctuation">(</span>index<span class="token punctuation">)</span> <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">find_useful_index</span><span class="token punctuation">(</span>table<span class="token punctuation">,</span> <span class="token operator">&amp;</span>stats<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">PhysicalPlan</span><span class="token punctuation">::</span><span class="token class-name">IndexScan</span> <span class="token punctuation">&#123;</span>                table<span class="token punctuation">:</span> table<span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                alias<span 
class="token punctuation">:</span> <span class="token class-name">None</span><span class="token punctuation">,</span>                index<span class="token punctuation">:</span> index<span class="token punctuation">.</span>name<span class="token punctuation">,</span>                columns<span class="token punctuation">:</span> <span class="token macro property">vec!</span><span class="token punctuation">[</span><span class="token punctuation">]</span><span class="token punctuation">,</span>  <span class="token comment">// All columns</span>                condition<span class="token punctuation">:</span> <span class="token class-name">IndexCondition</span><span class="token punctuation">::</span><span class="token class-name">Range</span> <span class="token punctuation">&#123;</span> low<span class="token punctuation">:</span> <span class="token class-name">None</span><span class="token punctuation">,</span> high<span class="token punctuation">:</span> <span class="token class-name">None</span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">PhysicalPlan</span><span class="token punctuation">::</span><span class="token class-name">SeqScan</span> <span class="token punctuation">&#123;</span>                table<span class="token punctuation">:</span> table<span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                alias<span class="token punctuation">:</span> <span class="token class-name">None</span><span class="token punctuation">,</span>                columns<span class="token punctuation">:</span> <span class="token macro 
property">vec!</span><span class="token punctuation">[</span><span class="token punctuation">]</span><span class="token punctuation">,</span>                filter<span class="token punctuation">:</span> <span class="token class-name">None</span><span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">create_join_plans</span><span class="token punctuation">(</span>        <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span>        left<span class="token punctuation">:</span> <span class="token class-name">PhysicalPlan</span><span class="token punctuation">,</span>        right<span class="token punctuation">:</span> <span class="token class-name">PhysicalPlan</span><span class="token punctuation">,</span>        conditions<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">JoinCondition</span><span class="token punctuation">]</span><span class="token punctuation">,</span>    <span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">PhysicalPlan</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> plans <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token 
comment">// Get join condition</span>        <span class="token keyword">let</span> condition <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">find_join_condition</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>left<span class="token punctuation">,</span> <span class="token operator">&amp;</span>right<span class="token punctuation">,</span> conditions<span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// Nested loop join (always possible)</span>        plans<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token class-name">PhysicalPlan</span><span class="token punctuation">::</span><span class="token class-name">NestedLoopJoin</span> <span class="token punctuation">&#123;</span>            left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>left<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            right<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>right<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            condition<span class="token punctuation">:</span> condition<span class="token 
punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            join_type<span class="token punctuation">:</span> <span class="token class-name">JoinType</span><span class="token punctuation">::</span><span class="token class-name">Inner</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// Hash join (if equi-join)</span>        <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">is_equi_join</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>condition<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            plans<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token class-name">PhysicalPlan</span><span class="token punctuation">::</span><span class="token class-name">HashJoin</span> <span class="token punctuation">&#123;</span>                left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>left<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                right<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>right<span 
class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                condition<span class="token punctuation">:</span> condition<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">unwrap</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                join_type<span class="token punctuation">:</span> <span class="token class-name">JoinType</span><span class="token punctuation">::</span><span class="token class-name">Inner</span><span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token comment">// Merge join (if inputs can be sorted on join keys)</span>        <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">can_merge_join</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>left<span class="token punctuation">,</span> <span class="token operator">&amp;</span>right<span class="token punctuation">,</span> <span class="token operator">&amp;</span>condition<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            plans<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token class-name">PhysicalPlan</span><span class="token punctuation">::</span><span class="token class-name">MergeJoin</span> <span class="token punctuation">&#123;</span>                left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>left<span class="token punctuation">)</span><span 
class="token punctuation">,</span>                right<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>right<span class="token punctuation">)</span><span class="token punctuation">,</span>                condition<span class="token punctuation">:</span> condition<span class="token punctuation">.</span><span class="token function">unwrap</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                join_type<span class="token punctuation">:</span> <span class="token class-name">JoinType</span><span class="token punctuation">::</span><span class="token class-name">Inner</span><span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                plans    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">subsets_of_size</span><span class="token punctuation">(</span>n<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span> size<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">usize</span><span class="token operator">>></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Generate all subsets of &#123;0, 1, ..., n-1&#125; with given size</span>        <span class="token 
keyword">let</span> <span class="token keyword">mut</span> result <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">Self</span><span class="token punctuation">::</span><span class="token function">generate_subsets</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">,</span> n<span class="token punctuation">,</span> size<span class="token punctuation">,</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> result<span class="token punctuation">)</span><span class="token punctuation">;</span>        result    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">generate_subsets</span><span class="token punctuation">(</span>        start<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>        n<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>        size<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>        current<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">Vec</span><span class="token 
operator">&lt;</span><span class="token keyword">usize</span><span class="token operator">></span><span class="token punctuation">,</span>        result<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">usize</span><span class="token operator">>></span><span class="token punctuation">,</span>    <span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">if</span> current<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">==</span> size <span class="token punctuation">&#123;</span>            result<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span>current<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token keyword">return</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token keyword">for</span> i <span class="token keyword">in</span> start<span class="token punctuation">..</span>n <span class="token punctuation">&#123;</span>            current<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span>i<span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token keyword">Self</span><span class="token punctuation">::</span><span class="token 
function">generate_subsets</span><span class="token punctuation">(</span>i <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">,</span> n<span class="token punctuation">,</span> size<span class="token punctuation">,</span> current<span class="token punctuation">,</span> result<span class="token punctuation">)</span><span class="token punctuation">;</span>            current<span class="token punctuation">.</span><span class="token function">pop</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">split_subset</span><span class="token punctuation">(</span>subset<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token keyword">usize</span><span class="token punctuation">]</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">usize</span><span class="token operator">></span><span class="token punctuation">,</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">usize</span><span class="token operator">></span><span class="token punctuation">)</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Generate all non-empty proper splits of the subset</span>        <span class="token keyword">let</span> n <span class="token operator">=</span> subset<span class="token 
punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> splits <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// Use bitmask to generate all splits</span>        <span class="token keyword">for</span> mask <span class="token keyword">in</span> <span class="token number">1</span><span class="token punctuation">..</span><span class="token punctuation">(</span><span class="token number">1</span> <span class="token operator">&lt;&lt;</span> <span class="token punctuation">(</span>n <span class="token operator">-</span> <span class="token number">1</span><span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> <span class="token keyword">mut</span> left <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token keyword">let</span> <span class="token keyword">mut</span> right <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                        <span class="token keyword">for</span> i <span 
class="token keyword">in</span> <span class="token number">0</span><span class="token punctuation">..</span>n <span class="token punctuation">&#123;</span>                <span class="token keyword">if</span> i <span class="token operator">==</span> n <span class="token operator">-</span> <span class="token number">1</span> <span class="token punctuation">&#123;</span>                    right<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span>subset<span class="token punctuation">[</span>i<span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token keyword">if</span> mask <span class="token operator">&amp;</span> <span class="token punctuation">(</span><span class="token number">1</span> <span class="token operator">&lt;&lt;</span> i<span class="token punctuation">)</span> <span class="token operator">!=</span> <span class="token number">0</span> <span class="token punctuation">&#123;</span>                    left<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span>subset<span class="token punctuation">[</span>i<span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>                    right<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span>subset<span class="token punctuation">[</span>i<span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token punctuation">&#125;</span>            
<span class="token punctuation">&#125;</span>                        <span class="token keyword">if</span> <span class="token operator">!</span>left<span class="token punctuation">.</span><span class="token function">is_empty</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">&amp;&amp;</span> <span class="token operator">!</span>right<span class="token punctuation">.</span><span class="token function">is_empty</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                splits<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token punctuation">(</span>left<span class="token punctuation">,</span> right<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>                splits    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">subset_to_mask</span><span class="token punctuation">(</span>subset<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token keyword">usize</span><span class="token punctuation">]</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">u64</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> mask <span class="token operator">=</span> <span class="token number">0u64</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> <span class="token operator">&amp;</span>i <span class="token 
keyword">in</span> subset <span class="token punctuation">&#123;</span>            mask <span class="token operator">|=</span> <span class="token number">1u64</span> <span class="token operator">&lt;&lt;</span> i<span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        mask    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="连接顺序示例">Join Order Example</h3><pre class="language-sql" data-language="sql"><code class="language-sql"><span class="token keyword">SELECT</span> <span class="token operator">*</span>
<span class="token keyword">FROM</span> users u
<span class="token keyword">JOIN</span> orders o <span class="token keyword">ON</span> u<span class="token punctuation">.</span>id <span class="token operator">=</span> o<span class="token punctuation">.</span>user_id
<span class="token keyword">JOIN</span> products p <span class="token keyword">ON</span> o<span class="token punctuation">.</span>product_id <span class="token operator">=</span> p<span class="token punctuation">.</span>id
<span class="token keyword">WHERE</span> u<span class="token punctuation">.</span>balance <span class="token operator">></span> <span class="token number">100</span></code></pre><p><strong>Dynamic programming progress:</strong></p><pre class="language-none"><code class="language-none">Iteration 1 (single tables):
  &#123;users&#125;: SeqScan cost&#x3D;100, rows&#x3D;10000
  &#123;orders&#125;: IndexScan cost&#x3D;50, rows&#x3D;50000
  &#123;products&#125;: SeqScan cost&#x3D;10, rows&#x3D;1000

Iteration 2 (two tables):
  &#123;users, orders&#125;:
    - users ⋈ orders (hash): cost&#x3D;600, rows&#x3D;5000
    - orders ⋈ users (nested): cost&#x3D;800, rows&#x3D;5000
    → Best: HashJoin cost&#x3D;600

  &#123;orders, products&#125;:
    - orders ⋈ products (hash): cost&#x3D;200, rows&#x3D;10000
    → Best: HashJoin cost&#x3D;200

Iteration 3 (three tables):
  &#123;users, orders, products&#125;:
    - &#123;users, orders&#125; ⋈ products: cost&#x3D;800, rows&#x3D;1000
    - &#123;orders, products&#125; ⋈ users: cost&#x3D;700, rows&#x3D;1000 ← Best!
    - users ⋈ &#123;orders, products&#125;: cost&#x3D;900, rows&#x3D;1000
    → Best: (orders ⋈ products) ⋈ users</code></pre><p><strong>Final plan:</strong></p><pre class="language-none"><code class="language-none">HashJoin (u.id &#x3D; o.user_id)
  ├─ SeqScan (users) [balance &gt; 100]
  └─ HashJoin (o.product_id &#x3D; p.id)
      ├─ IndexScan (orders)
      └─ SeqScan (products)</code></pre><hr /><h2 id="5-索引选择">5 Index Selection</h2><h3 id="选择正确的索引">Choosing the Right Index</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/optimizer/index_selector.rs</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">IndexSelector</span> <span class="token punctuation">&#123;</span>    statistics<span class="token punctuation">:</span> <span class="token class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">StatisticsCatalog</span><span class="token operator">></span><span class="token punctuation">,</span>    cost_model<span class="token punctuation">:</span> <span class="token class-name">CostModel</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">IndexSelector</span> <span class="token punctuation">&#123;</span>    <span class="token comment">/// Find the best index for a query</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">select_index</span><span class="token punctuation">(</span>        <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span>        table<span class="token punctuation">:</span> <span 
class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">,</span>        predicates<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Expression</span><span class="token punctuation">]</span><span class="token punctuation">,</span>    <span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">IndexSelection</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> indexes <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>statistics<span class="token punctuation">.</span><span class="token function">get_table_indexes</span><span class="token punctuation">(</span>table<span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">let</span> <span class="token keyword">mut</span> best_index<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">IndexSelection</span><span class="token operator">></span> <span class="token operator">=</span> <span class="token class-name">None</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> best_score <span class="token operator">=</span> <span class="token number">0.0</span><span class="token punctuation">;</span>                <span class="token keyword">for</span> index <span class="token keyword">in</span> indexes <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> score <span class="token operator">=</span> <span 
class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">score_index</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>index<span class="token punctuation">,</span> predicates<span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token keyword">if</span> score <span class="token operator">></span> best_score <span class="token punctuation">&#123;</span>                best_score <span class="token operator">=</span> score<span class="token punctuation">;</span>                best_index <span class="token operator">=</span> <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">IndexSelection</span> <span class="token punctuation">&#123;</span>                    index<span class="token punctuation">:</span> index<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                    score<span class="token punctuation">,</span>                    usable_predicates<span class="token punctuation">:</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">find_usable_predicates</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>index<span class="token punctuation">,</span> predicates<span class="token punctuation">)</span><span class="token punctuation">,</span>                <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>                best_index    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token 
function-definition function">score_index</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> index<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">IndexStatistics</span><span class="token punctuation">,</span> predicates<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Expression</span><span class="token punctuation">]</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">f64</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> score <span class="token operator">=</span> <span class="token number">0.0</span><span class="token punctuation">;</span>                <span class="token comment">// Check if index columns are used in predicates</span>        <span class="token keyword">for</span> <span class="token punctuation">(</span>i<span class="token punctuation">,</span> col<span class="token punctuation">)</span> <span class="token keyword">in</span> index<span class="token punctuation">.</span>columns<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">enumerate</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">for</span> predicate <span class="token keyword">in</span> predicates <span class="token punctuation">&#123;</span>                <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token 
punctuation">.</span><span class="token function">predicate_uses_column</span><span class="token punctuation">(</span>predicate<span class="token punctuation">,</span> col<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                    <span class="token comment">// Earlier columns in index are more valuable</span>                    <span class="token keyword">let</span> position_weight <span class="token operator">=</span> <span class="token number">1.0</span> <span class="token operator">/</span> <span class="token punctuation">(</span>i <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">;</span>                    <span class="token keyword">let</span> <span class="token keyword">mut</span> contribution <span class="token operator">=</span> position_weight <span class="token operator">*</span> <span class="token number">100.0</span><span class="token punctuation">;</span>                                        <span class="token comment">// Equality is more valuable than range: double only this predicate's contribution</span>                    <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">is_equality_predicate</span><span class="token punctuation">(</span>predicate<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                        contribution <span class="token operator">*=</span> <span class="token number">2.0</span><span class="token punctuation">;</span>                    <span class="token punctuation">&#125;</span>                    score <span class="token operator">+=</span> contribution<span class="token punctuation">;</span>                <span class="token punctuation">&#125;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>                <span class="token comment">// Bonus for covering indexes (every referenced column is in the index)</span>        <span class="token keyword">if</span> <span 
class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">is_covering_index</span><span class="token punctuation">(</span>index<span class="token punctuation">,</span> predicates<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            score <span class="token operator">*=</span> <span class="token number">1.5</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token comment">// Bonus for unique indexes</span>        <span class="token keyword">if</span> index<span class="token punctuation">.</span>is_unique <span class="token punctuation">&#123;</span>            score <span class="token operator">*=</span> <span class="token number">1.3</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                score    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">find_usable_predicates</span><span class="token punctuation">(</span>        <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span>        index<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">IndexStatistics</span><span class="token punctuation">,</span>        predicates<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Expression</span><span class="token punctuation">]</span><span class="token punctuation">,</span>    <span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span> <span class="token 
punctuation">&#123;</span>        predicates            <span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span>            <span class="token punctuation">.</span><span class="token function">filter</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>p<span class="token closure-punctuation punctuation">|</span></span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">predicate_uses_index_column</span><span class="token punctuation">(</span>p<span class="token punctuation">,</span> index<span class="token punctuation">)</span><span class="token punctuation">)</span>            <span class="token punctuation">.</span><span class="token function">cloned</span><span class="token punctuation">(</span><span class="token punctuation">)</span>            <span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">(</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">predicate_uses_column</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> predicate<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Expression</span><span class="token punctuation">,</span> column<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">bool</span> <span class="token punctuation">&#123;</span>        <span 
class="token keyword">match</span> predicate <span class="token punctuation">&#123;</span>            <span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span> left<span class="token punctuation">,</span> right<span class="token punctuation">,</span> <span class="token punctuation">..</span> <span class="token punctuation">&#125;</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expr_references_column</span><span class="token punctuation">(</span>left<span class="token punctuation">,</span> column<span class="token punctuation">)</span> <span class="token operator">||</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expr_references_column</span><span class="token punctuation">(</span>right<span class="token punctuation">,</span> column<span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">Identifier</span><span class="token punctuation">(</span>ident<span class="token punctuation">)</span> <span class="token operator">=></span> ident<span class="token punctuation">.</span>value <span class="token operator">==</span> column<span class="token punctuation">,</span>            <span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">CompoundIdentifier</span><span class="token punctuation">(</span>idents<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                idents<span class="token 
punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">any</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>i<span class="token closure-punctuation punctuation">|</span></span> i<span class="token punctuation">.</span>value <span class="token operator">==</span> column<span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            _ <span class="token operator">=></span> <span class="token boolean">false</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">is_equality_predicate</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> predicate<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Expression</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">bool</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">match</span> predicate <span class="token punctuation">&#123;</span>            <span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span> op<span class="token punctuation">,</span> <span class="token punctuation">..</span> <span class="token punctuation">&#125;</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token macro 
property">matches!</span><span class="token punctuation">(</span>op<span class="token punctuation">,</span> <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Eq</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            _ <span class="token operator">=></span> <span class="token boolean">false</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">is_covering_index</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> index<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">IndexStatistics</span><span class="token punctuation">,</span> predicates<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Expression</span><span class="token punctuation">]</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">bool</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Check if all columns referenced in predicates are in the index</span>        <span class="token keyword">let</span> referenced_columns <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">extract_referenced_columns</span><span class="token punctuation">(</span>predicates<span class="token punctuation">)</span><span class="token punctuation">;</span>        referenced_columns<span class="token punctuation">.</span><span class="token 
function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">all</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>col<span class="token closure-punctuation punctuation">|</span></span> index<span class="token punctuation">.</span>columns<span class="token punctuation">.</span><span class="token function">contains</span><span class="token punctuation">(</span>col<span class="token punctuation">)</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">IndexSelection</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> index<span class="token punctuation">:</span> <span class="token class-name">IndexStatistics</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> score<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> usable_predicates<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="6-完整的优化管线">6 The Complete Optimization Pipeline</h2><h3 id="从-AST-到物理计划">From AST to Physical Plan</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/optimizer/optimizer.rs</span><span class="token keyword">pub</span> <span class="token 
keyword">struct</span> <span class="token type-definition class-name">QueryOptimizer</span> <span class="token punctuation">&#123;</span>    catalog<span class="token punctuation">:</span> <span class="token class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">Catalog</span><span class="token operator">></span><span class="token punctuation">,</span>    statistics<span class="token punctuation">:</span> <span class="token class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">StatisticsCatalog</span><span class="token operator">></span><span class="token punctuation">,</span>    cost_model<span class="token punctuation">:</span> <span class="token class-name">CostModel</span><span class="token punctuation">,</span>    join_optimizer<span class="token punctuation">:</span> <span class="token class-name">JoinOrderOptimizer</span><span class="token punctuation">,</span>    index_selector<span class="token punctuation">:</span> <span class="token class-name">IndexSelector</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">QueryOptimizer</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">optimize</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> ast<span class="token punctuation">:</span> <span class="token class-name">SelectStatement</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">PhysicalPlan</span><span class="token punctuation">,</span> <span class="token 
class-name">OptimizerError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Phase 1: Create logical plan</span>        <span class="token keyword">let</span> logical_plan <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">create_logical_plan</span><span class="token punctuation">(</span>ast<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// Phase 2: Apply logical optimizations</span>        <span class="token keyword">let</span> optimized_logical <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">apply_logical_optimizations</span><span class="token punctuation">(</span>logical_plan<span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// Phase 3: Generate physical plans</span>        <span class="token keyword">let</span> physical_plans <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">generate_physical_plans</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>optimized_logical<span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// Phase 4: Choose best plan based on cost</span>        <span class="token keyword">let</span> best_plan <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">choose_best_plan</span><span class="token punctuation">(</span>physical_plans<span class="token punctuation">)</span><span class="token 
operator">?</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span>best_plan<span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">create_logical_plan</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> ast<span class="token punctuation">:</span> <span class="token class-name">SelectStatement</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">LogicalPlan</span><span class="token punctuation">,</span> <span class="token class-name">OptimizerError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Start with FROM clause</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> plan <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">plan_table_with_joins</span><span class="token punctuation">(</span>ast<span class="token punctuation">.</span>from<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// Add WHERE filter</span>        <span class="token keyword">if</span> <span class="token keyword">let</span> <span class="token class-name">Some</span><span class="token punctuation">(</span>where_clause<span class="token punctuation">)</span> <span class="token operator">=</span> ast<span class="token punctuation">.</span>where_clause <span class="token punctuation">&#123;</span>            
plan <span class="token operator">=</span> <span class="token class-name">LogicalPlan</span><span class="token punctuation">::</span><span class="token class-name">Filter</span> <span class="token punctuation">&#123;</span>                input<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>plan<span class="token punctuation">)</span><span class="token punctuation">,</span>                predicate<span class="token punctuation">:</span> where_clause<span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token comment">// Add GROUP BY / aggregates</span>        <span class="token keyword">if</span> <span class="token operator">!</span>ast<span class="token punctuation">.</span>group_by<span class="token punctuation">.</span><span class="token function">is_empty</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">||</span> ast<span class="token punctuation">.</span>having<span class="token punctuation">.</span><span class="token function">is_some</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            plan <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">plan_aggregate</span><span class="token punctuation">(</span>plan<span class="token punctuation">,</span> ast<span class="token punctuation">.</span>group_by<span class="token punctuation">,</span> ast<span class="token punctuation">.</span>having<span class="token punctuation">)</span><span class="token 
operator">?</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token comment">// Add SELECT projection</span>        plan <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">plan_projection</span><span class="token punctuation">(</span>plan<span class="token punctuation">,</span> ast<span class="token punctuation">.</span>projections<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// Add DISTINCT</span>        <span class="token keyword">if</span> ast<span class="token punctuation">.</span>distinct <span class="token punctuation">&#123;</span>            plan <span class="token operator">=</span> <span class="token class-name">LogicalPlan</span><span class="token punctuation">::</span><span class="token class-name">Distinct</span> <span class="token punctuation">&#123;</span>                input<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>plan<span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token comment">// Add ORDER BY</span>        <span class="token keyword">if</span> <span class="token operator">!</span>ast<span class="token punctuation">.</span>order_by<span class="token punctuation">.</span><span class="token function">is_empty</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            plan <span class="token operator">=</span> <span 
class="token class-name">LogicalPlan</span><span class="token punctuation">::</span><span class="token class-name">Sort</span> <span class="token punctuation">&#123;</span>                input<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>plan<span class="token punctuation">)</span><span class="token punctuation">,</span>                order_by<span class="token punctuation">:</span> ast<span class="token punctuation">.</span>order_by<span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token comment">// Add LIMIT/OFFSET</span>        <span class="token keyword">if</span> ast<span class="token punctuation">.</span>limit<span class="token punctuation">.</span><span class="token function">is_some</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">||</span> ast<span class="token punctuation">.</span>offset<span class="token punctuation">.</span><span class="token function">is_some</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            plan <span class="token operator">=</span> <span class="token class-name">LogicalPlan</span><span class="token punctuation">::</span><span class="token class-name">Limit</span> <span class="token punctuation">&#123;</span>                input<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>plan<span class="token punctuation">)</span><span class="token punctuation">,</span>                limit<span class="token 
punctuation">:</span> ast<span class="token punctuation">.</span>limit<span class="token punctuation">.</span><span class="token function">unwrap_or</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">LiteralNumber</span><span class="token punctuation">(</span><span class="token keyword">i64</span><span class="token punctuation">::</span><span class="token constant">MAX</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                offset<span class="token punctuation">:</span> ast<span class="token punctuation">.</span>offset<span class="token punctuation">.</span><span class="token function">unwrap_or</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">LiteralNumber</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span>plan<span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">apply_logical_optimizations</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> plan<span class="token punctuation">:</span> <span class="token class-name">LogicalPlan</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span 
class="token class-name">LogicalPlan</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> optimized <span class="token operator">=</span> plan<span class="token punctuation">;</span>                <span class="token comment">// Predicate pushdown</span>        optimized <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">predicate_pushdown</span><span class="token punctuation">(</span>optimized<span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// Projection pruning</span>        optimized <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">projection_pruning</span><span class="token punctuation">(</span>optimized<span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// Constant folding</span>        optimized <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">constant_folding</span><span class="token punctuation">(</span>optimized<span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// Subquery unnesting</span>        optimized <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">unnest_subqueries</span><span class="token punctuation">(</span>optimized<span class="token punctuation">)</span><span class="token punctuation">;</span>                optimized    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition 
function">generate_physical_plans</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> logical<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">LogicalPlan</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">PhysicalPlan</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> plans <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// Generate all reasonable physical alternatives</span>        <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">generate_plans_recursive</span><span class="token punctuation">(</span>logical<span class="token punctuation">,</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> plans<span class="token punctuation">)</span><span class="token punctuation">;</span>                plans    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">choose_best_plan</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> plans<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token 
class-name">PhysicalPlan</span><span class="token operator">></span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">PhysicalPlan</span><span class="token punctuation">,</span> <span class="token class-name">OptimizerError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">if</span> plans<span class="token punctuation">.</span><span class="token function">is_empty</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">return</span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">OptimizerError</span><span class="token punctuation">::</span><span class="token class-name">NoPlansGenerated</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token keyword">let</span> <span class="token keyword">mut</span> best_plan <span class="token operator">=</span> plans<span class="token punctuation">[</span><span class="token number">0</span><span class="token punctuation">]</span><span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> best_cost <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>cost_model<span class="token punctuation">.</span><span class="token function">estimate_cost</span><span class="token punctuation">(</span><span class="token 
operator">&amp;</span>best_plan<span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">for</span> plan <span class="token keyword">in</span> plans<span class="token punctuation">.</span><span class="token function">into_iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">skip</span><span class="token punctuation">(</span><span class="token number">1</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> cost <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>cost_model<span class="token punctuation">.</span><span class="token function">estimate_cost</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>plan<span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token keyword">if</span> cost <span class="token operator">&lt;</span> best_cost <span class="token punctuation">&#123;</span>                best_cost <span class="token operator">=</span> cost<span class="token punctuation">;</span>                best_plan <span class="token operator">=</span> plan<span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span>best_plan<span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="7-用-Rust-构建的挑战">7 Challenges of Building in Rust</h2><h3 id="挑战-1：递归计划类型">Challenge 1: Recursive Plan Types</h3><p><strong>Problem:</strong> PhysicalPlan is deeply recursive, which makes it hard to pattern-match.</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ Complex nested matching</span>
<span class="token keyword">match</span> plan <span class="token punctuation">&#123;</span>
    <span class="token class-name">PhysicalPlan</span><span class="token punctuation">::</span><span class="token class-name">HashJoin</span> <span class="token punctuation">&#123;</span> left<span class="token punctuation">,</span> right<span class="token punctuation">,</span> <span class="token punctuation">..</span> <span class="token punctuation">&#125;</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>
        <span class="token keyword">match</span> left<span class="token punctuation">.</span><span class="token function">as_ref</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>
            <span class="token class-name">PhysicalPlan</span><span class="token punctuation">::</span><span class="token class-name">IndexScan</span> <span class="token punctuation">&#123;</span> <span class="token punctuation">..</span> <span class="token punctuation">&#125;</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token punctuation">...</span> <span class="token punctuation">&#125;</span>
            <span class="token class-name">PhysicalPlan</span><span class="token punctuation">::</span><span class="token class-name">SeqScan</span> <span class="token punctuation">&#123;</span> <span class="token punctuation">..</span> <span class="token punctuation">&#125;</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token punctuation">...</span> <span class="token punctuation">&#125;</span>
            _ <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token punctuation">...</span> <span class="token punctuation">&#125;</span>
        <span class="token punctuation">&#125;</span>
    <span class="token punctuation">&#125;</span>
    _ <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token punctuation">...</span> <span class="token punctuation">&#125;</span>
<span class="token punctuation">&#125;</span></code></pre><p><strong>Solution: the visitor pattern</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ✅ Clean traversal</span>
<span class="token keyword">pub</span> <span class="token keyword">trait</span> <span class="token type-definition class-name">PlanVisitor</span> <span class="token punctuation">&#123;</span>
    <span class="token keyword">fn</span> <span class="token function-definition function">visit</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">,</span> plan<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">PhysicalPlan</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token punctuation">&#125;</span>

<span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">collect_scan_tables</span><span class="token punctuation">(</span>plan<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">PhysicalPlan</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>
    <span class="token comment">// TableCollector is assumed to implement PlanVisitor (definition not shown)</span>
    <span class="token keyword">let</span> <span class="token keyword">mut</span> visitor <span class="token operator">=</span> <span class="token class-name">TableCollector</span> <span class="token punctuation">&#123;</span> tables<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>
    visitor<span class="token punctuation">.</span><span class="token function">visit</span><span class="token punctuation">(</span>plan<span class="token punctuation">)</span><span class="token punctuation">;</span>
    visitor<span class="token punctuation">.</span>tables
<span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="挑战-2：成本类型精度">Challenge 2: Cost Type Precision</h3><p><strong>Problem:</strong> Costs can be very large or very small, so a single value must cover many orders of magnitude.</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ f32 loses precision</span>
<span class="token keyword">let</span> cost<span class="token punctuation">:</span> <span class="token keyword">f32</span> <span class="token operator">=</span> <span class="token number">1000000.0</span> <span class="token operator">+</span> <span class="token number">0.0001</span><span class="token punctuation">;</span>  <span class="token comment">// Loses 0.0001!</span></code></pre><p><strong>Solution: use f64</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ✅ Better precision</span>
<span class="token keyword">let</span> cost<span class="token punctuation">:</span> <span class="token keyword">f64</span> <span class="token operator">=</span> <span class="token number">1000000.0</span> <span class="token operator">+</span> <span class="token number">0.0001</span><span class="token punctuation">;</span>  <span class="token comment">// Preserves both</span></code></pre><hr /><h3 id="挑战-3：统计数据生命周期">Challenge 3: Statistics Lifetime</h3><p><strong>Problem:</strong> 
统计数据需要在优化过程中共享。</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ Doesn't work</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Optimizer</span> <span class="token punctuation">&#123;</span>    statistics<span class="token punctuation">:</span> <span class="token class-name">StatisticsCatalog</span><span class="token punctuation">,</span>  <span class="token comment">// Too large to clone</span><span class="token punctuation">&#125;</span></code></pre><p><strong>解决方案：Arc 用于共享所有权</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ✅ Works</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Optimizer</span> <span class="token punctuation">&#123;</span>    statistics<span class="token punctuation">:</span> <span class="token class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">StatisticsCatalog</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="8-AI-如何加速这项工作">8 AI 如何加速这项工作</h2><h3 id="AI-做对了什么">AI 做对了什么</h3><table><thead><tr><th>任务</th><th>AI 贡献</th></tr></thead><tbody><tr><td><strong>成本模型结构</strong></td><td>CPU/IO/内存成本的良好分解</td></tr><tr><td><strong>DP 连接顺序</strong></td><td>正确的基于位元掩码的子集生成</td></tr><tr><td><strong>统计数据设计</strong></td><td>直方图类型、MCV 列表</td></tr><tr><td><strong>索引评分</strong></td><td>位置权重、相等奖励</td></tr></tbody></table><hr /><h3 id="AI-做错了什么">AI 
做错了什么</h3><table><thead><tr><th>问题</th><th>发生什么事</th></tr></thead><tbody><tr><td><strong>选择性估计</strong></td><td>初稿使用固定值，不是直方图</td></tr><tr><td><strong>连接成本公式</strong></td><td>忽略了内部每外部行扫描一次</td></tr><tr><td><strong>排序成本</strong></td><td>没有区分内存内与外部排序</td></tr><tr><td><strong>覆盖索引</strong></td><td>初始设计没有考虑仅索引扫描</td></tr></tbody></table><p><strong>模式：</strong> AI 处理结构良好。数值公式和边界情况需要手动验证。</p><hr /><h3 id="示例：调试连接成本">示例：调试连接成本</h3><p><strong>我问 AI 的问题：</strong></p><blockquote><p>“Hash join 成本似乎错误。创建哈希表应该是启动成本，不是总成本。”</p></blockquote><p><strong>我学到的：</strong></p><ol><li><strong>启动成本：</strong> 返回第一行的成本</li><li><strong>总成本：</strong> 返回所有行的成本</li><li>Hash 创建是启动（必须在探测前完成）</li><li>探测成本随输出行数扩展</li></ol><p><strong>结果：</strong> 修复成本模型：</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token class-name">Cost</span> <span class="token punctuation">&#123;</span>    startup<span class="token punctuation">:</span> build_cost <span class="token operator">+</span> build_memory <span class="token operator">*</span> memory_factor<span class="token punctuation">,</span>  <span class="token comment">// Before first row</span>    total<span class="token punctuation">:</span> build_cost <span class="token operator">+</span> probe_cost <span class="token operator">+</span> output_cpu<span class="token punctuation">,</span>         <span class="token comment">// All rows</span>    rows<span class="token punctuation">:</span> output_rows<span class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="总结：查询优化器一张图">总结：查询优化器一张图</h2><pre class="language-MERMAID_BASE64_606" data-language="MERMAID_BASE64_606"><code 
class="language-MERMAID_BASE64_606">Zmxvd2NoYXJ0IFRECiAgICBzdWJncmFwaCAiSW5wdXQiCiAgICAgICAgQVtTUUwgQVNUXSAtLT4gQltTdGF0aXN0aWNzIENhdGFsb2ddCiAgICBlbmQKICAgIAogICAgc3ViZ3JhcGggIkxvZ2ljYWwgT3B0aW1pemF0aW9uIgogICAgICAgIENbQ3JlYXRlIExvZ2ljYWwgUGxhbl0gLS0+IERbUHJlZGljYXRlIFB1c2hkb3duXQogICAgICAgIEQgLS0+IEVbUHJvamVjdGlvbiBQcnVuaW5nXQogICAgICAgIEUgLS0+IEZbQ29uc3RhbnQgRm9sZGluZ10KICAgICAgICBGIC0tPiBHW1N1YnF1ZXJ5IFVubmVzdGluZ10KICAgIGVuZAogICAgCiAgICBzdWJncmFwaCAiUGh5c2ljYWwgT3B0aW1pemF0aW9uIgogICAgICAgIEcgLS0+IEhbR2VuZXJhdGUgUGh5c2ljYWwgUGxhbnNdCiAgICAgICAgSCAtLT4gSVtTZXFTY2FuLCBJbmRleFNjYW5dCiAgICAgICAgSCAtLT4gSltOZXN0ZWRMb29wLCBIYXNoLCBNZXJnZSBKb2luXQogICAgICAgIEggLS0+IEtbU29ydCwgSGFzaEFnZ3JlZ2F0ZV0KICAgICAgICBJICYgSiAmIEsgLS0+IExbQ29zdCBFc3RpbWF0aW9uXQogICAgICAgIEwgLS0+IE1bQ2hvb3NlIEJlc3QgUGxhbl0KICAgIGVuZAogICAgCiAgICBzdWJncmFwaCAiU3RhdGlzdGljcyIKICAgICAgICBCIC0tPiBOW1RhYmxlIFN0YXRzOiByb3dfY291bnQsIHBhZ2VzXQogICAgICAgIEIgLS0+IE9bQ29sdW1uIFN0YXRzOiBoaXN0b2dyYW0sIE1DVl0KICAgICAgICBCIC0tPiBQW0luZGV4IFN0YXRzOiBkaXN0aW5jdF9rZXlzXQogICAgZW5kCiAgICAKICAgIHN1YmdyYXBoICJPdXRwdXQiCiAgICAgICAgTSAtLT4gUVtQaHlzaWNhbCBQbGFuXQogICAgZW5kCiAgICAKICAgIHN0eWxlIEEgZmlsbDojZTNmMmZkLHN0cm9rZTojMTk3NmQyCiAgICBzdHlsZSBRIGZpbGw6I2U4ZjVlOSxzdHJva2U6IzM4OGUzYwogICAgc3R5bGUgTCBmaWxsOiNmZmYzZTAsc3Ryb2tlOiNmNTdjMDA&#x3D;</code></pre><p><strong>Key takeaways:</strong></p><table><thead><tr><th>Concept</th><th>Why it matters</th></tr></thead><tbody><tr><td><strong>Logical vs. physical</strong></td><td>Separates WHAT from HOW</td></tr><tr><td><strong>Statistics</strong></td><td>Accurate cost estimates need data</td></tr><tr><td><strong>Cost model</strong></td><td>CPU + I/O + memory = total cost</td></tr><tr><td><strong>DP join ordering</strong></td><td>Finds the best order without brute force</td></tr><tr><td><strong>Index selection</strong></td><td>Picks the best index for a predicate</td></tr><tr><td><strong>Startup vs. total</strong></td><td>First-row latency vs. throughput</td></tr></tbody></table><hr /><p><strong>Further reading:</strong></p><ul><li>“Database Management Systems” by Ramakrishnan (Ch.
15: Query Optimization)</li><li>“Readings in Database Systems” (Red Book) - Query Optimization chapter</li><li>PostgreSQL Source: <a href="https://github.com/postgres/postgres/tree/master/src/backend/optimizer"><code>src/backend/optimizer/</code></a></li><li>“Cost-Based Oracle Fundamentals” by Jonathan Lewis</li><li>Vaultgres Repository: <a href="https://github.com/neoalienson/Vaultgres">github.com/neoalienson/Vaultgres</a></li></ul>]]></content>
    
    
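    <summary type="html">Part 7 of the Vaultgres journey: building a cost-based query optimizer. A deep dive into statistics collection, cost models, join ordering with dynamic programming, and index selection.</summary>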
    
    
    
    <category term="Development" scheme="https://neo01.com/categories/Development/"/>
    
    
    <category term="PostgreSQL" scheme="https://neo01.com/tags/PostgreSQL/"/>
    
    <category term="Database Internals" scheme="https://neo01.com/tags/Database-Internals/"/>
    
    <category term="Rust" scheme="https://neo01.com/tags/Rust/"/>
    
  </entry>
  
  <entry>
    <title>Building a PostgreSQL-Compatible Database in Rust: A Cost-Based Query Optimizer with Statistics</title>
    <link href="https://neo01.com/zh-TW/2026/03/Database-Rust-Query-Optimizer/"/>
    <id>https://neo01.com/zh-TW/2026/03/Database-Rust-Query-Optimizer/</id>
    <published>2026-03-06T16:00:00.000Z</published>
    <updated>2026-03-14T03:05:27.325Z</updated>
    
    <content type="html"><![CDATA[<p>In <a href="/zh-TW/2026/03/Database-Rust-SQL-Parser/">Part 6</a>, we built a SQL parser that produces an AST. But there's a problem.</p><p><strong>The same query can be executed in many different ways:</strong></p><pre class="language-sql" data-language="sql"><code class="language-sql"><span class="token keyword">SELECT</span> u<span class="token punctuation">.</span>name<span class="token punctuation">,</span> o<span class="token punctuation">.</span>total
<span class="token keyword">FROM</span> users u
<span class="token keyword">JOIN</span> orders o <span class="token keyword">ON</span> u<span class="token punctuation">.</span>id <span class="token operator">=</span> o<span class="token punctuation">.</span>user_id
<span class="token keyword">WHERE</span> u<span class="token punctuation">.</span>balance <span class="token operator">></span> <span class="token number">100</span></code></pre><p><strong>Possible execution plans:</strong></p><pre class="language-none"><code class="language-none">Plan A:                          Plan B:                          Plan C:
1. Scan users                    1. Scan orders                   1. Scan users (balance &gt; 100)
2. Filter (balance &gt; 100)        2. Filter (exists in users)      2. Index lookup on orders
3. Scan orders                   3. Scan users                    3. Hash join
4. Hash join                     4. Nested loop join              4. Sort by name
5. Sort                          5. Sort
Cost: 1500                       Cost: 800                        Cost: 200 ← Best!</code></pre><p><strong>How do we find Plan C automatically?</strong></p><p>Today: building a cost-based query optimizer in Rust, with statistics, a cost model, and dynamic programming for join ordering.</p><hr /><h2 id="1-邏輯-vs-物理計劃">1 Logical vs. Physical Plans</h2><h3 id="兩階段方法">The Two-Phase Approach</h3><pre class="language-none"><code class="language-none">┌─────────────────────────────────────────────────────────────┐
│              Query Optimization Pipeline                     │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  SQL AST                                                     │
│     │                                                        │
│     ▼                                                        │
│  ┌──────────────────────────────────────────────────────┐   │
│  │  Logical Plan (WHAT to compute)                      │   │
│  │  - Logical Scan: users                               │   │
│  │  - Logical Filter: balance &gt; 100                     │   │
│  │  - Logical Join: u.id &#x3D; o.user_id                    │   │
│  └──────────────────────────────────────────────────────┘   │
│     │                                                        │
│     ▼ Optimization                                           │
│                                                              │
│  ┌──────────────────────────────────────────────────────┐   │
│  │  Physical Plan (HOW to compute)                      │   │
│  │  - Index Scan: users (balance &gt; 100)                 │   │
│  │  - Index Scan: orders (user_id index)                │   │
│  │  - Nested Loop Join                                  │   │
│  └──────────────────────────────────────────────────────┘   │
│                                                              │
└─────────────────────────────────────────────────────────────┘</code></pre><hr /><h3 id="邏輯計劃運算子">Logical Plan Operators</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/optimizer/logical_plan.rs</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token
type-definition class-name">LogicalPlan</span> <span class="token punctuation">&#123;</span>    <span class="token comment">/// Scan a table</span>    <span class="token class-name">TableScan</span> <span class="token punctuation">&#123;</span>        table<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span>        alias<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token operator">></span><span class="token punctuation">,</span>        columns<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token operator">></span><span class="token punctuation">,</span>        projection<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">usize</span><span class="token operator">></span><span class="token punctuation">,</span>  <span class="token comment">// Column indices</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">/// Filter rows</span>    <span class="token class-name">Filter</span> <span class="token punctuation">&#123;</span>        input<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">LogicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>        predicate<span class="token punctuation">:</span> <span class="token class-name">Expression</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">/// Project 
columns/expressions</span>    <span class="token class-name">Projection</span> <span class="token punctuation">&#123;</span>        input<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">LogicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>        expressions<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">/// Join two relations</span>    <span class="token class-name">Join</span> <span class="token punctuation">&#123;</span>        left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">LogicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>        right<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">LogicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>        condition<span class="token punctuation">:</span> <span class="token class-name">JoinCondition</span><span class="token punctuation">,</span>        join_type<span class="token punctuation">:</span> <span class="token class-name">JoinType</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">/// Aggregate (GROUP BY)</span>    <span class="token class-name">Aggregate</span> <span class="token punctuation">&#123;</span>        input<span class="token 
punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">LogicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>        group_by<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>        aggregates<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">AggregateFunction</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">/// Sort (ORDER BY)</span>    <span class="token class-name">Sort</span> <span class="token punctuation">&#123;</span>        input<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">LogicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>        order_by<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">SortKey</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">/// Limit</span>    <span class="token class-name">Limit</span> <span class="token punctuation">&#123;</span>        input<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">LogicalPlan</span><span class="token operator">></span><span 
class="token punctuation">,</span>        limit<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>        offset<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">/// Distinct</span>    <span class="token class-name">Distinct</span> <span class="token punctuation">&#123;</span>        input<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">LogicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">JoinCondition</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">On</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">Using</span><span class="token punctuation">(</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token operator">></span><span class="token punctuation">)</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition 
class-name">JoinType</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">Inner</span><span class="token punctuation">,</span>    <span class="token class-name">LeftOuter</span><span class="token punctuation">,</span>    <span class="token class-name">RightOuter</span><span class="token punctuation">,</span>    <span class="token class-name">FullOuter</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">SortKey</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> expression<span class="token punctuation">:</span> <span class="token class-name">Expression</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> ascending<span class="token punctuation">:</span> <span class="token keyword">bool</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> nulls_first<span class="token punctuation">:</span> <span class="token keyword">bool</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">AggregateFunction</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> func<span class="token punctuation">:</span> <span class="token class-name">AggregateFunc</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> argument<span class="token punctuation">:</span> <span class="token class-name">Expression</span><span class="token punctuation">,</span>    <span class="token 
keyword">pub</span> alias<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">AggregateFunc</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">Count</span><span class="token punctuation">,</span>    <span class="token class-name">Sum</span><span class="token punctuation">,</span>    <span class="token class-name">Avg</span><span class="token punctuation">,</span>    <span class="token class-name">Min</span><span class="token punctuation">,</span>    <span class="token class-name">Max</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="物理計劃運算子">物理計劃運算子</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/optimizer/physical_plan.rs</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">PhysicalPlan</span> <span class="token punctuation">&#123;</span>    <span class="token comment">/// Full table scan</span>    <span class="token class-name">SeqScan</span> <span class="token punctuation">&#123;</span>        table<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span>        alias<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token 
class-name">String</span><span class="token operator">></span><span class="token punctuation">,</span>        columns<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token operator">></span><span class="token punctuation">,</span>        filter<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">/// Index scan</span>    <span class="token class-name">IndexScan</span> <span class="token punctuation">&#123;</span>        table<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span>        alias<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token operator">></span><span class="token punctuation">,</span>        index<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span>        columns<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token operator">></span><span class="token punctuation">,</span>        condition<span class="token punctuation">:</span> <span class="token class-name">IndexCondition</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">/// Nested loop join</span>    <span class="token 
class-name">NestedLoopJoin</span> <span class="token punctuation">&#123;</span>        left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">PhysicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>        right<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">PhysicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>        condition<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>        join_type<span class="token punctuation">:</span> <span class="token class-name">JoinType</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">/// Hash join</span>    <span class="token class-name">HashJoin</span> <span class="token punctuation">&#123;</span>        left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">PhysicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>        right<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">PhysicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>        condition<span class="token punctuation">:</span> <span class="token class-name">Expression</span><span class="token punctuation">,</span>        join_type<span class="token punctuation">:</span> <span class="token 
class-name">JoinType</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">/// Merge join (requires sorted input)</span>    <span class="token class-name">MergeJoin</span> <span class="token punctuation">&#123;</span>        left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">PhysicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>        right<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">PhysicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>        condition<span class="token punctuation">:</span> <span class="token class-name">Expression</span><span class="token punctuation">,</span>        join_type<span class="token punctuation">:</span> <span class="token class-name">JoinType</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">/// Sort</span>    <span class="token class-name">Sort</span> <span class="token punctuation">&#123;</span>        input<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">PhysicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>        order_by<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">SortKey</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        
<span class="token comment">/// Hash aggregate</span>    <span class="token class-name">HashAggregate</span> <span class="token punctuation">&#123;</span>        input<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">PhysicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>        group_by<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>        aggregates<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">AggregateFunction</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">/// Stream aggregate (requires sorted input)</span>    <span class="token class-name">StreamAggregate</span> <span class="token punctuation">&#123;</span>        input<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">PhysicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>        group_by<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>        aggregates<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">AggregateFunction</span><span class="token operator">></span><span 
class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">/// Limit</span>    <span class="token class-name">Limit</span> <span class="token punctuation">&#123;</span>        input<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">PhysicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>        limit<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>        offset<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">IndexCondition</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">Eq</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">Range</span> <span class="token punctuation">&#123;</span> low<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span> high<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span> <span 
class="token punctuation">&#125;</span><span class="token punctuation">,</span>    <span class="token class-name">InList</span><span class="token punctuation">(</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">)</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="2-統計資料收集">2 Statistics Collection</h2><h3 id="為什麼統計資料很重要">Why Statistics Matter</h3><p><strong>Without statistics:</strong> Every plan looks the same, so the optimizer has no basis for choosing between them.</p><p><strong>With statistics:</strong> We can estimate the cost of each plan far more accurately.</p><pre class="language-sql" data-language="sql"><code class="language-sql"><span class="token comment">-- Query</span><span class="token keyword">SELECT</span> <span class="token operator">*</span> <span class="token keyword">FROM</span> users <span class="token keyword">WHERE</span> balance <span class="token operator">></span> <span class="token number">100</span><span class="token comment">-- Scenario A: balance is uniformly distributed 0-1000</span><span class="token comment">-- → ~90% of rows match → SeqScan is better</span><span class="token comment">-- Scenario B: balance is skewed, only 1% have > 100</span><span class="token comment">-- → IndexScan is better</span></code></pre><hr /><h3 id="統計資料結構">Statistics Data Structures</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/optimizer/statistics.rs</span><span class="token keyword">use</span> <span class="token namespace">std<span class="token punctuation">::</span>collections<span class="token punctuation">::</span></span><span class="token class-name">HashMap</span><span class="token punctuation">;</span><span class="token attribute attr-name">#[derive(Debug, Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">TableStatistics</span> <span class="token 
punctuation">&#123;</span>    <span class="token keyword">pub</span> table_name<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> row_count<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> page_count<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> average_row_size<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> columns<span class="token punctuation">:</span> <span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token punctuation">,</span> <span class="token class-name">ColumnStatistics</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> indexes<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">IndexStatistics</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> last_analyzed<span class="token punctuation">:</span> <span class="token namespace">chrono<span class="token punctuation">::</span></span><span class="token class-name">DateTime</span><span class="token operator">&lt;</span><span class="token namespace">chrono<span class="token punctuation">::</span></span><span class="token class-name">Utc</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, 
Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">ColumnStatistics</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> column_name<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> null_fraction<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span>      <span class="token comment">// Fraction of NULL values (0.0 - 1.0)</span>    <span class="token keyword">pub</span> distinct_count<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>     <span class="token comment">// Number of distinct values</span>    <span class="token keyword">pub</span> most_common_values<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token class-name">Value</span><span class="token punctuation">,</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token operator">></span><span class="token punctuation">,</span>  <span class="token comment">// (value, frequency)</span>    <span class="token keyword">pub</span> histogram<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Histogram</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> min_value<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Value</span><span class="token operator">></span><span class="token 
punctuation">,</span>    <span class="token keyword">pub</span> max_value<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Value</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">Histogram</span> <span class="token punctuation">&#123;</span>    <span class="token comment">/// Equi-width histogram (equal bucket sizes)</span>    <span class="token class-name">EquiWidth</span> <span class="token punctuation">&#123;</span>        buckets<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Bucket</span><span class="token operator">></span><span class="token punctuation">,</span>        min<span class="token punctuation">:</span> <span class="token class-name">Value</span><span class="token punctuation">,</span>        max<span class="token punctuation">:</span> <span class="token class-name">Value</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>    <span class="token comment">/// Equi-depth histogram (equal rows per bucket)</span>    <span class="token class-name">EquiDepth</span> <span class="token punctuation">&#123;</span>        buckets<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Bucket</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span><span class="token 
punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Bucket</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> lower_bound<span class="token punctuation">:</span> <span class="token class-name">Value</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> upper_bound<span class="token punctuation">:</span> <span class="token class-name">Value</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> row_count<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> distinct_count<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">IndexStatistics</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> index_name<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> columns<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> is_unique<span class="token punctuation">:</span> <span class="token keyword">bool</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> is_primary<span 
class="token punctuation">:</span> <span class="token keyword">bool</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> leaf_pages<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> distinct_keys<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> average_leaf_per_key<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="收集統計資料">Collecting Statistics</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/optimizer/analyzer.rs</span><span class="token keyword">use</span> <span class="token namespace">std<span class="token punctuation">::</span>collections<span class="token punctuation">::</span></span><span class="token class-name">HashMap</span><span class="token punctuation">;</span><span class="token keyword">use</span> <span class="token namespace">std<span class="token punctuation">::</span>sync<span class="token punctuation">::</span></span><span class="token class-name">Arc</span><span class="token punctuation">;</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">StatisticsAnalyzer</span> <span class="token punctuation">&#123;</span>    buffer_pool<span class="token punctuation">:</span> <span class="token class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">BufferPool</span><span class="token operator">></span><span class="token punctuation">,</span>    storage<span class="token punctuation">:</span> <span class="token class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">StorageEngine</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">StatisticsAnalyzer</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">analyze_table</span><span class="token punctuation">(</span><span class="token 
operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> table_name<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">TableStatistics</span><span class="token punctuation">,</span> <span class="token class-name">AnalyzerError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> stats <span class="token operator">=</span> <span class="token class-name">TableStatistics</span> <span class="token punctuation">&#123;</span>            table_name<span class="token punctuation">:</span> table_name<span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            row_count<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>            page_count<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>            average_row_size<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>            columns<span class="token punctuation">:</span> <span class="token class-name">HashMap</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            indexes<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token 
punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            last_analyzed<span class="token punctuation">:</span> <span class="token namespace">chrono<span class="token punctuation">::</span></span><span class="token class-name">Utc</span><span class="token punctuation">::</span><span class="token function">now</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token comment">// Scan all pages to collect statistics</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> total_size <span class="token operator">=</span> <span class="token number">0</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> column_values<span class="token punctuation">:</span> <span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token punctuation">,</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Value</span><span class="token operator">>></span> <span class="token operator">=</span> <span class="token class-name">HashMap</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> page_id <span class="token keyword">in</span> <span class="token keyword">self</span><span class="token punctuation">.</span>storage<span class="token punctuation">.</span><span class="token function">get_table_pages</span><span 
class="token punctuation">(</span>table_name<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> page <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>buffer_pool<span class="token punctuation">.</span><span class="token function">get_page</span><span class="token punctuation">(</span>page_id<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>            stats<span class="token punctuation">.</span>page_count <span class="token operator">+=</span> <span class="token number">1</span><span class="token punctuation">;</span>            <span class="token keyword">for</span> row <span class="token keyword">in</span> page<span class="token punctuation">.</span><span class="token function">rows</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                stats<span class="token punctuation">.</span>row_count <span class="token operator">+=</span> <span class="token number">1</span><span class="token punctuation">;</span>                total_size <span class="token operator">+=</span> row<span class="token punctuation">.</span><span class="token function">size</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// Collect column values</span>                <span class="token keyword">for</span> <span class="token punctuation">(</span>col_name<span class="token punctuation">,</span> value<span class="token punctuation">)</span> <span class="token keyword">in</span> row<span class="token punctuation">.</span><span class="token function">columns</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token 
punctuation">&#123;</span>                    column_values                        <span class="token punctuation">.</span><span class="token function">entry</span><span class="token punctuation">(</span>col_name<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>                        <span class="token punctuation">.</span><span class="token function">or_insert_with</span><span class="token punctuation">(</span><span class="token class-name">Vec</span><span class="token punctuation">::</span>new<span class="token punctuation">)</span>                        <span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span>value<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token punctuation">&#125;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>        <span class="token keyword">if</span> stats<span class="token punctuation">.</span>row_count <span class="token operator">></span> <span class="token number">0</span> <span class="token punctuation">&#123;</span>            stats<span class="token punctuation">.</span>average_row_size <span class="token operator">=</span> total_size <span class="token operator">/</span> stats<span class="token punctuation">.</span>row_count <span class="token keyword">as</span> <span class="token keyword">usize</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token comment">// Compute column statistics</span>        <span class="token keyword">for</span> <span class="token 
punctuation">(</span>col_name<span class="token punctuation">,</span> values<span class="token punctuation">)</span> <span class="token keyword">in</span> column_values <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> col_stats <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">compute_column_statistics</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>col_name<span class="token punctuation">,</span> <span class="token operator">&amp;</span>values<span class="token punctuation">)</span><span class="token punctuation">;</span>            stats<span class="token punctuation">.</span>columns<span class="token punctuation">.</span><span class="token function">insert</span><span class="token punctuation">(</span>col_name<span class="token punctuation">,</span> col_stats<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token comment">// Collect index statistics</span>        <span class="token keyword">for</span> index <span class="token keyword">in</span> <span class="token keyword">self</span><span class="token punctuation">.</span>storage<span class="token punctuation">.</span><span class="token function">get_table_indexes</span><span class="token punctuation">(</span>table_name<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> index_stats <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">analyze_index</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>index<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>     
       stats<span class="token punctuation">.</span>indexes<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span>index_stats<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token comment">// Store statistics in system catalog</span>        <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">store_statistics</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>stats<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span>stats<span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">compute_column_statistics</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> col_name<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">,</span> values<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Value</span><span class="token punctuation">]</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">ColumnStatistics</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> null_count <span class="token operator">=</span> values<span class="token punctuation">.</span><span class="token function">iter</span><span class="token 
punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">filter</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>v<span class="token closure-punctuation punctuation">|</span></span> v<span class="token punctuation">.</span><span class="token function">is_null</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">count</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> null_fraction <span class="token operator">=</span> null_count <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">/</span> values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> non_null_values<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span>_<span class="token operator">></span> <span class="token operator">=</span> values<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">filter</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>v<span class="token closure-punctuation punctuation">|</span></span> <span class="token 
operator">!</span>v<span class="token punctuation">.</span><span class="token function">is_null</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">cloned</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> distinct_count <span class="token operator">=</span> non_null_values<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">::</span><span class="token operator">&lt;</span><span class="token namespace">std<span class="token punctuation">::</span>collections<span class="token punctuation">::</span></span><span class="token class-name">HashSet</span><span class="token operator">&lt;</span>_<span class="token operator">>></span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">;</span>        <span class="token comment">// Compute most common values</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> value_counts<span class="token punctuation">:</span> <span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token operator">&amp;</span><span class="token class-name">Value</span><span class="token punctuation">,</span> <span class="token keyword">usize</span><span class="token operator">></span> <span class="token 
operator">=</span> <span class="token class-name">HashMap</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> value <span class="token keyword">in</span> <span class="token operator">&amp;</span>non_null_values <span class="token punctuation">&#123;</span>            <span class="token operator">*</span>value_counts<span class="token punctuation">.</span><span class="token function">entry</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">or_insert</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span> <span class="token operator">+=</span> <span class="token number">1</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> most_common<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span>_<span class="token operator">></span> <span class="token operator">=</span> value_counts<span class="token punctuation">.</span><span class="token function">into_iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        most_common<span class="token punctuation">.</span><span class="token function">sort_by</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>a<span class="token 
punctuation">,</span> b<span class="token closure-punctuation punctuation">|</span></span> b<span class="token number">.1</span><span class="token punctuation">.</span><span class="token function">cmp</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>a<span class="token number">.1</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> mcv<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token class-name">Value</span><span class="token punctuation">,</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token operator">></span> <span class="token operator">=</span> most_common            <span class="token punctuation">.</span><span class="token function">into_iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span>            <span class="token punctuation">.</span><span class="token function">take</span><span class="token punctuation">(</span><span class="token number">10</span><span class="token punctuation">)</span>  <span class="token comment">// Keep top 10</span>            <span class="token punctuation">.</span><span class="token function">map</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span><span class="token punctuation">(</span>v<span class="token punctuation">,</span> c<span class="token punctuation">)</span><span class="token closure-punctuation punctuation">|</span></span> <span class="token punctuation">(</span>v<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token 
punctuation">,</span> c <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">/</span> non_null_values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">)</span>            <span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token comment">// Compute histogram</span>        <span class="token keyword">let</span> histogram <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">compute_histogram</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>non_null_values<span class="token punctuation">,</span> distinct_count<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token comment">// Min/Max</span>        <span class="token keyword">let</span> <span class="token punctuation">(</span>min<span class="token punctuation">,</span> max<span class="token punctuation">)</span> <span class="token operator">=</span> <span class="token keyword">if</span> non_null_values<span class="token punctuation">.</span><span class="token function">is_empty</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token punctuation">(</span><span class="token class-name">None</span><span class="token punctuation">,</span> <span class="token class-name">None</span><span class="token punctuation">)</span>        <span class="token 
punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> <span class="token keyword">mut</span> sorted <span class="token operator">=</span> non_null_values<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            sorted<span class="token punctuation">.</span><span class="token function">sort</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span>sorted<span class="token punctuation">.</span><span class="token function">first</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">unwrap</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Some</span><span class="token punctuation">(</span>sorted<span class="token punctuation">.</span><span class="token function">last</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">unwrap</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token 
punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token class-name">ColumnStatistics</span> <span class="token punctuation">&#123;</span>            column_name<span class="token punctuation">:</span> col_name<span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            null_fraction<span class="token punctuation">,</span>            distinct_count<span class="token punctuation">,</span>            most_common_values<span class="token punctuation">:</span> mcv<span class="token punctuation">,</span>            histogram<span class="token punctuation">,</span>            min_value<span class="token punctuation">:</span> min<span class="token punctuation">,</span>            max_value<span class="token punctuation">:</span> max<span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">compute_histogram</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> values<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Value</span><span class="token punctuation">]</span><span class="token punctuation">,</span> distinct_count<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span 
class="token class-name">Histogram</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">const</span> <span class="token constant">NUM_BUCKETS</span><span class="token punctuation">:</span> <span class="token keyword">usize</span> <span class="token operator">=</span> <span class="token number">100</span><span class="token punctuation">;</span>        <span class="token keyword">if</span> values<span class="token punctuation">.</span><span class="token function">is_empty</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">return</span> <span class="token class-name">None</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token comment">// Use equi-depth histogram for better selectivity estimation</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> sorted <span class="token operator">=</span> values<span class="token punctuation">.</span><span class="token function">to_vec</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        sorted<span class="token punctuation">.</span><span class="token function">sort</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> bucket_size <span class="token operator">=</span> sorted<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">/</span> <span class="token constant">NUM_BUCKETS</span><span class="token punctuation">;</span>        <span class="token keyword">if</span> bucket_size <span class="token 
operator">==</span> <span class="token number">0</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">return</span> <span class="token class-name">None</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> buckets <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> i <span class="token keyword">in</span> <span class="token number">0</span><span class="token punctuation">..</span><span class="token constant">NUM_BUCKETS</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> start <span class="token operator">=</span> i <span class="token operator">*</span> bucket_size<span class="token punctuation">;</span>            <span class="token keyword">let</span> end <span class="token operator">=</span> <span class="token keyword">if</span> i <span class="token operator">==</span> <span class="token constant">NUM_BUCKETS</span> <span class="token operator">-</span> <span class="token number">1</span> <span class="token punctuation">&#123;</span> sorted<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span> <span class="token punctuation">(</span>i <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">)</span> <span class="token operator">*</span> bucket_size <span class="token punctuation">&#125;</span><span 
class="token punctuation">;</span>            buckets<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token class-name">Bucket</span> <span class="token punctuation">&#123;</span>                lower_bound<span class="token punctuation">:</span> sorted<span class="token punctuation">[</span>start<span class="token punctuation">]</span><span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                upper_bound<span class="token punctuation">:</span> sorted<span class="token punctuation">[</span>end <span class="token operator">-</span> <span class="token number">1</span><span class="token punctuation">]</span><span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                row_count<span class="token punctuation">:</span> <span class="token punctuation">(</span>end <span class="token operator">-</span> start<span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>                distinct_count<span class="token punctuation">:</span> <span class="token punctuation">(</span>distinct_count <span class="token operator">/</span> <span class="token constant">NUM_BUCKETS</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">max</span><span class="token punctuation">(</span><span class="token number">1</span><span class="token punctuation">)</span><span class="token punctuation">,</span>  <span class="token comment">// Rough per-bucket estimate; a non-empty bucket has at least one distinct value</span>            <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Histogram</span><span class="token 
punctuation">::</span><span class="token class-name">EquiDepth</span> <span class="token punctuation">&#123;</span> buckets <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="使用統計資料：ANALYZE-命令">Using Statistics: The ANALYZE Command</h3><pre class="language-sql" data-language="sql"><code class="language-sql"><span class="token comment">-- Analyze a single table</span>
<span class="token keyword">ANALYZE</span> users<span class="token punctuation">;</span>

<span class="token comment">-- Analyze specific columns</span>
<span class="token keyword">ANALYZE</span> users <span class="token punctuation">(</span>id<span class="token punctuation">,</span> balance<span class="token punctuation">,</span> created_at<span class="token punctuation">)</span><span class="token punctuation">;</span>

<span class="token comment">-- Analyze all tables</span>
<span class="token keyword">ANALYZE</span><span class="token punctuation">;</span>

<span class="token comment">-- Configure sampling (for large tables)</span>
<span class="token keyword">ANALYZE</span> users <span class="token keyword">WITH</span> SAMPLE <span class="token number">0.1</span><span class="token punctuation">;</span>  <span class="token comment">-- 10% sample</span></code></pre><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/sql_parser/ast.rs (extended)</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">Statement</span> <span class="token punctuation">&#123;</span>    <span class="token comment">// ... 
existing statements ...</span>    <span class="token class-name">Analyze</span><span class="token punctuation">(</span><span class="token class-name">AnalyzeStatement</span><span class="token punctuation">)</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">AnalyzeStatement</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> table<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">ObjectName</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> columns<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Ident</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> options<span class="token punctuation">:</span> <span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token punctuation">,</span> <span class="token class-name">Value</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token comment">// src/optimizer/analyzer.rs</span><span class="token keyword">impl</span> <span class="token class-name">Database</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">analyze</span><span class="token punctuation">(</span><span class="token 
operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> statement<span class="token punctuation">:</span> <span class="token class-name">AnalyzeStatement</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">AnalyzerError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">if</span> <span class="token keyword">let</span> <span class="token class-name">Some</span><span class="token punctuation">(</span>table<span class="token punctuation">)</span> <span class="token operator">=</span> statement<span class="token punctuation">.</span>table <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> stats <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>analyzer<span class="token punctuation">.</span><span class="token function">analyze_table</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>table<span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>            <span class="token macro property">println!</span><span class="token punctuation">(</span><span class="token string">"Analyzed table &#123;&#125;: &#123;&#125; rows, &#123;&#125; pages"</span><span class="token punctuation">,</span>                      table<span class="token punctuation">,</span> stats<span class="token punctuation">.</span>row_count<span 
class="token punctuation">,</span> stats<span class="token punctuation">.</span>page_count<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token comment">// Analyze all tables</span>            <span class="token keyword">for</span> table <span class="token keyword">in</span> <span class="token keyword">self</span><span class="token punctuation">.</span>catalog<span class="token punctuation">.</span><span class="token function">get_all_tables</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">let</span> stats <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>analyzer<span class="token punctuation">.</span><span class="token function">analyze_table</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>table<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token macro property">println!</span><span class="token punctuation">(</span><span class="token string">"Analyzed table &#123;&#125;: &#123;&#125; rows"</span><span class="token punctuation">,</span> table<span class="token punctuation">,</span> stats<span class="token punctuation">.</span>row_count<span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>    <span 
class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="3-成本模型">3 Cost Model</h2><h3 id="成本公式">Cost Formula</h3><pre class="language-none"><code class="language-none">Total Cost &#x3D; CPU Cost + I&#x2F;O Cost + Memory Cost

Where:
- CPU Cost: Operations per row × number of rows
- I&#x2F;O Cost: Pages read&#x2F;written × page cost
- Memory Cost: Sort&#x2F;hash memory × memory cost factor</code></pre><hr /><h3 id="運算子成本模型">Operator Cost Model</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/optimizer/cost_model.rs</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">CostModel</span> <span class="token punctuation">&#123;</span>    <span class="token comment">// Cost constants (tunable)</span>    <span class="token keyword">pub</span> seq_page_cost<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span>      <span class="token comment">// Cost of sequential page read</span>    <span class="token keyword">pub</span> random_page_cost<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span>   <span class="token comment">// Cost of random page read</span>    <span class="token keyword">pub</span> cpu_tuple_cost<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span>     <span class="token comment">// CPU cost per tuple</span>    <span class="token keyword">pub</span> cpu_index_tuple_cost<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span>  <span class="token comment">// CPU cost per index tuple</span>    <span class="token keyword">pub</span> cpu_operator_cost<span class="token punctuation">:</span> <span class="token keyword">f64</span><span 
class="token punctuation">,</span>  <span class="token comment">// CPU cost per operator evaluation</span>    <span class="token keyword">pub</span> memory_cost_per_kb<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span> <span class="token comment">// Memory cost per KB</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">Default</span> <span class="token keyword">for</span> <span class="token class-name">CostModel</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">default</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>            seq_page_cost<span class="token punctuation">:</span> <span class="token number">1.0</span><span class="token punctuation">,</span>            random_page_cost<span class="token punctuation">:</span> <span class="token number">4.0</span><span class="token punctuation">,</span>  <span class="token comment">// Random I/O is ~4x slower</span>            cpu_tuple_cost<span class="token punctuation">:</span> <span class="token number">0.01</span><span class="token punctuation">,</span>            cpu_index_tuple_cost<span class="token punctuation">:</span> <span class="token number">0.005</span><span class="token punctuation">,</span>            cpu_operator_cost<span class="token punctuation">:</span> <span class="token number">0.0025</span><span class="token punctuation">,</span>            memory_cost_per_kb<span class="token punctuation">:</span> <span class="token number">0.001</span><span class="token punctuation">,</span>        <span class="token 
punctuation">&#125;</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">CostModel</span> <span class="token punctuation">&#123;</span>    <span class="token comment">/// Cost of sequential scan</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">seq_scan_cost</span><span class="token punctuation">(</span>        <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span>        num_pages<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>        num_rows<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>        filter<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token operator">&amp;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Cost</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> io_cost <span class="token operator">=</span> num_pages <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>seq_page_cost<span class="token punctuation">;</span>        <span class="token keyword">let</span> cpu_cost <span class="token operator">=</span> num_rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> <span class="token 
keyword">self</span><span class="token punctuation">.</span>cpu_tuple_cost<span class="token punctuation">;</span>                <span class="token keyword">let</span> filter_cost <span class="token operator">=</span> <span class="token keyword">if</span> <span class="token keyword">let</span> <span class="token class-name">Some</span><span class="token punctuation">(</span>_filter<span class="token punctuation">)</span> <span class="token operator">=</span> filter <span class="token punctuation">&#123;</span>            num_rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>cpu_operator_cost        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token number">0.0</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token class-name">Cost</span> <span class="token punctuation">&#123;</span>            startup<span class="token punctuation">:</span> <span class="token number">0.0</span><span class="token punctuation">,</span>            total<span class="token punctuation">:</span> io_cost <span class="token operator">+</span> cpu_cost <span class="token operator">+</span> filter_cost<span class="token punctuation">,</span>            rows<span class="token punctuation">:</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">estimate_rows_after_filter</span><span class="token punctuation">(</span>num_rows<span class="token punctuation">,</span> filter<span class="token punctuation">)</span><span class="token punctuation">,</span>            width<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>  <span class="token 
comment">// Would be computed from schema</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">/// Cost of index scan</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">index_scan_cost</span><span class="token punctuation">(</span>        <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span>        index<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">IndexStatistics</span><span class="token punctuation">,</span>        table_pages<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>        condition<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">IndexCondition</span><span class="token punctuation">,</span>        num_rows<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>    <span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Cost</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Estimate how many index pages we need to read</span>        <span class="token keyword">let</span> index_selectivity <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">estimate_index_selectivity</span><span class="token punctuation">(</span>condition<span class="token punctuation">,</span> index<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> index_pages_to_read <span class="token operator">=</span> <span class="token 
punctuation">(</span>index<span class="token punctuation">.</span>leaf_pages <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> index_selectivity<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">ceil</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">;</span>                <span class="token comment">// Estimate how many table pages we need to read, and at what per-page cost</span>        <span class="token keyword">let</span> <span class="token punctuation">(</span>table_pages_to_read<span class="token punctuation">,</span> table_page_cost<span class="token punctuation">)</span> <span class="token operator">=</span> <span class="token keyword">if</span> index_selectivity <span class="token operator">></span> <span class="token number">0.3</span> <span class="token punctuation">&#123;</span>            <span class="token comment">// Large matching fraction → read the whole table sequentially</span>            <span class="token punctuation">(</span>table_pages<span class="token punctuation">,</span> <span class="token keyword">self</span><span class="token punctuation">.</span>seq_page_cost<span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token comment">// Small matching fraction → one random page fetch per matching row</span>            <span class="token punctuation">(</span><span class="token punctuation">(</span>num_rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> index_selectivity<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">ceil</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">,</span> <span class="token keyword">self</span><span class="token punctuation">.</span>random_page_cost<span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> io_cost <span class="token operator">=</span> index_pages_to_read <span class="token keyword">as</span> <span class="token keyword">f64</span> 
<span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>random_page_cost            <span class="token operator">+</span> table_pages_to_read <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> table_page_cost<span class="token punctuation">;</span>                <span class="token keyword">let</span> cpu_cost <span class="token operator">=</span> <span class="token punctuation">(</span>num_rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> index_selectivity<span class="token punctuation">)</span> <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>cpu_index_tuple_cost<span class="token punctuation">;</span>        <span class="token class-name">Cost</span> <span class="token punctuation">&#123;</span>            startup<span class="token punctuation">:</span> <span class="token number">0.0</span><span class="token punctuation">,</span>            total<span class="token punctuation">:</span> io_cost <span class="token operator">+</span> cpu_cost<span class="token punctuation">,</span>            rows<span class="token punctuation">:</span> <span class="token punctuation">(</span>num_rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> index_selectivity<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">ceil</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>            width<span class="token punctuation">:</span> <span class="token number">0</span><span 
class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">/// Cost of nested loop join</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">nested_loop_join_cost</span><span class="token punctuation">(</span>        <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span>        outer_cost<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Cost</span><span class="token punctuation">,</span>        inner_cost<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Cost</span><span class="token punctuation">,</span>        join_selectivity<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span>    <span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Cost</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Outer is scanned once</span>        <span class="token keyword">let</span> outer_total <span class="token operator">=</span> outer_cost<span class="token punctuation">.</span>total<span class="token punctuation">;</span>                <span class="token comment">// Inner is scanned once per outer row</span>        <span class="token keyword">let</span> inner_total <span class="token operator">=</span> inner_cost<span class="token punctuation">.</span>total <span class="token operator">*</span> outer_cost<span class="token punctuation">.</span>rows <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">;</span>                <span class="token comment">// CPU cost for join condition 
evaluation: every (outer, inner) pair is tested</span>        <span class="token keyword">let</span> join_cpu_cost <span class="token operator">=</span> outer_cost<span class="token punctuation">.</span>rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> inner_cost<span class="token punctuation">.</span>rows <span class="token keyword">as</span> <span class="token keyword">f64</span>             <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>cpu_operator_cost<span class="token punctuation">;</span>        <span class="token class-name">Cost</span> <span class="token punctuation">&#123;</span>            startup<span class="token punctuation">:</span> outer_cost<span class="token punctuation">.</span>startup<span class="token punctuation">,</span>            total<span class="token punctuation">:</span> outer_total <span class="token operator">+</span> inner_total <span class="token operator">+</span> join_cpu_cost<span class="token punctuation">,</span>            rows<span class="token punctuation">:</span> <span class="token punctuation">(</span>outer_cost<span class="token punctuation">.</span>rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> inner_cost<span class="token punctuation">.</span>rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> join_selectivity<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">ceil</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>            width<span class="token punctuation">:</span> <span class="token number">0</span><span 
class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">/// Cost of hash join</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">hash_join_cost</span><span class="token punctuation">(</span>        <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span>        left_cost<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Cost</span><span class="token punctuation">,</span>        right_cost<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Cost</span><span class="token punctuation">,</span>        join_selectivity<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span>    <span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Cost</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Build phase: scan and hash the smaller relation</span>        <span class="token keyword">let</span> build_cost <span class="token operator">=</span> left_cost<span class="token punctuation">.</span>total<span class="token punctuation">;</span>        <span class="token keyword">let</span> build_memory <span class="token operator">=</span> left_cost<span class="token punctuation">.</span>rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> <span class="token number">32.0</span><span class="token punctuation">;</span>  <span class="token comment">// Estimate 32 bytes per row</span>                <span class="token comment">// Probe phase: scan the larger relation and probe hash 
table</span>        <span class="token keyword">let</span> probe_cost <span class="token operator">=</span> right_cost<span class="token punctuation">.</span>total<span class="token punctuation">;</span>        <span class="token keyword">let</span> probe_cpu <span class="token operator">=</span> right_cost<span class="token punctuation">.</span>rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>cpu_operator_cost<span class="token punctuation">;</span>                <span class="token comment">// Output cost</span>        <span class="token keyword">let</span> output_rows <span class="token operator">=</span> left_cost<span class="token punctuation">.</span>rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> right_cost<span class="token punctuation">.</span>rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> join_selectivity<span class="token punctuation">;</span>        <span class="token keyword">let</span> output_cpu <span class="token operator">=</span> output_rows <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>cpu_tuple_cost<span class="token punctuation">;</span>        <span class="token class-name">Cost</span> <span class="token punctuation">&#123;</span>            startup<span class="token punctuation">:</span> build_cost <span class="token operator">+</span> build_memory <span class="token operator">/</span> <span class="token number">1024.0</span> <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>memory_cost_per_kb<span class="token punctuation">,</span>  <span class="token comment">// build_memory is in bytes</span>            total<span class="token punctuation">:</span> build_cost <span class="token operator">+</span> probe_cost <span class="token operator">+</span> probe_cpu 
<span class="token operator">+</span> output_cpu<span class="token punctuation">,</span>            rows<span class="token punctuation">:</span> output_rows<span class="token punctuation">.</span><span class="token function">ceil</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>            width<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">/// Cost of sort</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">sort_cost</span><span class="token punctuation">(</span>        <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span>        input_cost<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Cost</span><span class="token punctuation">,</span>        num_rows<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>        sort_keys<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">SortKey</span><span class="token punctuation">]</span><span class="token punctuation">,</span>    <span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Cost</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> input_total <span class="token operator">=</span> input_cost<span class="token punctuation">.</span>total<span class="token punctuation">;</span>               
 <span class="token comment">// Check if sort fits in memory</span>        <span class="token keyword">let</span> sort_memory <span class="token operator">=</span> num_rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> <span class="token number">64.0</span><span class="token punctuation">;</span>  <span class="token comment">// Estimate 64 bytes per row</span>        <span class="token keyword">let</span> work_mem <span class="token operator">=</span> <span class="token number">4.0</span> <span class="token operator">*</span> <span class="token number">1024.0</span> <span class="token operator">*</span> <span class="token number">1024.0</span><span class="token punctuation">;</span>  <span class="token comment">// 4MB work memory, in bytes</span>        <span class="token comment">// log2 is a float method; clamp to 1 so zero rows cost 0, not NaN</span>        <span class="token keyword">let</span> log_rows <span class="token operator">=</span> <span class="token punctuation">(</span>num_rows<span class="token punctuation">.</span><span class="token function">max</span><span class="token punctuation">(</span><span class="token number">1</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">log2</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">let</span> sort_cpu <span class="token operator">=</span> <span class="token keyword">if</span> sort_memory <span class="token operator">&lt;=</span> work_mem <span class="token punctuation">&#123;</span>            <span class="token comment">// In-memory sort: O(n log n)</span>            num_rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> log_rows <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>cpu_operator_cost        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token comment">// External sort: 2 passes</span>            <span class="token punctuation">(</span>num_rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> log_rows <span class="token operator">*</span> <span class="token number">2.0</span><span class="token punctuation">)</span> <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>cpu_operator_cost                <span class="token operator">+</span> sort_memory <span class="token operator">/</span> <span class="token number">1024.0</span> <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>memory_cost_per_kb        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>
        <span class="token keyword">let</span> cpu_per_key <span class="token operator">=</span> sort_keys<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>cpu_operator_cost<span class="token punctuation">;</span>        <span class="token class-name">Cost</span> <span class="token punctuation">&#123;</span>            startup<span class="token punctuation">:</span> input_total <span class="token operator">+</span> sort_cpu <span class="token operator">+</span> num_rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> cpu_per_key<span class="token punctuation">,</span>            total<span class="token punctuation">:</span> input_total <span class="token operator">+</span> sort_cpu <span class="token operator">+</span> num_rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> cpu_per_key<span class="token punctuation">,</span>            rows<span class="token 
punctuation">:</span> num_rows<span class="token punctuation">,</span>            width<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">/// Cost of hash aggregate</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">hash_aggregate_cost</span><span class="token punctuation">(</span>        <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span>        input_cost<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Cost</span><span class="token punctuation">,</span>        num_groups<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>        num_aggregates<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>    <span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Cost</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> input_total <span class="token operator">=</span> input_cost<span class="token punctuation">.</span>total<span class="token punctuation">;</span>                <span class="token comment">// Build hash table of groups</span>        <span class="token keyword">let</span> build_memory <span class="token operator">=</span> num_groups <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> <span class="token number">64.0</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> build_cpu <span class="token operator">=</span> 
input_cost<span class="token punctuation">.</span>rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>cpu_operator_cost<span class="token punctuation">;</span>                <span class="token comment">// Aggregate computation</span>        <span class="token keyword">let</span> aggregate_cpu <span class="token operator">=</span> num_groups <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> num_aggregates <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>cpu_operator_cost<span class="token punctuation">;</span>        <span class="token class-name">Cost</span> <span class="token punctuation">&#123;</span>            startup<span class="token punctuation">:</span> input_total <span class="token operator">+</span> build_memory <span class="token operator">/</span> <span class="token number">1024.0</span> <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>memory_cost_per_kb<span class="token punctuation">,</span>  <span class="token comment">// build_memory is in bytes</span>            total<span class="token punctuation">:</span> input_total <span class="token operator">+</span> build_cpu <span class="token operator">+</span> aggregate_cpu<span class="token punctuation">,</span>            rows<span class="token punctuation">:</span> num_groups<span class="token punctuation">,</span>            width<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">estimate_rows_after_filter</span><span class="token punctuation">(</span><span class="token 
operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> num_rows<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span> filter<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token operator">&amp;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">u64</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">match</span> filter <span class="token punctuation">&#123;</span>            <span class="token class-name">None</span> <span class="token operator">=></span> num_rows<span class="token punctuation">,</span>            <span class="token class-name">Some</span><span class="token punctuation">(</span>expr<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">let</span> selectivity <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">estimate_selectivity</span><span class="token punctuation">(</span>expr<span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token punctuation">(</span>num_rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> selectivity<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">ceil</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span>            <span class="token 
punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">estimate_selectivity</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> expr<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Expression</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">f64</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Simplified selectivity estimation</span>        <span class="token comment">// In practice, this would use statistics and histograms</span>        <span class="token keyword">match</span> expr <span class="token punctuation">&#123;</span>            <span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span> op<span class="token punctuation">,</span> <span class="token punctuation">..</span> <span class="token punctuation">&#125;</span> <span class="token operator">=></span> <span class="token keyword">match</span> op <span class="token punctuation">&#123;</span>                <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Eq</span> <span class="token operator">=></span> <span class="token number">0.01</span><span class="token punctuation">,</span>  <span class="token comment">// Assume 1% match</span>                <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Lt</span> <span class="token operator">|</span> <span class="token 
class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Gt</span> <span class="token operator">=></span> <span class="token number">0.33</span><span class="token punctuation">,</span>  <span class="token comment">// Assume 1/3 match</span>                <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Lte</span> <span class="token operator">|</span> <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Gte</span> <span class="token operator">=></span> <span class="token number">0.5</span><span class="token punctuation">,</span>  <span class="token comment">// Assume 1/2 match</span>                <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">And</span> <span class="token operator">=></span> <span class="token number">0.1</span><span class="token punctuation">,</span>                <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Or</span> <span class="token operator">=></span> <span class="token number">0.5</span><span class="token punctuation">,</span>                _ <span class="token operator">=></span> <span class="token number">0.5</span><span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>            _ <span class="token operator">=></span> <span class="token number">0.5</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">estimate_index_selectivity</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span 
class="token keyword">self</span><span class="token punctuation">,</span> condition<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">IndexCondition</span><span class="token punctuation">,</span> index<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">IndexStatistics</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">f64</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">match</span> condition <span class="token punctuation">&#123;</span>            <span class="token class-name">IndexCondition</span><span class="token punctuation">::</span><span class="token class-name">Eq</span><span class="token punctuation">(</span>_<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token comment">// Equality: 1 / distinct_keys</span>                <span class="token number">1.0</span> <span class="token operator">/</span> index<span class="token punctuation">.</span>distinct_keys<span class="token punctuation">.</span><span class="token function">max</span><span class="token punctuation">(</span><span class="token number">1</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">IndexCondition</span><span class="token punctuation">::</span><span class="token class-name">Range</span> <span class="token punctuation">&#123;</span> <span class="token punctuation">..</span> <span class="token punctuation">&#125;</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token comment">// Range: estimate 10% of index</span>            
    <span class="token number">0.1</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">IndexCondition</span><span class="token punctuation">::</span><span class="token class-name">InList</span><span class="token punctuation">(</span>values<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token comment">// IN list: |values| / distinct_keys</span>                values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">/</span> index<span class="token punctuation">.</span>distinct_keys<span class="token punctuation">.</span><span class="token function">max</span><span class="token punctuation">(</span><span class="token number">1</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Cost</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> startup<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span>    <span class="token comment">// Cost to return first row</span>    <span class="token keyword">pub</span> total<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span>      
<span class="token comment">// Cost to return all rows</span>    <span class="token keyword">pub</span> rows<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>       <span class="token comment">// Estimated output rows</span>    <span class="token keyword">pub</span> width<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>    <span class="token comment">// Estimated row width in bytes</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="4-帶動態規劃的連線順序">4 Join Ordering with Dynamic Programming</h2><h3 id="連線順序問題">The Join Ordering Problem</h3><p><strong>For n tables, the number of possible join orders grows factorially. Taking (n-1)! as a rough count:</strong></p><pre class="language-none"><code class="language-none">3 tables: 2! &#x3D; 2 orders
5 tables: 4! &#x3D; 24 orders
10 tables: 9! &#x3D; 362,880 orders</code></pre><p><strong>Brute-force enumeration does not scale.</strong> We need dynamic programming.</p><hr /><h3 id="DP-連線順序演算法">DP Join Ordering Algorithm</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/optimizer/join_order.rs</span>
<span class="token keyword">use</span> <span class="token namespace">std<span class="token punctuation">::</span>collections<span class="token punctuation">::</span></span><span class="token class-name">HashMap</span><span class="token punctuation">;</span>
<span class="token keyword">use</span> <span class="token namespace">std<span class="token punctuation">::</span>sync<span class="token punctuation">::</span></span><span class="token class-name">Arc</span><span class="token punctuation">;</span>
<span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">JoinOrderOptimizer</span> <span class="token punctuation">&#123;</span>    cost_model<span class="token punctuation">:</span> <span class="token class-name">CostModel</span><span class="token punctuation">,</span>    statistics<span class="token punctuation">:</span> <span class="token class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">StatisticsCatalog</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token 
punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">JoinOrderOptimizer</span> <span class="token punctuation">&#123;</span>    <span class="token comment">/// Find the best join order using dynamic programming</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">optimize</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> tables<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">String</span><span class="token punctuation">]</span><span class="token punctuation">,</span> conditions<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">JoinCondition</span><span class="token punctuation">]</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">PhysicalPlan</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> n <span class="token operator">=</span> tables<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// dp[i] = best plan for subset represented by bitmask i</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> dp<span class="token punctuation">:</span> <span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token keyword">u64</span><span class="token punctuation">,</span> <span class="token class-name">PhysicalPlan</span><span class="token 
operator">></span> <span class="token operator">=</span> <span class="token class-name">HashMap</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> costs<span class="token punctuation">:</span> <span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token keyword">u64</span><span class="token punctuation">,</span> <span class="token keyword">f64</span><span class="token operator">></span> <span class="token operator">=</span> <span class="token class-name">HashMap</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// Base case: single table scans</span>        <span class="token keyword">for</span> <span class="token punctuation">(</span>i<span class="token punctuation">,</span> table<span class="token punctuation">)</span> <span class="token keyword">in</span> tables<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">enumerate</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> mask <span class="token operator">=</span> <span class="token number">1u64</span> <span class="token operator">&lt;&lt;</span> i<span class="token punctuation">;</span>            <span class="token keyword">let</span> plan <span class="token operator">=</span> <span class="token keyword">self</span><span 
class="token punctuation">.</span><span class="token function">create_scan_plan</span><span class="token punctuation">(</span>table<span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token keyword">let</span> cost <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">estimate_cost</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>plan<span class="token punctuation">)</span><span class="token punctuation">;</span>                        dp<span class="token punctuation">.</span><span class="token function">insert</span><span class="token punctuation">(</span>mask<span class="token punctuation">,</span> plan<span class="token punctuation">)</span><span class="token punctuation">;</span>            costs<span class="token punctuation">.</span><span class="token function">insert</span><span class="token punctuation">(</span>mask<span class="token punctuation">,</span> cost<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token comment">// Build up larger subsets</span>        <span class="token keyword">for</span> size <span class="token keyword">in</span> <span class="token number">2</span><span class="token punctuation">..=</span>n <span class="token punctuation">&#123;</span>            <span class="token keyword">for</span> subset <span class="token keyword">in</span> <span class="token keyword">Self</span><span class="token punctuation">::</span><span class="token function">subsets_of_size</span><span class="token punctuation">(</span>n<span class="token punctuation">,</span> size<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">let</span> subset_mask <span class="token operator">=</span> <span 
class="token keyword">Self</span><span class="token punctuation">::</span><span class="token function">subset_to_mask</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>subset<span class="token punctuation">)</span><span class="token punctuation">;</span>                                <span class="token comment">// Try all ways to split this subset</span>                <span class="token keyword">let</span> <span class="token keyword">mut</span> best_plan<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">PhysicalPlan</span><span class="token operator">></span> <span class="token operator">=</span> <span class="token class-name">None</span><span class="token punctuation">;</span>                <span class="token keyword">let</span> <span class="token keyword">mut</span> best_cost <span class="token operator">=</span> <span class="token keyword">f64</span><span class="token punctuation">::</span><span class="token constant">INFINITY</span><span class="token punctuation">;</span>                                <span class="token keyword">for</span> split <span class="token keyword">in</span> <span class="token keyword">Self</span><span class="token punctuation">::</span><span class="token function">split_subset</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>subset<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                    <span class="token keyword">let</span> left_mask <span class="token operator">=</span> <span class="token keyword">Self</span><span class="token punctuation">::</span><span class="token function">subset_to_mask</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>split<span class="token number">.0</span><span class="token punctuation">)</span><span class="token punctuation">;</span>          
          <span class="token keyword">let</span> right_mask <span class="token operator">=</span> <span class="token keyword">Self</span><span class="token punctuation">::</span><span class="token function">subset_to_mask</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>split<span class="token number">.1</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                                        <span class="token keyword">if</span> <span class="token keyword">let</span> <span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span>left_plan<span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Some</span><span class="token punctuation">(</span>right_plan<span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token operator">=</span>                         <span class="token punctuation">(</span>dp<span class="token punctuation">.</span><span class="token function">get</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>left_mask<span class="token punctuation">)</span><span class="token punctuation">,</span> dp<span class="token punctuation">.</span><span class="token function">get</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>right_mask<span class="token punctuation">)</span><span class="token punctuation">)</span>                     <span class="token punctuation">&#123;</span>                        <span class="token comment">// Try different join algorithms</span>                        <span class="token keyword">for</span> join_plan <span class="token keyword">in</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">create_join_plans</span><span class="token punctuation">(</span>                      
      left_plan<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                            right_plan<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                            conditions<span class="token punctuation">,</span>                        <span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                            <span class="token keyword">let</span> cost <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">estimate_cost</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>join_plan<span class="token punctuation">)</span><span class="token punctuation">;</span>                            <span class="token keyword">if</span> cost <span class="token operator">&lt;</span> best_cost <span class="token punctuation">&#123;</span>                                best_cost <span class="token operator">=</span> cost<span class="token punctuation">;</span>                                best_plan <span class="token operator">=</span> <span class="token class-name">Some</span><span class="token punctuation">(</span>join_plan<span class="token punctuation">)</span><span class="token punctuation">;</span>                            <span class="token punctuation">&#125;</span>                        <span class="token punctuation">&#125;</span>                    <span class="token punctuation">&#125;</span>                <span class="token punctuation">&#125;</span>                                <span class="token keyword">if</span> <span class="token keyword">let</span> <span class="token 
class-name">Some</span><span class="token punctuation">(</span>plan<span class="token punctuation">)</span> <span class="token operator">=</span> best_plan <span class="token punctuation">&#123;</span>                    dp<span class="token punctuation">.</span><span class="token function">insert</span><span class="token punctuation">(</span>subset_mask<span class="token punctuation">,</span> plan<span class="token punctuation">)</span><span class="token punctuation">;</span>                    costs<span class="token punctuation">.</span><span class="token function">insert</span><span class="token punctuation">(</span>subset_mask<span class="token punctuation">,</span> best_cost<span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token punctuation">&#125;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>                <span class="token comment">// Return the best plan for all tables</span>        <span class="token keyword">let</span> all_mask <span class="token operator">=</span> <span class="token punctuation">(</span><span class="token number">1u64</span> <span class="token operator">&lt;&lt;</span> n<span class="token punctuation">)</span> <span class="token operator">-</span> <span class="token number">1</span><span class="token punctuation">;</span>        dp<span class="token punctuation">.</span><span class="token function">remove</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>all_mask<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">unwrap</span><span class="token punctuation">(</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">create_scan_plan</span><span class="token punctuation">(</span><span 
class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> table<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">PhysicalPlan</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Check if we have useful indexes</span>        <span class="token keyword">let</span> stats <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>statistics<span class="token punctuation">.</span><span class="token function">get_table</span><span class="token punctuation">(</span>table<span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">if</span> <span class="token keyword">let</span> <span class="token class-name">Some</span><span class="token punctuation">(</span>index<span class="token punctuation">)</span> <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">find_useful_index</span><span class="token punctuation">(</span>table<span class="token punctuation">,</span> <span class="token operator">&amp;</span>stats<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">PhysicalPlan</span><span class="token punctuation">::</span><span class="token class-name">IndexScan</span> <span class="token punctuation">&#123;</span>                table<span class="token punctuation">:</span> table<span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                alias<span 
class="token punctuation">:</span> <span class="token class-name">None</span><span class="token punctuation">,</span>                index<span class="token punctuation">:</span> index<span class="token punctuation">.</span>name<span class="token punctuation">,</span>                columns<span class="token punctuation">:</span> <span class="token macro property">vec!</span><span class="token punctuation">[</span><span class="token punctuation">]</span><span class="token punctuation">,</span>  <span class="token comment">// All columns</span>                condition<span class="token punctuation">:</span> <span class="token class-name">IndexCondition</span><span class="token punctuation">::</span><span class="token class-name">Range</span> <span class="token punctuation">&#123;</span> low<span class="token punctuation">:</span> <span class="token class-name">None</span><span class="token punctuation">,</span> high<span class="token punctuation">:</span> <span class="token class-name">None</span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">PhysicalPlan</span><span class="token punctuation">::</span><span class="token class-name">SeqScan</span> <span class="token punctuation">&#123;</span>                table<span class="token punctuation">:</span> table<span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                alias<span class="token punctuation">:</span> <span class="token class-name">None</span><span class="token punctuation">,</span>                columns<span class="token punctuation">:</span> <span class="token macro 
property">vec!</span><span class="token punctuation">[</span><span class="token punctuation">]</span><span class="token punctuation">,</span>                filter<span class="token punctuation">:</span> <span class="token class-name">None</span><span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">create_join_plans</span><span class="token punctuation">(</span>        <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span>        left<span class="token punctuation">:</span> <span class="token class-name">PhysicalPlan</span><span class="token punctuation">,</span>        right<span class="token punctuation">:</span> <span class="token class-name">PhysicalPlan</span><span class="token punctuation">,</span>        conditions<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">JoinCondition</span><span class="token punctuation">]</span><span class="token punctuation">,</span>    <span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">PhysicalPlan</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> plans <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token 
comment">// Get join condition</span>        <span class="token keyword">let</span> condition <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">find_join_condition</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>left<span class="token punctuation">,</span> <span class="token operator">&amp;</span>right<span class="token punctuation">,</span> conditions<span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// Nested loop join (always possible)</span>        plans<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token class-name">PhysicalPlan</span><span class="token punctuation">::</span><span class="token class-name">NestedLoopJoin</span> <span class="token punctuation">&#123;</span>            left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>left<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            right<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>right<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            condition<span class="token punctuation">:</span> condition<span class="token 
punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            join_type<span class="token punctuation">:</span> <span class="token class-name">JoinType</span><span class="token punctuation">::</span><span class="token class-name">Inner</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// Hash join (if equi-join); clone `right` and `condition` because the merge-join check below still uses them</span>        <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">is_equi_join</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>condition<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            plans<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token class-name">PhysicalPlan</span><span class="token punctuation">::</span><span class="token class-name">HashJoin</span> <span class="token punctuation">&#123;</span>                left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>left<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                right<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>right<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span 
class="token punctuation">)</span><span class="token punctuation">,</span>                condition<span class="token punctuation">:</span> condition<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">unwrap</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                join_type<span class="token punctuation">:</span> <span class="token class-name">JoinType</span><span class="token punctuation">::</span><span class="token class-name">Inner</span><span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token comment">// Merge join (if inputs can be sorted on join keys)</span>        <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">can_merge_join</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>left<span class="token punctuation">,</span> <span class="token operator">&amp;</span>right<span class="token punctuation">,</span> <span class="token operator">&amp;</span>condition<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            plans<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token class-name">PhysicalPlan</span><span class="token punctuation">::</span><span class="token class-name">MergeJoin</span> <span class="token punctuation">&#123;</span>                left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>left<span class="token punctuation">)</span><span 
class="token punctuation">,</span>                right<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>right<span class="token punctuation">)</span><span class="token punctuation">,</span>                condition<span class="token punctuation">:</span> condition<span class="token punctuation">.</span><span class="token function">unwrap</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                join_type<span class="token punctuation">:</span> <span class="token class-name">JoinType</span><span class="token punctuation">::</span><span class="token class-name">Inner</span><span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                plans    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">subsets_of_size</span><span class="token punctuation">(</span>n<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span> size<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">usize</span><span class="token operator">>></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Generate all subsets of &#123;0, 1, ..., n-1&#125; with given size</span>        <span class="token 
keyword">let</span> <span class="token keyword">mut</span> result <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">Self</span><span class="token punctuation">::</span><span class="token function">generate_subsets</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">,</span> n<span class="token punctuation">,</span> size<span class="token punctuation">,</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> result<span class="token punctuation">)</span><span class="token punctuation">;</span>        result    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">generate_subsets</span><span class="token punctuation">(</span>        start<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>        n<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>        size<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>        current<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">Vec</span><span class="token 
operator">&lt;</span><span class="token keyword">usize</span><span class="token operator">></span><span class="token punctuation">,</span>        result<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">usize</span><span class="token operator">>></span><span class="token punctuation">,</span>    <span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">if</span> current<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">==</span> size <span class="token punctuation">&#123;</span>            result<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span>current<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token keyword">return</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token keyword">for</span> i <span class="token keyword">in</span> start<span class="token punctuation">..</span>n <span class="token punctuation">&#123;</span>            current<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span>i<span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token keyword">Self</span><span class="token punctuation">::</span><span class="token 
function">generate_subsets</span><span class="token punctuation">(</span>i <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">,</span> n<span class="token punctuation">,</span> size<span class="token punctuation">,</span> current<span class="token punctuation">,</span> result<span class="token punctuation">)</span><span class="token punctuation">;</span>            current<span class="token punctuation">.</span><span class="token function">pop</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">split_subset</span><span class="token punctuation">(</span>subset<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token keyword">usize</span><span class="token punctuation">]</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">usize</span><span class="token operator">></span><span class="token punctuation">,</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">usize</span><span class="token operator">></span><span class="token punctuation">)</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Generate all non-empty proper splits of the subset</span>        <span class="token keyword">let</span> n <span class="token operator">=</span> subset<span class="token 
punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> splits <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// Use bitmask to generate all splits</span>        <span class="token keyword">for</span> mask <span class="token keyword">in</span> <span class="token number">1</span><span class="token punctuation">..</span><span class="token punctuation">(</span><span class="token number">1</span> <span class="token operator">&lt;&lt;</span> <span class="token punctuation">(</span>n <span class="token operator">-</span> <span class="token number">1</span><span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> <span class="token keyword">mut</span> left <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token keyword">let</span> <span class="token keyword">mut</span> right <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                        <span class="token keyword">for</span> i <span 
class="token keyword">in</span> <span class="token number">0</span><span class="token punctuation">..</span>n <span class="token punctuation">&#123;</span>                <span class="token keyword">if</span> i <span class="token operator">==</span> n <span class="token operator">-</span> <span class="token number">1</span> <span class="token punctuation">&#123;</span>                    right<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span>subset<span class="token punctuation">[</span>i<span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token keyword">if</span> mask <span class="token operator">&amp;</span> <span class="token punctuation">(</span><span class="token number">1</span> <span class="token operator">&lt;&lt;</span> i<span class="token punctuation">)</span> <span class="token operator">!=</span> <span class="token number">0</span> <span class="token punctuation">&#123;</span>                    left<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span>subset<span class="token punctuation">[</span>i<span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>                    right<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span>subset<span class="token punctuation">[</span>i<span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token punctuation">&#125;</span>            
<span class="token punctuation">&#125;</span>                        <span class="token keyword">if</span> <span class="token operator">!</span>left<span class="token punctuation">.</span><span class="token function">is_empty</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">&amp;&amp;</span> <span class="token operator">!</span>right<span class="token punctuation">.</span><span class="token function">is_empty</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                splits<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token punctuation">(</span>left<span class="token punctuation">,</span> right<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>                splits    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">subset_to_mask</span><span class="token punctuation">(</span>subset<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token keyword">usize</span><span class="token punctuation">]</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">u64</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> mask <span class="token operator">=</span> <span class="token number">0u64</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> <span class="token operator">&amp;</span>i <span class="token 
keyword">in</span> subset <span class="token punctuation">&#123;</span>            mask <span class="token operator">|=</span> <span class="token number">1u64</span> <span class="token operator">&lt;&lt;</span> i<span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        mask    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="連線順序範例">Join order example</h3><pre class="language-sql" data-language="sql"><code class="language-sql"><span class="token keyword">SELECT</span> <span class="token operator">*</span>
<span class="token keyword">FROM</span> users u
<span class="token keyword">JOIN</span> orders o <span class="token keyword">ON</span> u<span class="token punctuation">.</span>id <span class="token operator">=</span> o<span class="token punctuation">.</span>user_id
<span class="token keyword">JOIN</span> products p <span class="token keyword">ON</span> o<span class="token punctuation">.</span>product_id <span class="token operator">=</span> p<span class="token punctuation">.</span>id
<span class="token keyword">WHERE</span> u<span class="token punctuation">.</span>balance <span class="token operator">></span> <span class="token number">100</span></code></pre><p><strong>Dynamic programming progress:</strong></p><pre class="language-none"><code class="language-none">Iteration 1 (single tables):
  &#123;users&#125;: SeqScan cost&#x3D;100, rows&#x3D;10000
  &#123;orders&#125;: IndexScan cost&#x3D;50, rows&#x3D;50000
  &#123;products&#125;: SeqScan cost&#x3D;10, rows&#x3D;1000

Iteration 2 (two tables):
  &#123;users, orders&#125;:
    - users ⋈ orders (hash): cost&#x3D;600, rows&#x3D;5000
    - orders ⋈ users (nested): cost&#x3D;800, rows&#x3D;5000
    → Best: HashJoin cost&#x3D;600

  &#123;orders, products&#125;:
    - orders ⋈ products (hash): cost&#x3D;200, rows&#x3D;10000
    → Best: HashJoin cost&#x3D;200

Iteration 3 (three tables):
  &#123;users, orders, products&#125;:
    - &#123;users, orders&#125; ⋈ products: cost&#x3D;800, rows&#x3D;1000
    - &#123;orders, products&#125; ⋈ users: cost&#x3D;700, rows&#x3D;1000 ← Best!
    - users ⋈ &#123;orders, products&#125;: cost&#x3D;900, rows&#x3D;1000
    → Best: (orders ⋈ products) ⋈ users</code></pre><p><strong>Final plan:</strong></p><pre class="language-none"><code class="language-none">HashJoin (u.id &#x3D; o.user_id)
  ├─ SeqScan (users) [balance &gt; 100]
  └─ HashJoin (o.product_id &#x3D; p.id)
      ├─ IndexScan (orders)
      └─ SeqScan (products)</code></pre><hr /><h2 id="5-索引選擇">5 Index selection</h2><h3 id="選擇正確的索引">Choosing the right index</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/optimizer/index_selector.rs</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">IndexSelector</span> <span class="token punctuation">&#123;</span>    statistics<span class="token punctuation">:</span> <span class="token class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">StatisticsCatalog</span><span class="token operator">></span><span class="token punctuation">,</span>    cost_model<span class="token punctuation">:</span> <span class="token class-name">CostModel</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">IndexSelector</span> <span class="token punctuation">&#123;</span>    <span class="token comment">/// Find the best index for a query</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">select_index</span><span class="token punctuation">(</span>        <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span>        table<span class="token punctuation">:</span> <span 
class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">,</span>        predicates<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Expression</span><span class="token punctuation">]</span><span class="token punctuation">,</span>    <span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">IndexSelection</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> indexes <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>statistics<span class="token punctuation">.</span><span class="token function">get_table_indexes</span><span class="token punctuation">(</span>table<span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">let</span> <span class="token keyword">mut</span> best_index<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">IndexSelection</span><span class="token operator">></span> <span class="token operator">=</span> <span class="token class-name">None</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> best_score <span class="token operator">=</span> <span class="token number">0.0</span><span class="token punctuation">;</span>                <span class="token keyword">for</span> index <span class="token keyword">in</span> indexes <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> score <span class="token operator">=</span> <span 
class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">score_index</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>index<span class="token punctuation">,</span> predicates<span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token keyword">if</span> score <span class="token operator">></span> best_score <span class="token punctuation">&#123;</span>                best_score <span class="token operator">=</span> score<span class="token punctuation">;</span>                best_index <span class="token operator">=</span> <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">IndexSelection</span> <span class="token punctuation">&#123;</span>                    index<span class="token punctuation">:</span> index<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                    score<span class="token punctuation">,</span>                    usable_predicates<span class="token punctuation">:</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">find_usable_predicates</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>index<span class="token punctuation">,</span> predicates<span class="token punctuation">)</span><span class="token punctuation">,</span>                <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>                best_index    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token 
function-definition function">score_index</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> index<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">IndexStatistics</span><span class="token punctuation">,</span> predicates<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Expression</span><span class="token punctuation">]</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">f64</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> score <span class="token operator">=</span> <span class="token number">0.0</span><span class="token punctuation">;</span>                <span class="token comment">// Check if index columns are used in predicates</span>        <span class="token keyword">for</span> <span class="token punctuation">(</span>i<span class="token punctuation">,</span> col<span class="token punctuation">)</span> <span class="token keyword">in</span> index<span class="token punctuation">.</span>columns<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">enumerate</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">for</span> predicate <span class="token keyword">in</span> predicates <span class="token punctuation">&#123;</span>                <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token 
punctuation">.</span><span class="token function">predicate_uses_column</span><span class="token punctuation">(</span>predicate<span class="token punctuation">,</span> col<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                    <span class="token comment">// Earlier columns in index are more valuable</span>                    <span class="token keyword">let</span> position_weight <span class="token operator">=</span> <span class="token number">1.0</span> <span class="token operator">/</span> <span class="token punctuation">(</span>i <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">;</span>                                        <span class="token comment">// Equality is more valuable than range; weight only this predicate's contribution, not the running total</span>                    <span class="token keyword">let</span> kind_weight <span class="token operator">=</span> <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">is_equality_predicate</span><span class="token punctuation">(</span>predicate<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span> <span class="token number">2.0</span> <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span> <span class="token number">1.0</span> <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>                    score <span class="token operator">+=</span> position_weight <span class="token operator">*</span> <span class="token number">100.0</span> <span class="token operator">*</span> kind_weight<span class="token punctuation">;</span>                <span class="token punctuation">&#125;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>                <span class="token comment">// Bonus for covering indexes (all columns in index)</span>        <span class="token keyword">if</span> <span 
class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">is_covering_index</span><span class="token punctuation">(</span>index<span class="token punctuation">,</span> predicates<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            score <span class="token operator">*=</span> <span class="token number">1.5</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token comment">// Bonus for unique indexes</span>        <span class="token keyword">if</span> index<span class="token punctuation">.</span>is_unique <span class="token punctuation">&#123;</span>            score <span class="token operator">*=</span> <span class="token number">1.3</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                score    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">find_usable_predicates</span><span class="token punctuation">(</span>        <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span>        index<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">IndexStatistics</span><span class="token punctuation">,</span>        predicates<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Expression</span><span class="token punctuation">]</span><span class="token punctuation">,</span>    <span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span> <span class="token 
punctuation">&#123;</span>        predicates            <span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span>            <span class="token punctuation">.</span><span class="token function">filter</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>p<span class="token closure-punctuation punctuation">|</span></span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">predicate_uses_index_column</span><span class="token punctuation">(</span>p<span class="token punctuation">,</span> index<span class="token punctuation">)</span><span class="token punctuation">)</span>            <span class="token punctuation">.</span><span class="token function">cloned</span><span class="token punctuation">(</span><span class="token punctuation">)</span>            <span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">(</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">predicate_uses_column</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> predicate<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Expression</span><span class="token punctuation">,</span> column<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">bool</span> <span class="token punctuation">&#123;</span>        <span 
class="token keyword">match</span> predicate <span class="token punctuation">&#123;</span>            <span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span> left<span class="token punctuation">,</span> right<span class="token punctuation">,</span> <span class="token punctuation">..</span> <span class="token punctuation">&#125;</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expr_references_column</span><span class="token punctuation">(</span>left<span class="token punctuation">,</span> column<span class="token punctuation">)</span> <span class="token operator">||</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expr_references_column</span><span class="token punctuation">(</span>right<span class="token punctuation">,</span> column<span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">Identifier</span><span class="token punctuation">(</span>ident<span class="token punctuation">)</span> <span class="token operator">=></span> ident<span class="token punctuation">.</span>value <span class="token operator">==</span> column<span class="token punctuation">,</span>            <span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">CompoundIdentifier</span><span class="token punctuation">(</span>idents<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                idents<span class="token 
punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">any</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>i<span class="token closure-punctuation punctuation">|</span></span> i<span class="token punctuation">.</span>value <span class="token operator">==</span> column<span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            _ <span class="token operator">=></span> <span class="token boolean">false</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">is_equality_predicate</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> predicate<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Expression</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">bool</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">match</span> predicate <span class="token punctuation">&#123;</span>            <span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span> op<span class="token punctuation">,</span> <span class="token punctuation">..</span> <span class="token punctuation">&#125;</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token macro 
property">matches!</span><span class="token punctuation">(</span>op<span class="token punctuation">,</span> <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Eq</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            _ <span class="token operator">=></span> <span class="token boolean">false</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">is_covering_index</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> index<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">IndexStatistics</span><span class="token punctuation">,</span> predicates<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Expression</span><span class="token punctuation">]</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">bool</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Check if all columns referenced in predicates are in the index</span>        <span class="token keyword">let</span> referenced_columns <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">extract_referenced_columns</span><span class="token punctuation">(</span>predicates<span class="token punctuation">)</span><span class="token punctuation">;</span>        referenced_columns<span class="token punctuation">.</span><span class="token 
function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">all</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>col<span class="token closure-punctuation punctuation">|</span></span> index<span class="token punctuation">.</span>columns<span class="token punctuation">.</span><span class="token function">contains</span><span class="token punctuation">(</span>col<span class="token punctuation">)</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">IndexSelection</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> index<span class="token punctuation">:</span> <span class="token class-name">IndexStatistics</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> score<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> usable_predicates<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="6-完整的最佳化管線">6 The complete optimization pipeline</h2><h3 id="從-AST-到物理計劃">From AST to physical plan</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/optimizer/optimizer.rs</span><span class="token keyword">pub</span> <span class="token 
keyword">struct</span> <span class="token type-definition class-name">QueryOptimizer</span> <span class="token punctuation">&#123;</span>    catalog<span class="token punctuation">:</span> <span class="token class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">Catalog</span><span class="token operator">></span><span class="token punctuation">,</span>    statistics<span class="token punctuation">:</span> <span class="token class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">StatisticsCatalog</span><span class="token operator">></span><span class="token punctuation">,</span>    cost_model<span class="token punctuation">:</span> <span class="token class-name">CostModel</span><span class="token punctuation">,</span>    join_optimizer<span class="token punctuation">:</span> <span class="token class-name">JoinOrderOptimizer</span><span class="token punctuation">,</span>    index_selector<span class="token punctuation">:</span> <span class="token class-name">IndexSelector</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">QueryOptimizer</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">optimize</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> ast<span class="token punctuation">:</span> <span class="token class-name">SelectStatement</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">PhysicalPlan</span><span class="token punctuation">,</span> <span class="token 
class-name">OptimizerError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Phase 1: Create logical plan</span>        <span class="token keyword">let</span> logical_plan <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">create_logical_plan</span><span class="token punctuation">(</span>ast<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// Phase 2: Apply logical optimizations</span>        <span class="token keyword">let</span> optimized_logical <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">apply_logical_optimizations</span><span class="token punctuation">(</span>logical_plan<span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// Phase 3: Generate physical plans</span>        <span class="token keyword">let</span> physical_plans <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">generate_physical_plans</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>optimized_logical<span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// Phase 4: Choose best plan based on cost</span>        <span class="token keyword">let</span> best_plan <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">choose_best_plan</span><span class="token punctuation">(</span>physical_plans<span class="token punctuation">)</span><span class="token 
operator">?</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span>best_plan<span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">create_logical_plan</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> ast<span class="token punctuation">:</span> <span class="token class-name">SelectStatement</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">LogicalPlan</span><span class="token punctuation">,</span> <span class="token class-name">OptimizerError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Start with FROM clause</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> plan <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">plan_table_with_joins</span><span class="token punctuation">(</span>ast<span class="token punctuation">.</span>from<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// Add WHERE filter</span>        <span class="token keyword">if</span> <span class="token keyword">let</span> <span class="token class-name">Some</span><span class="token punctuation">(</span>where_clause<span class="token punctuation">)</span> <span class="token operator">=</span> ast<span class="token punctuation">.</span>where_clause <span class="token punctuation">&#123;</span>            
plan <span class="token operator">=</span> <span class="token class-name">LogicalPlan</span><span class="token punctuation">::</span><span class="token class-name">Filter</span> <span class="token punctuation">&#123;</span>                input<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>plan<span class="token punctuation">)</span><span class="token punctuation">,</span>                predicate<span class="token punctuation">:</span> where_clause<span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token comment">// Add GROUP BY / aggregates</span>        <span class="token keyword">if</span> <span class="token operator">!</span>ast<span class="token punctuation">.</span>group_by<span class="token punctuation">.</span><span class="token function">is_empty</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">||</span> ast<span class="token punctuation">.</span>having<span class="token punctuation">.</span><span class="token function">is_some</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            plan <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">plan_aggregate</span><span class="token punctuation">(</span>plan<span class="token punctuation">,</span> ast<span class="token punctuation">.</span>group_by<span class="token punctuation">,</span> ast<span class="token punctuation">.</span>having<span class="token punctuation">)</span><span class="token 
operator">?</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token comment">// Add SELECT projection</span>        plan <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">plan_projection</span><span class="token punctuation">(</span>plan<span class="token punctuation">,</span> ast<span class="token punctuation">.</span>projections<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// Add DISTINCT</span>        <span class="token keyword">if</span> ast<span class="token punctuation">.</span>distinct <span class="token punctuation">&#123;</span>            plan <span class="token operator">=</span> <span class="token class-name">LogicalPlan</span><span class="token punctuation">::</span><span class="token class-name">Distinct</span> <span class="token punctuation">&#123;</span>                input<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>plan<span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token comment">// Add ORDER BY</span>        <span class="token keyword">if</span> <span class="token operator">!</span>ast<span class="token punctuation">.</span>order_by<span class="token punctuation">.</span><span class="token function">is_empty</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            plan <span class="token operator">=</span> <span 
class="token class-name">LogicalPlan</span><span class="token punctuation">::</span><span class="token class-name">Sort</span> <span class="token punctuation">&#123;</span>                input<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>plan<span class="token punctuation">)</span><span class="token punctuation">,</span>                order_by<span class="token punctuation">:</span> ast<span class="token punctuation">.</span>order_by<span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token comment">// Add LIMIT/OFFSET</span>        <span class="token keyword">if</span> ast<span class="token punctuation">.</span>limit<span class="token punctuation">.</span><span class="token function">is_some</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">||</span> ast<span class="token punctuation">.</span>offset<span class="token punctuation">.</span><span class="token function">is_some</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            plan <span class="token operator">=</span> <span class="token class-name">LogicalPlan</span><span class="token punctuation">::</span><span class="token class-name">Limit</span> <span class="token punctuation">&#123;</span>                input<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>plan<span class="token punctuation">)</span><span class="token punctuation">,</span>                limit<span class="token 
punctuation">:</span> ast<span class="token punctuation">.</span>limit<span class="token punctuation">.</span><span class="token function">unwrap_or</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">LiteralNumber</span><span class="token punctuation">(</span><span class="token keyword">i64</span><span class="token punctuation">::</span><span class="token constant">MAX</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                offset<span class="token punctuation">:</span> ast<span class="token punctuation">.</span>offset<span class="token punctuation">.</span><span class="token function">unwrap_or</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">LiteralNumber</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span>plan<span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">apply_logical_optimizations</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> plan<span class="token punctuation">:</span> <span class="token class-name">LogicalPlan</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span 
class="token class-name">LogicalPlan</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> optimized <span class="token operator">=</span> plan<span class="token punctuation">;</span>                <span class="token comment">// Predicate pushdown</span>        optimized <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">predicate_pushdown</span><span class="token punctuation">(</span>optimized<span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// Projection pruning</span>        optimized <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">projection_pruning</span><span class="token punctuation">(</span>optimized<span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// Constant folding</span>        optimized <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">constant_folding</span><span class="token punctuation">(</span>optimized<span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// Subquery unnesting</span>        optimized <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">unnest_subqueries</span><span class="token punctuation">(</span>optimized<span class="token punctuation">)</span><span class="token punctuation">;</span>                optimized    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition 
function">generate_physical_plans</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> logical<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">LogicalPlan</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">PhysicalPlan</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> plans <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// Generate all reasonable physical alternatives</span>        <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">generate_plans_recursive</span><span class="token punctuation">(</span>logical<span class="token punctuation">,</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> plans<span class="token punctuation">)</span><span class="token punctuation">;</span>                plans    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">choose_best_plan</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> plans<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token 
class-name">PhysicalPlan</span><span class="token operator">></span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">PhysicalPlan</span><span class="token punctuation">,</span> <span class="token class-name">OptimizerError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">if</span> plans<span class="token punctuation">.</span><span class="token function">is_empty</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">return</span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">OptimizerError</span><span class="token punctuation">::</span><span class="token class-name">NoPlansGenerated</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token keyword">let</span> <span class="token keyword">mut</span> best_plan <span class="token operator">=</span> plans<span class="token punctuation">[</span><span class="token number">0</span><span class="token punctuation">]</span><span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> best_cost <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>cost_model<span class="token punctuation">.</span><span class="token function">estimate_cost</span><span class="token punctuation">(</span><span class="token 
&amp;</span>best_plan">
operator">&amp;</span>best_plan<span class="token punctuation">)</span><span class="token punctuation">;</span>

        <span class="token keyword">for</span> plan <span class="token keyword">in</span> plans<span class="token punctuation">.</span><span class="token function">into_iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">skip</span><span class="token punctuation">(</span><span class="token number">1</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>
            <span class="token keyword">let</span> cost <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>cost_model<span class="token punctuation">.</span><span class="token function">estimate_cost</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>plan<span class="token punctuation">)</span><span class="token punctuation">;</span>
            <span class="token keyword">if</span> cost <span class="token operator">&lt;</span> best_cost <span class="token punctuation">&#123;</span>
                best_cost <span class="token operator">=</span> cost<span class="token punctuation">;</span>
                best_plan <span class="token operator">=</span> plan<span class="token punctuation">;</span>
            <span class="token punctuation">&#125;</span>
        <span class="token punctuation">&#125;</span>

        <span class="token class-name">Ok</span><span class="token punctuation">(</span>best_plan<span class="token punctuation">)</span>
    <span class="token punctuation">&#125;</span>
<span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="7-用-Rust-建構的挑戰">7 Challenges of Building in Rust</h2><h3 id="挑戰-1：遞歸計劃類型">Challenge 1: Recursive Plan Types</h3><p><strong>Problem:</strong> PhysicalPlan is deeply recursive, which makes it awkward to pattern match.</p><pre class="language-rust" data-language="rust"><code 
class="language-rust"><span class="token comment">// ❌ Complex nested matching</span><span class="token keyword">match</span> plan <span class="token punctuation">&#123;</span>    <span class="token class-name">PhysicalPlan</span><span class="token punctuation">::</span><span class="token class-name">HashJoin</span> <span class="token punctuation">&#123;</span> left<span class="token punctuation">,</span> right<span class="token punctuation">,</span> <span class="token punctuation">..</span> <span class="token punctuation">&#125;</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">match</span> left<span class="token punctuation">.</span><span class="token function">as_ref</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">PhysicalPlan</span><span class="token punctuation">::</span><span class="token class-name">IndexScan</span> <span class="token punctuation">&#123;</span> <span class="token punctuation">..</span> <span class="token punctuation">&#125;</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token punctuation">...</span> <span class="token punctuation">&#125;</span>            <span class="token class-name">PhysicalPlan</span><span class="token punctuation">::</span><span class="token class-name">SeqScan</span> <span class="token punctuation">&#123;</span> <span class="token punctuation">..</span> <span class="token punctuation">&#125;</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token punctuation">...</span> <span class="token punctuation">&#125;</span>            _ <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token punctuation">...</span> <span class="token punctuation">&#125;</span>        
<span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    _ <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token punctuation">...</span> <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><p><strong>解決方案：訪問者模式</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ✅ Clean traversal</span><span class="token keyword">pub</span> <span class="token keyword">trait</span> <span class="token type-definition class-name">PlanVisitor</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">visit</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">,</span> plan<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">PhysicalPlan</span><span class="token punctuation">)</span><span class="token punctuation">;</span><span class="token punctuation">&#125;</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">collect_scan_tables</span><span class="token punctuation">(</span>plan<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">PhysicalPlan</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token keyword">let</span> <span class="token keyword">mut</span> visitor <span class="token operator">=</span> 
<span class="token class-name">TableCollector</span> <span class="token punctuation">&#123;</span> tables<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>    visitor<span class="token punctuation">.</span><span class="token function">visit</span><span class="token punctuation">(</span>plan<span class="token punctuation">)</span><span class="token punctuation">;</span>    visitor<span class="token punctuation">.</span>tables<span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="挑戰-2：成本類型精度">挑戰 2：成本類型精度</h3><p><strong>問題：</strong> 成本可能非常大或非常小。</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ f32 loses precision</span><span class="token keyword">let</span> cost<span class="token punctuation">:</span> <span class="token keyword">f32</span> <span class="token operator">=</span> <span class="token number">1000000.0</span> <span class="token operator">+</span> <span class="token number">0.0001</span><span class="token punctuation">;</span>  <span class="token comment">// Loses 0.0001!</span></code></pre><p><strong>解決方案：使用 f64</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ✅ Better precision</span><span class="token keyword">let</span> cost<span class="token punctuation">:</span> <span class="token keyword">f64</span> <span class="token operator">=</span> <span class="token number">1000000.0</span> <span class="token operator">+</span> <span class="token number">0.0001</span><span class="token punctuation">;</span>  <span class="token comment">// Preserves both</span></code></pre><hr /><h3 id="挑戰-3：統計資料生命週期">挑戰 3：統計資料生命週期</h3><p><strong>問題：</strong> 
統計資料需要在最佳化過程中共享。</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ Doesn't work</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Optimizer</span> <span class="token punctuation">&#123;</span>    statistics<span class="token punctuation">:</span> <span class="token class-name">StatisticsCatalog</span><span class="token punctuation">,</span>  <span class="token comment">// Too large to clone</span><span class="token punctuation">&#125;</span></code></pre><p><strong>解決方案：Arc 用於共享所有權</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ✅ Works</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Optimizer</span> <span class="token punctuation">&#123;</span>    statistics<span class="token punctuation">:</span> <span class="token class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">StatisticsCatalog</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="8-AI-如何加速這項工作">8 AI 如何加速這項工作</h2><h3 id="AI-做對了什麼">AI 做對了什麼</h3><table><thead><tr><th>任務</th><th>AI 貢獻</th></tr></thead><tbody><tr><td><strong>成本模型結構</strong></td><td>CPU/IO/記憶體成本的良好分解</td></tr><tr><td><strong>DP 連線順序</strong></td><td>正確的基於位元遮罩的子集產生</td></tr><tr><td><strong>統計資料設計</strong></td><td>直方圖類型、MCV 串列</td></tr><tr><td><strong>索引評分</strong></td><td>位置權重、相等獎金</td></tr></tbody></table><hr /><h3 id="AI-做錯了什麼">AI 
做錯了什麼</h3><table><thead><tr><th>問題</th><th>發生什麼事</th></tr></thead><tbody><tr><td><strong>選擇性估計</strong></td><td>初稿使用固定值，不是直方圖</td></tr><tr><td><strong>連線成本公式</strong></td><td>忽略了內部每個外部列掃描一次</td></tr><tr><td><strong>排序成本</strong></td><td>沒有區分記憶體內與外部排序</td></tr><tr><td><strong>覆蓋索引</strong></td><td>初始設計沒有考慮僅索引掃描</td></tr></tbody></table><p><strong>模式：</strong> AI 處理結構良好。數值公式和邊界情況需要手動驗證。</p><hr /><h3 id="範例：除錯連線成本">範例：除錯連線成本</h3><p><strong>我問 AI 的問題：</strong></p><blockquote><p>“Hash join 成本似乎錯誤。建立雜湊表應該是啟動成本，不是總成本。”</p></blockquote><p><strong>我學到的：</strong></p><ol><li><strong>啟動成本：</strong> 返回第一列的成本</li><li><strong>總成本：</strong> 返回所有列的成本</li><li>Hash 建立是啟動（必須在探測前完成）</li><li>探測成本隨輸出行數擴展</li></ol><p><strong>結果：</strong> 修復成本模型：</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token class-name">Cost</span> <span class="token punctuation">&#123;</span>    startup<span class="token punctuation">:</span> build_cost <span class="token operator">+</span> build_memory <span class="token operator">*</span> memory_factor<span class="token punctuation">,</span>  <span class="token comment">// Before first row</span>    total<span class="token punctuation">:</span> build_cost <span class="token operator">+</span> probe_cost <span class="token operator">+</span> output_cpu<span class="token punctuation">,</span>         <span class="token comment">// All rows</span>    rows<span class="token punctuation">:</span> output_rows<span class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="總結：查詢最佳化器一張圖">總結：查詢最佳化器一張圖</h2><pre class="language-MERMAID_BASE64_607" data-language="MERMAID_BASE64_607"><code 
class="language-MERMAID_BASE64_607">Zmxvd2NoYXJ0IFRECiAgICBzdWJncmFwaCAiSW5wdXQiCiAgICAgICAgQVtTUUwgQVNUXSAtLT4gQltTdGF0aXN0aWNzIENhdGFsb2ddCiAgICBlbmQKICAgIAogICAgc3ViZ3JhcGggIkxvZ2ljYWwgT3B0aW1pemF0aW9uIgogICAgICAgIENbQ3JlYXRlIExvZ2ljYWwgUGxhbl0gLS0+IERbUHJlZGljYXRlIFB1c2hkb3duXQogICAgICAgIEQgLS0+IEVbUHJvamVjdGlvbiBQcnVuaW5nXQogICAgICAgIEUgLS0+IEZbQ29uc3RhbnQgRm9sZGluZ10KICAgICAgICBGIC0tPiBHW1N1YnF1ZXJ5IFVubmVzdGluZ10KICAgIGVuZAogICAgCiAgICBzdWJncmFwaCAiUGh5c2ljYWwgT3B0aW1pemF0aW9uIgogICAgICAgIEcgLS0+IEhbR2VuZXJhdGUgUGh5c2ljYWwgUGxhbnNdCiAgICAgICAgSCAtLT4gSVtTZXFTY2FuLCBJbmRleFNjYW5dCiAgICAgICAgSCAtLT4gSltOZXN0ZWRMb29wLCBIYXNoLCBNZXJnZSBKb2luXQogICAgICAgIEggLS0+IEtbU29ydCwgSGFzaEFnZ3JlZ2F0ZV0KICAgICAgICBJICYgSiAmIEsgLS0+IExbQ29zdCBFc3RpbWF0aW9uXQogICAgICAgIEwgLS0+IE1bQ2hvb3NlIEJlc3QgUGxhbl0KICAgIGVuZAogICAgCiAgICBzdWJncmFwaCAiU3RhdGlzdGljcyIKICAgICAgICBCIC0tPiBOW1RhYmxlIFN0YXRzOiByb3dfY291bnQsIHBhZ2VzXQogICAgICAgIEIgLS0+IE9bQ29sdW1uIFN0YXRzOiBoaXN0b2dyYW0sIE1DVl0KICAgICAgICBCIC0tPiBQW0luZGV4IFN0YXRzOiBkaXN0aW5jdF9rZXlzXQogICAgZW5kCiAgICAKICAgIHN1YmdyYXBoICJPdXRwdXQiCiAgICAgICAgTSAtLT4gUVtQaHlzaWNhbCBQbGFuXQogICAgZW5kCiAgICAKICAgIHN0eWxlIEEgZmlsbDojZTNmMmZkLHN0cm9rZTojMTk3NmQyCiAgICBzdHlsZSBRIGZpbGw6I2U4ZjVlOSxzdHJva2U6IzM4OGUzYwogICAgc3R5bGUgTCBmaWxsOiNmZmYzZTAsc3Ryb2tlOiNmNTdjMDA&#x3D;</code></pre><p><strong>關鍵要點：</strong></p><table><thead><tr><th>概念</th><th>為什麼重要</th></tr></thead><tbody><tr><td><strong>邏輯 vs. 物理</strong></td><td>分離 WHAT 與 HOW</td></tr><tr><td><strong>統計資料</strong></td><td>準確的成本估計需要資料</td></tr><tr><td><strong>成本模型</strong></td><td>CPU + I/O + memory = total cost</td></tr><tr><td><strong>DP 連線順序</strong></td><td>無需暴力破解找到最佳順序</td></tr><tr><td><strong>索引選擇</strong></td><td>為謂詞選擇最佳索引</td></tr><tr><td><strong>啟動 vs. 總計</strong></td><td>第一列延遲 vs. 吞吐量</td></tr></tbody></table><hr /><p><strong>進一步閱讀：</strong></p><ul><li>“Database Management Systems” by Ramakrishnan (Ch. 
15: Query Optimization)</li><li>“Readings in Database Systems” (Red Book) - Query Optimization chapter</li><li>PostgreSQL Source: <a href="https://github.com/postgres/postgres/tree/master/src/backend/optimizer"><code>src/backend/optimizer/</code></a></li><li>“Cost-Based Oracle Fundamentals” by Jonathan Lewis</li><li>Vaultgres Repository: <a href="https://github.com/neoalienson/Vaultgres">github.com/neoalienson/Vaultgres</a></li></ul>]]></content>
    
    
    <summary type="html">Vaultgres 旅程第七部分：建構基於成本的查詢最佳化器。深入探討統計資料收集、成本模型、帶動態規劃的連線順序，以及索引選擇。</summary>
    
    
    
    <category term="Development" scheme="https://neo01.com/categories/Development/"/>
    
    
    <category term="PostgreSQL" scheme="https://neo01.com/tags/PostgreSQL/"/>
    
    <category term="Database Internals" scheme="https://neo01.com/tags/Database-Internals/"/>
    
    <category term="Rust" scheme="https://neo01.com/tags/Rust/"/>
    
  </entry>
  
  <entry>
    <title>Database in Rust: Cost-Based Query Optimizer with Statistics</title>
    <link href="https://neo01.com/2026/03/Database-Rust-Query-Optimizer/"/>
    <id>https://neo01.com/2026/03/Database-Rust-Query-Optimizer/</id>
    <published>2026-03-05T16:00:00.000Z</published>
    <updated>2026-03-14T03:05:38.933Z</updated>
    
    <content type="html"><![CDATA[<p>In <a href="/2026/03/Database-Rust-SQL-Parser/">Part 6</a>, we built a SQL parser that produces ASTs. But there’s a problem.</p><p><strong>The same query can be executed many different ways:</strong></p><pre class="language-sql" data-language="sql"><code class="language-sql"><span class="token keyword">SELECT</span> u<span class="token punctuation">.</span>name<span class="token punctuation">,</span> o<span class="token punctuation">.</span>total<span class="token keyword">FROM</span> users u<span class="token keyword">JOIN</span> orders o <span class="token keyword">ON</span> u<span class="token punctuation">.</span>id <span class="token operator">=</span> o<span class="token punctuation">.</span>user_id<span class="token keyword">WHERE</span> u<span class="token punctuation">.</span>balance <span class="token operator">></span> <span class="token number">100</span></code></pre><p><strong>Possible execution plans:</strong></p><pre class="language-none"><code class="language-none">Plan A:                          Plan B:                          Plan C:1. Scan users                    1. Scan orders                   1. Scan users (balance &gt; 100)2. Filter (balance &gt; 100)        2. Filter (exists in users)      2. Index lookup on orders3. Scan orders                   3. Scan users                    3. Hash join4. Hash join                     4. Nested loop join              4. Sort by name5. Sort                          5. SortCost: 1500                       Cost: 800                        Cost: 200 ← Best!</code></pre><p><strong>How do we find Plan C automatically?</strong></p><p>Today: building a cost-based query optimizer in Rust—with statistics, cost models, and dynamic programming for join ordering.</p><hr /><h2 id="1-Logical-vs-Physical-Plans">1 Logical vs. 
Physical Plans</h2><h3 id="The-Two-Phase-Approach">The Two-Phase Approach</h3><pre class="language-none"><code class="language-none">┌─────────────────────────────────────────────────────────────┐
│              Query Optimization Pipeline                     │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  SQL AST                                                     │
│     │                                                        │
│     ▼                                                        │
│  ┌──────────────────────────────────────────────────────┐   │
│  │  Logical Plan (WHAT to compute)                      │   │
│  │  - Logical Scan: users                               │   │
│  │  - Logical Filter: balance &gt; 100                     │   │
│  │  - Logical Join: u.id &#x3D; o.user_id                    │   │
│  └──────────────────────────────────────────────────────┘   │
│     │                                                        │
│     ▼ Optimization                                           │
│                                                              │
│  ┌──────────────────────────────────────────────────────┐   │
│  │  Physical Plan (HOW to compute)                      │   │
│  │  - Index Scan: users (balance &gt; 100)                 │   │
│  │  - Index Scan: orders (user_id index)                │   │
│  │  - Nested Loop Join                                  │   │
│  └──────────────────────────────────────────────────────┘   │
│                                                              │
└─────────────────────────────────────────────────────────────┘</code></pre><hr /><h3 id="Logical-Plan-Operators">Logical Plan Operators</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/optimizer/logical_plan.rs</span>
<span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span>
<span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">LogicalPlan</span> <span class="token punctuation">&#123;</span>
    <span class="token comment">/// Scan a table</span>
    <span class="token class-name">TableScan</span> <span class="token punctuation">&#123;</span>
        table<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span>
        alias<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token operator">></span><span class="token punctuation">,</span>
        columns<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token operator">></span><span class="token punctuation">,</span>
        projection<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">usize</span><span class="token operator">></span><span class="token punctuation">,</span>  <span class="token comment">// Column indices</span>
    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>

    <span class="token comment">/// Filter rows</span>
    <span class="token class-name">Filter</span> <span class="token punctuation">&#123;</span>
        input<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">LogicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>
        predicate<span class="token punctuation">:</span> <span class="token class-name">Expression</span><span class="token punctuation">,</span>
    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>

    <span class="token comment">/// Project columns/expressions</span>
    <span class="token class-name">Projection</span> <span class="token punctuation">&#123;</span>
        input<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">LogicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>
        expressions<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>
    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>

    <span class="token comment">/// Join two relations</span>
    <span class="token class-name">Join</span> <span class="token punctuation">&#123;</span>
        left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">LogicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>
        right<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">LogicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>
        condition<span class="token punctuation">:</span> <span class="token class-name">JoinCondition</span><span class="token punctuation">,</span>
        join_type<span class="token punctuation">:</span> <span class="token class-name">JoinType</span><span class="token punctuation">,</span>
    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>

    <span class="token comment">/// Aggregate (GROUP BY)</span>
    <span class="token class-name">Aggregate</span> <span class="token punctuation">&#123;</span>
        input<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">LogicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>
        group_by<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>
        aggregates<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">AggregateFunction</span><span class="token operator">></span><span class="token punctuation">,</span>
    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>

    <span class="token comment">/// Sort (ORDER BY)</span>
    <span class="token class-name">Sort</span> <span class="token punctuation">&#123;</span>
        input<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">LogicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>
        order_by<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">SortKey</span><span class="token operator">></span><span class="token punctuation">,</span>
    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>

    <span class="token comment">/// Limit</span>
    <span class="token class-name">Limit</span> <span class="token punctuation">&#123;</span>
        input<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">LogicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>
        limit<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>
        offset<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>
    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>

    <span class="token comment">/// Distinct</span>
    <span class="token class-name">Distinct</span> <span class="token punctuation">&#123;</span>
        input<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">LogicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>
    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>
<span class="token punctuation">&#125;</span>

<span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span>
<span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">JoinCondition</span> <span class="token punctuation">&#123;</span>
    <span class="token class-name">On</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    <span class="token class-name">Using</span><span class="token punctuation">(</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token operator">></span><span class="token punctuation">)</span><span class="token punctuation">,</span>
<span class="token punctuation">&#125;</span>

<span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span>
<span 
class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">JoinType</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">Inner</span><span class="token punctuation">,</span>    <span class="token class-name">LeftOuter</span><span class="token punctuation">,</span>    <span class="token class-name">RightOuter</span><span class="token punctuation">,</span>    <span class="token class-name">FullOuter</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">SortKey</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> expression<span class="token punctuation">:</span> <span class="token class-name">Expression</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> ascending<span class="token punctuation">:</span> <span class="token keyword">bool</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> nulls_first<span class="token punctuation">:</span> <span class="token keyword">bool</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">AggregateFunction</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> func<span class="token punctuation">:</span> <span class="token class-name">AggregateFunc</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> argument<span class="token punctuation">:</span> <span class="token 
class-name">Expression</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> alias<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">AggregateFunc</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">Count</span><span class="token punctuation">,</span>    <span class="token class-name">Sum</span><span class="token punctuation">,</span>    <span class="token class-name">Avg</span><span class="token punctuation">,</span>    <span class="token class-name">Min</span><span class="token punctuation">,</span>    <span class="token class-name">Max</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="Physical-Plan-Operators">Physical Plan Operators</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/optimizer/physical_plan.rs</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">PhysicalPlan</span> <span class="token punctuation">&#123;</span>    <span class="token comment">/// Full table scan</span>    <span class="token class-name">SeqScan</span> <span class="token punctuation">&#123;</span>        table<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span>        alias<span class="token punctuation">:</span> <span 
class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token operator">></span><span class="token punctuation">,</span>        columns<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token operator">></span><span class="token punctuation">,</span>        filter<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">/// Index scan</span>    <span class="token class-name">IndexScan</span> <span class="token punctuation">&#123;</span>        table<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span>        alias<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token operator">></span><span class="token punctuation">,</span>        index<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span>        columns<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token operator">></span><span class="token punctuation">,</span>        condition<span class="token punctuation">:</span> <span class="token class-name">IndexCondition</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>    
    <span class="token comment">/// Nested loop join</span>    <span class="token class-name">NestedLoopJoin</span> <span class="token punctuation">&#123;</span>        left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">PhysicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>        right<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">PhysicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>        condition<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>        join_type<span class="token punctuation">:</span> <span class="token class-name">JoinType</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">/// Hash join</span>    <span class="token class-name">HashJoin</span> <span class="token punctuation">&#123;</span>        left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">PhysicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>        right<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">PhysicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>        condition<span class="token punctuation">:</span> <span class="token class-name">Expression</span><span class="token 
punctuation">,</span>        join_type<span class="token punctuation">:</span> <span class="token class-name">JoinType</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">/// Merge join (requires sorted input)</span>    <span class="token class-name">MergeJoin</span> <span class="token punctuation">&#123;</span>        left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">PhysicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>        right<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">PhysicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>        condition<span class="token punctuation">:</span> <span class="token class-name">Expression</span><span class="token punctuation">,</span>        join_type<span class="token punctuation">:</span> <span class="token class-name">JoinType</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">/// Sort</span>    <span class="token class-name">Sort</span> <span class="token punctuation">&#123;</span>        input<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">PhysicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>        order_by<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">SortKey</span><span class="token operator">></span><span class="token punctuation">,</span> 
   <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">/// Hash aggregate</span>    <span class="token class-name">HashAggregate</span> <span class="token punctuation">&#123;</span>        input<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">PhysicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>        group_by<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>        aggregates<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">AggregateFunction</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">/// Stream aggregate (requires sorted input)</span>    <span class="token class-name">StreamAggregate</span> <span class="token punctuation">&#123;</span>        input<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">PhysicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>        group_by<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>        aggregates<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span 
class="token class-name">AggregateFunction</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">/// Limit</span>    <span class="token class-name">Limit</span> <span class="token punctuation">&#123;</span>        input<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">PhysicalPlan</span><span class="token operator">></span><span class="token punctuation">,</span>        limit<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>        offset<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">IndexCondition</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">Eq</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">Range</span> <span class="token punctuation">&#123;</span> low<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span> high<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span 
class="token class-name">Expression</span><span class="token operator">></span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>    <span class="token class-name">InList</span><span class="token punctuation">(</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">)</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="2-Statistics-Collection">2 Statistics Collection</h2><h3 id="Why-Statistics-Matter">Why Statistics Matter</h3><p><strong>Without statistics:</strong> The optimizer has no way to tell plans apart, so every candidate looks equally cheap.</p><p><strong>With statistics:</strong> We can estimate how many rows each operator will process and compare plan costs on that basis.</p><pre class="language-sql" data-language="sql"><code class="language-sql"><span class="token comment">-- Query</span><span class="token keyword">SELECT</span> <span class="token operator">*</span> <span class="token keyword">FROM</span> users <span class="token keyword">WHERE</span> balance <span class="token operator">></span> <span class="token number">100</span><span class="token punctuation">;</span><span class="token comment">-- Scenario A: balance is uniformly distributed 0-1000</span><span class="token comment">-- → ~90% of rows match → SeqScan is better</span><span class="token comment">-- Scenario B: balance is skewed, only 1% have > 100</span><span class="token comment">-- → IndexScan is better</span></code></pre><hr /><h3 id="Statistics-Structure">Statistics Structure</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/optimizer/statistics.rs</span><span class="token keyword">use</span> <span class="token namespace">std<span class="token punctuation">::</span>collections<span class="token punctuation">::</span></span><span class="token class-name">HashMap</span><span class="token punctuation">;</span><span class="token 
attribute attr-name">#[derive(Debug, Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">TableStatistics</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> table_name<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> row_count<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> page_count<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> average_row_size<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> columns<span class="token punctuation">:</span> <span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token punctuation">,</span> <span class="token class-name">ColumnStatistics</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> indexes<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">IndexStatistics</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> last_analyzed<span class="token punctuation">:</span> <span class="token namespace">chrono<span class="token punctuation">::</span></span><span class="token class-name">DateTime</span><span class="token operator">&lt;</span><span class="token namespace">chrono<span class="token punctuation">::</span></span><span 
class="token class-name">Utc</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">ColumnStatistics</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> column_name<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> null_fraction<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span>      <span class="token comment">// Fraction of NULL values (0.0 - 1.0)</span>    <span class="token keyword">pub</span> distinct_count<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>     <span class="token comment">// Number of distinct values</span>    <span class="token keyword">pub</span> most_common_values<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token class-name">Value</span><span class="token punctuation">,</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token operator">></span><span class="token punctuation">,</span>  <span class="token comment">// (value, frequency)</span>    <span class="token keyword">pub</span> histogram<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Histogram</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> min_value<span class="token 
punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Value</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> max_value<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Value</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">Histogram</span> <span class="token punctuation">&#123;</span>    <span class="token comment">/// Equi-width histogram (each bucket spans an equal value range)</span>    <span class="token class-name">EquiWidth</span> <span class="token punctuation">&#123;</span>        buckets<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Bucket</span><span class="token operator">></span><span class="token punctuation">,</span>        min<span class="token punctuation">:</span> <span class="token class-name">Value</span><span class="token punctuation">,</span>        max<span class="token punctuation">:</span> <span class="token class-name">Value</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>    <span class="token comment">/// Equi-depth histogram (each bucket holds roughly the same number of rows)</span>    <span class="token class-name">EquiDepth</span> <span class="token punctuation">&#123;</span>        buckets<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Bucket</span><span 
class="token operator">></span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Bucket</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> lower_bound<span class="token punctuation">:</span> <span class="token class-name">Value</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> upper_bound<span class="token punctuation">:</span> <span class="token class-name">Value</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> row_count<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> distinct_count<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">IndexStatistics</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> index_name<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> columns<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> 
is_unique<span class="token punctuation">:</span> <span class="token keyword">bool</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> is_primary<span class="token punctuation">:</span> <span class="token keyword">bool</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> leaf_pages<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> distinct_keys<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> average_leaf_per_key<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="Collecting-Statistics">Collecting Statistics</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/optimizer/analyzer.rs</span><span class="token keyword">use</span> <span class="token namespace">std<span class="token punctuation">::</span>collections<span class="token punctuation">::</span></span><span class="token class-name">HashMap</span><span class="token punctuation">;</span><span class="token keyword">use</span> <span class="token namespace">std<span class="token punctuation">::</span>sync<span class="token punctuation">::</span></span><span class="token class-name">Arc</span><span class="token punctuation">;</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">StatisticsAnalyzer</span> <span class="token punctuation">&#123;</span>    buffer_pool<span class="token punctuation">:</span> <span class="token class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">BufferPool</span><span class="token operator">></span><span class="token punctuation">,</span>    storage<span class="token punctuation">:</span> <span class="token class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">StorageEngine</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">StatisticsAnalyzer</span> <span class="token 
punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">analyze_table</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> table_name<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">TableStatistics</span><span class="token punctuation">,</span> <span class="token class-name">AnalyzerError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> stats <span class="token operator">=</span> <span class="token class-name">TableStatistics</span> <span class="token punctuation">&#123;</span>            table_name<span class="token punctuation">:</span> table_name<span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            row_count<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>            page_count<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>            average_row_size<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>            columns<span class="token punctuation">:</span> <span class="token class-name">HashMap</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token 
punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            indexes<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            last_analyzed<span class="token punctuation">:</span> <span class="token namespace">chrono<span class="token punctuation">::</span></span><span class="token class-name">Utc</span><span class="token punctuation">::</span><span class="token function">now</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token comment">// Scan all pages to collect statistics</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> total_size <span class="token operator">=</span> <span class="token number">0</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> column_values<span class="token punctuation">:</span> <span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token punctuation">,</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Value</span><span class="token operator">>></span> <span class="token operator">=</span> <span class="token class-name">HashMap</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> page_id <span 
class="token keyword">in</span> <span class="token keyword">self</span><span class="token punctuation">.</span>storage<span class="token punctuation">.</span><span class="token function">get_table_pages</span><span class="token punctuation">(</span>table_name<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> page <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>buffer_pool<span class="token punctuation">.</span><span class="token function">get_page</span><span class="token punctuation">(</span>page_id<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>            stats<span class="token punctuation">.</span>page_count <span class="token operator">+=</span> <span class="token number">1</span><span class="token punctuation">;</span>            <span class="token keyword">for</span> row <span class="token keyword">in</span> page<span class="token punctuation">.</span><span class="token function">rows</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                stats<span class="token punctuation">.</span>row_count <span class="token operator">+=</span> <span class="token number">1</span><span class="token punctuation">;</span>                total_size <span class="token operator">+=</span> row<span class="token punctuation">.</span><span class="token function">size</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// Collect column values</span>                <span class="token keyword">for</span> <span class="token punctuation">(</span>col_name<span class="token punctuation">,</span> value<span class="token punctuation">)</span> <span class="token 
keyword">in</span> row<span class="token punctuation">.</span><span class="token function">columns</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                    column_values                        <span class="token punctuation">.</span><span class="token function">entry</span><span class="token punctuation">(</span>col_name<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>                        <span class="token punctuation">.</span><span class="token function">or_insert_with</span><span class="token punctuation">(</span><span class="token class-name">Vec</span><span class="token punctuation">::</span>new<span class="token punctuation">)</span>                        <span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span>value<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token punctuation">&#125;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>        <span class="token keyword">if</span> stats<span class="token punctuation">.</span>row_count <span class="token operator">></span> <span class="token number">0</span> <span class="token punctuation">&#123;</span>            stats<span class="token punctuation">.</span>average_row_size <span class="token operator">=</span> total_size <span class="token operator">/</span> stats<span class="token punctuation">.</span>row_count <span class="token keyword">as</span> <span class="token keyword">usize</span><span class="token 
punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token comment">// Compute column statistics</span>        <span class="token keyword">for</span> <span class="token punctuation">(</span>col_name<span class="token punctuation">,</span> values<span class="token punctuation">)</span> <span class="token keyword">in</span> column_values <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> col_stats <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">compute_column_statistics</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>col_name<span class="token punctuation">,</span> <span class="token operator">&amp;</span>values<span class="token punctuation">)</span><span class="token punctuation">;</span>            stats<span class="token punctuation">.</span>columns<span class="token punctuation">.</span><span class="token function">insert</span><span class="token punctuation">(</span>col_name<span class="token punctuation">,</span> col_stats<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token comment">// Collect index statistics</span>        <span class="token keyword">for</span> index <span class="token keyword">in</span> <span class="token keyword">self</span><span class="token punctuation">.</span>storage<span class="token punctuation">.</span><span class="token function">get_table_indexes</span><span class="token punctuation">(</span>table_name<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> index_stats <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token 
function">analyze_index</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>index<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>            stats<span class="token punctuation">.</span>indexes<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span>index_stats<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token comment">// Store statistics in system catalog</span>        <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">store_statistics</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>stats<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span>stats<span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">compute_column_statistics</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> col_name<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">,</span> values<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Value</span><span class="token punctuation">]</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">ColumnStatistics</span> <span class="token 
punctuation">&#123;</span>        <span class="token keyword">let</span> null_count <span class="token operator">=</span> values<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">filter</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>v<span class="token closure-punctuation punctuation">|</span></span> v<span class="token punctuation">.</span><span class="token function">is_null</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">count</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> null_fraction <span class="token operator">=</span> null_count <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">/</span> values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> non_null_values<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span>_<span class="token operator">></span> <span class="token operator">=</span> values<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token 
function">filter</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>v<span class="token closure-punctuation punctuation">|</span></span> <span class="token operator">!</span>v<span class="token punctuation">.</span><span class="token function">is_null</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">cloned</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> distinct_count <span class="token operator">=</span> non_null_values<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">::</span><span class="token operator">&lt;</span><span class="token namespace">std<span class="token punctuation">::</span>collections<span class="token punctuation">::</span></span><span class="token class-name">HashSet</span><span class="token operator">&lt;</span>_<span class="token operator">>></span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">;</span>        <span class="token comment">// Compute most common values</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> value_counts<span class="token punctuation">:</span> <span class="token class-name">HashMap</span><span class="token 
operator">&lt;</span><span class="token operator">&amp;</span><span class="token class-name">Value</span><span class="token punctuation">,</span> <span class="token keyword">usize</span><span class="token operator">></span> <span class="token operator">=</span> <span class="token class-name">HashMap</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> value <span class="token keyword">in</span> <span class="token operator">&amp;</span>non_null_values <span class="token punctuation">&#123;</span>            <span class="token operator">*</span>value_counts<span class="token punctuation">.</span><span class="token function">entry</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">or_insert</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span> <span class="token operator">+=</span> <span class="token number">1</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> most_common<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span>_<span class="token operator">></span> <span class="token operator">=</span> value_counts<span class="token punctuation">.</span><span class="token function">into_iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        most_common<span 
class="token punctuation">.</span><span class="token function">sort_by</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>a<span class="token punctuation">,</span> b<span class="token closure-punctuation punctuation">|</span></span> b<span class="token number">.1</span><span class="token punctuation">.</span><span class="token function">cmp</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>a<span class="token number">.1</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> mcv<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token class-name">Value</span><span class="token punctuation">,</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token operator">></span> <span class="token operator">=</span> most_common            <span class="token punctuation">.</span><span class="token function">into_iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span>            <span class="token punctuation">.</span><span class="token function">take</span><span class="token punctuation">(</span><span class="token number">10</span><span class="token punctuation">)</span>  <span class="token comment">// Keep top 10</span>            <span class="token punctuation">.</span><span class="token function">map</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span><span class="token punctuation">(</span>v<span class="token punctuation">,</span> c<span class="token punctuation">)</span><span class="token closure-punctuation punctuation">|</span></span> <span 
class="token punctuation">(</span>v<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> c <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">/</span> non_null_values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">)</span>            <span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token comment">// Compute histogram</span>        <span class="token keyword">let</span> histogram <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">compute_histogram</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>non_null_values<span class="token punctuation">,</span> distinct_count<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token comment">// Min/Max</span>        <span class="token keyword">let</span> <span class="token punctuation">(</span>min<span class="token punctuation">,</span> max<span class="token punctuation">)</span> <span class="token operator">=</span> <span class="token keyword">if</span> non_null_values<span class="token punctuation">.</span><span class="token function">is_empty</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token 
punctuation">(</span><span class="token class-name">None</span><span class="token punctuation">,</span> <span class="token class-name">None</span><span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> <span class="token keyword">mut</span> sorted <span class="token operator">=</span> non_null_values<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            sorted<span class="token punctuation">.</span><span class="token function">sort</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span>sorted<span class="token punctuation">.</span><span class="token function">first</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">unwrap</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Some</span><span class="token punctuation">(</span>sorted<span class="token punctuation">.</span><span class="token function">last</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">unwrap</span><span class="token 
punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token class-name">ColumnStatistics</span> <span class="token punctuation">&#123;</span>            column_name<span class="token punctuation">:</span> col_name<span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            null_fraction<span class="token punctuation">,</span>            distinct_count<span class="token punctuation">,</span>            most_common_values<span class="token punctuation">:</span> mcv<span class="token punctuation">,</span>            histogram<span class="token punctuation">,</span>            min_value<span class="token punctuation">:</span> min<span class="token punctuation">,</span>            max_value<span class="token punctuation">:</span> max<span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">compute_histogram</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> values<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Value</span><span class="token punctuation">]</span><span class="token punctuation">,</span> distinct_count<span class="token punctuation">:</span> <span class="token 
keyword">u64</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Histogram</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">const</span> <span class="token constant">NUM_BUCKETS</span><span class="token punctuation">:</span> <span class="token keyword">usize</span> <span class="token operator">=</span> <span class="token number">100</span><span class="token punctuation">;</span>        <span class="token keyword">if</span> values<span class="token punctuation">.</span><span class="token function">is_empty</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">return</span> <span class="token class-name">None</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token comment">// Use equi-depth histogram for better selectivity estimation</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> sorted <span class="token operator">=</span> values<span class="token punctuation">.</span><span class="token function">to_vec</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        sorted<span class="token punctuation">.</span><span class="token function">sort</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> bucket_size <span class="token operator">=</span> sorted<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token 
operator">/</span> <span class="token constant">NUM_BUCKETS</span><span class="token punctuation">;</span>        <span class="token keyword">if</span> bucket_size <span class="token operator">==</span> <span class="token number">0</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">return</span> <span class="token class-name">None</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> buckets <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> i <span class="token keyword">in</span> <span class="token number">0</span><span class="token punctuation">..</span><span class="token constant">NUM_BUCKETS</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> start <span class="token operator">=</span> i <span class="token operator">*</span> bucket_size<span class="token punctuation">;</span>            <span class="token keyword">let</span> end <span class="token operator">=</span> <span class="token keyword">if</span> i <span class="token operator">==</span> <span class="token constant">NUM_BUCKETS</span> <span class="token operator">-</span> <span class="token number">1</span> <span class="token punctuation">&#123;</span> sorted<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span> <span class="token punctuation">(</span>i <span class="token 
operator">+</span> <span class="token number">1</span><span class="token punctuation">)</span> <span class="token operator">*</span> bucket_size <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>            buckets<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token class-name">Bucket</span> <span class="token punctuation">&#123;</span>                lower_bound<span class="token punctuation">:</span> sorted<span class="token punctuation">[</span>start<span class="token punctuation">]</span><span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                upper_bound<span class="token punctuation">:</span> sorted<span class="token punctuation">[</span>end <span class="token operator">-</span> <span class="token number">1</span><span class="token punctuation">]</span><span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                row_count<span class="token punctuation">:</span> <span class="token punctuation">(</span>end <span class="token operator">-</span> start<span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>                distinct_count<span class="token punctuation">:</span> distinct_count <span class="token operator">/</span> <span class="token constant">NUM_BUCKETS</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span 
class="token punctuation">&#125;</span>        <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Histogram</span><span class="token punctuation">::</span><span class="token class-name">EquiDepth</span> <span class="token punctuation">&#123;</span> buckets <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="Using-Statistics-ANALYZE-Command">Using Statistics: ANALYZE Command</h3><pre class="language-sql" data-language="sql"><code class="language-sql"><span class="token comment">-- Analyze a single table</span><span class="token keyword">ANALYZE</span> users<span class="token punctuation">;</span><span class="token comment">-- Analyze specific columns</span><span class="token keyword">ANALYZE</span> users <span class="token punctuation">(</span>id<span class="token punctuation">,</span> balance<span class="token punctuation">,</span> created_at<span class="token punctuation">)</span><span class="token punctuation">;</span><span class="token comment">-- Analyze all tables</span><span class="token keyword">ANALYZE</span><span class="token punctuation">;</span><span class="token comment">-- Configure sampling (for large tables)</span><span class="token keyword">ANALYZE</span> users <span class="token keyword">WITH</span> SAMPLE <span class="token number">0.1</span><span class="token punctuation">;</span>  <span class="token comment">-- 10% sample</span></code></pre><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/sql_parser/ast.rs (extended)</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">Statement</span> <span class="token 
punctuation">&#123;</span>    <span class="token comment">// ... existing statements ...</span>    <span class="token class-name">Analyze</span><span class="token punctuation">(</span><span class="token class-name">AnalyzeStatement</span><span class="token punctuation">)</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">AnalyzeStatement</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> table<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">ObjectName</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> columns<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Ident</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> options<span class="token punctuation">:</span> <span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token punctuation">,</span> <span class="token class-name">Value</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token comment">// src/optimizer/analyzer.rs</span><span class="token keyword">impl</span> <span class="token class-name">Database</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition 
function">analyze</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> statement<span class="token punctuation">:</span> <span class="token class-name">AnalyzeStatement</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">AnalyzerError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">if</span> <span class="token keyword">let</span> <span class="token class-name">Some</span><span class="token punctuation">(</span>table<span class="token punctuation">)</span> <span class="token operator">=</span> statement<span class="token punctuation">.</span>table <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> stats <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>analyzer<span class="token punctuation">.</span><span class="token function">analyze_table</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>table<span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>            <span class="token macro property">println!</span><span class="token punctuation">(</span><span class="token string">"Analyzed table &#123;&#125;: &#123;&#125; rows, &#123;&#125; pages"</span><span class="token punctuation">,</span>                      table<span class="token 
punctuation">,</span> stats<span class="token punctuation">.</span>row_count<span class="token punctuation">,</span> stats<span class="token punctuation">.</span>page_count<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token comment">// Analyze all tables</span>            <span class="token keyword">for</span> table <span class="token keyword">in</span> <span class="token keyword">self</span><span class="token punctuation">.</span>catalog<span class="token punctuation">.</span><span class="token function">get_all_tables</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">let</span> stats <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>analyzer<span class="token punctuation">.</span><span class="token function">analyze_table</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>table<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token macro property">println!</span><span class="token punctuation">(</span><span class="token string">"Analyzed table &#123;&#125;: &#123;&#125; rows"</span><span class="token punctuation">,</span> table<span class="token punctuation">,</span> stats<span class="token punctuation">.</span>row_count<span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token 
punctuation">)</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="3-Cost-Models">3 Cost Models</h2><h3 id="The-Cost-Formula">The Cost Formula</h3><pre class="language-none"><code class="language-none">Total Cost &#x3D; CPU Cost + I&#x2F;O Cost + Memory Cost

Where:
- CPU Cost: Operations per row × number of rows
- I&#x2F;O Cost: Pages read&#x2F;written × page cost
- Memory Cost: Sort&#x2F;hash memory × memory cost factor</code></pre><hr /><h3 id="Operator-Cost-Models">Operator Cost Models</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/optimizer/cost_model.rs</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">CostModel</span> <span class="token punctuation">&#123;</span>    <span class="token comment">// Cost constants (tunable)</span>    <span class="token keyword">pub</span> seq_page_cost<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span>      <span class="token comment">// Cost of sequential page read</span>    <span class="token keyword">pub</span> random_page_cost<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span>   <span class="token comment">// Cost of random page read</span>    <span class="token keyword">pub</span> cpu_tuple_cost<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span>     <span class="token comment">// CPU cost per tuple</span>    <span class="token keyword">pub</span> cpu_index_tuple_cost<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span>  <span class="token comment">// CPU cost per index tuple</span>    <span 
class="token keyword">pub</span> cpu_operator_cost<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span>  <span class="token comment">// CPU cost per operator evaluation</span>    <span class="token keyword">pub</span> memory_cost_per_kb<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span> <span class="token comment">// Memory cost per KB</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">Default</span> <span class="token keyword">for</span> <span class="token class-name">CostModel</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">default</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>            seq_page_cost<span class="token punctuation">:</span> <span class="token number">1.0</span><span class="token punctuation">,</span>            random_page_cost<span class="token punctuation">:</span> <span class="token number">4.0</span><span class="token punctuation">,</span>  <span class="token comment">// Random I/O is ~4x slower</span>            cpu_tuple_cost<span class="token punctuation">:</span> <span class="token number">0.01</span><span class="token punctuation">,</span>            cpu_index_tuple_cost<span class="token punctuation">:</span> <span class="token number">0.005</span><span class="token punctuation">,</span>            cpu_operator_cost<span class="token punctuation">:</span> <span class="token number">0.0025</span><span class="token punctuation">,</span>            memory_cost_per_kb<span class="token 
punctuation">:</span> <span class="token number">0.001</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">CostModel</span> <span class="token punctuation">&#123;</span>    <span class="token comment">/// Cost of sequential scan</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">seq_scan_cost</span><span class="token punctuation">(</span>        <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span>        num_pages<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>        num_rows<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>        filter<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token operator">&amp;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Cost</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> io_cost <span class="token operator">=</span> num_pages <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>seq_page_cost<span class="token punctuation">;</span>        <span class="token keyword">let</span> cpu_cost <span class="token operator">=</span> num_rows <span class="token 
keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>cpu_tuple_cost<span class="token punctuation">;</span>                <span class="token keyword">let</span> filter_cost <span class="token operator">=</span> <span class="token keyword">if</span> <span class="token keyword">let</span> <span class="token class-name">Some</span><span class="token punctuation">(</span>_filter<span class="token punctuation">)</span> <span class="token operator">=</span> filter <span class="token punctuation">&#123;</span>            num_rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>cpu_operator_cost        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token number">0.0</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token class-name">Cost</span> <span class="token punctuation">&#123;</span>            startup<span class="token punctuation">:</span> <span class="token number">0.0</span><span class="token punctuation">,</span>            total<span class="token punctuation">:</span> io_cost <span class="token operator">+</span> cpu_cost <span class="token operator">+</span> filter_cost<span class="token punctuation">,</span>            rows<span class="token punctuation">:</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">estimate_rows_after_filter</span><span class="token punctuation">(</span>num_rows<span class="token punctuation">,</span> filter<span class="token punctuation">)</span><span class="token punctuation">,</span>            width<span class="token 
punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>  <span class="token comment">// Would be computed from schema</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">/// Cost of index scan</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">index_scan_cost</span><span class="token punctuation">(</span>        <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span>        index<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">IndexStatistics</span><span class="token punctuation">,</span>        table_pages<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>        condition<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">IndexCondition</span><span class="token punctuation">,</span>        num_rows<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>    <span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Cost</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Estimate how many index pages we need to read</span>        <span class="token keyword">let</span> index_selectivity <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">estimate_index_selectivity</span><span class="token punctuation">(</span>condition<span class="token punctuation">,</span> index<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span 
class="token keyword">let</span> index_pages_to_read <span class="token operator">=</span> <span class="token punctuation">(</span>index<span class="token punctuation">.</span>leaf_pages <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> index_selectivity<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">ceil</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">;</span>                <span class="token comment">// Estimate how many table pages we need to read</span>        <span class="token keyword">let</span> table_pages_to_read <span class="token operator">=</span> <span class="token keyword">if</span> index_selectivity <span class="token operator">></span> <span class="token number">0.3</span> <span class="token punctuation">&#123;</span>            <span class="token comment">// Over 30% of rows match → assume every table page is read</span>            table_pages        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token comment">// Few rows match → one random page fetch per matching row</span>            <span class="token punctuation">(</span>num_rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> index_selectivity<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">ceil</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> io_cost <span class="token 
operator">=</span> index_pages_to_read <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>random_page_cost            <span class="token operator">+</span> table_pages_to_read <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>random_page_cost<span class="token punctuation">;</span>                <span class="token keyword">let</span> cpu_cost <span class="token operator">=</span> <span class="token punctuation">(</span>num_rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> index_selectivity<span class="token punctuation">)</span> <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>cpu_index_tuple_cost<span class="token punctuation">;</span>        <span class="token class-name">Cost</span> <span class="token punctuation">&#123;</span>            startup<span class="token punctuation">:</span> <span class="token number">0.0</span><span class="token punctuation">,</span>            total<span class="token punctuation">:</span> io_cost <span class="token operator">+</span> cpu_cost<span class="token punctuation">,</span>            rows<span class="token punctuation">:</span> <span class="token punctuation">(</span>num_rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> index_selectivity<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">ceil</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token 
punctuation">,</span>            width<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">/// Cost of nested loop join</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">nested_loop_join_cost</span><span class="token punctuation">(</span>        <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span>        outer_cost<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Cost</span><span class="token punctuation">,</span>        inner_cost<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Cost</span><span class="token punctuation">,</span>        join_selectivity<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span>    <span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Cost</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Outer is scanned once</span>        <span class="token keyword">let</span> outer_total <span class="token operator">=</span> outer_cost<span class="token punctuation">.</span>total<span class="token punctuation">;</span>                <span class="token comment">// Inner is scanned once per outer row</span>        <span class="token keyword">let</span> inner_total <span class="token operator">=</span> inner_cost<span class="token punctuation">.</span>total <span class="token operator">*</span> outer_cost<span class="token punctuation">.</span>rows <span class="token keyword">as</span> <span class="token 
keyword">f64</span><span class="token punctuation">;</span>                <span class="token comment">// CPU cost for join condition evaluation</span>        <span class="token keyword">let</span> join_cpu_cost <span class="token operator">=</span> outer_cost<span class="token punctuation">.</span>rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> inner_cost<span class="token punctuation">.</span>rows <span class="token keyword">as</span> <span class="token keyword">f64</span>             <span class="token operator">*</span> join_selectivity <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>cpu_operator_cost<span class="token punctuation">;</span>        <span class="token class-name">Cost</span> <span class="token punctuation">&#123;</span>            startup<span class="token punctuation">:</span> outer_cost<span class="token punctuation">.</span>startup<span class="token punctuation">,</span>            total<span class="token punctuation">:</span> outer_total <span class="token operator">+</span> inner_total <span class="token operator">+</span> join_cpu_cost<span class="token punctuation">,</span>            rows<span class="token punctuation">:</span> <span class="token punctuation">(</span>outer_cost<span class="token punctuation">.</span>rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> inner_cost<span class="token punctuation">.</span>rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> join_selectivity<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">ceil</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span 
class="token punctuation">,</span>            width<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">/// Cost of hash join</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">hash_join_cost</span><span class="token punctuation">(</span>        <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span>        left_cost<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Cost</span><span class="token punctuation">,</span>        right_cost<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Cost</span><span class="token punctuation">,</span>        join_selectivity<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span>    <span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Cost</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Build phase: scan and hash the smaller relation</span>        <span class="token keyword">let</span> build_cost <span class="token operator">=</span> left_cost<span class="token punctuation">.</span>total<span class="token punctuation">;</span>        <span class="token keyword">let</span> build_memory <span class="token operator">=</span> left_cost<span class="token punctuation">.</span>rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> <span class="token number">32.0</span><span class="token punctuation">;</span>  <span class="token comment">// Estimate 
32 bytes per row</span>                <span class="token comment">// Probe phase: scan the larger relation and probe hash table</span>        <span class="token keyword">let</span> probe_cost <span class="token operator">=</span> right_cost<span class="token punctuation">.</span>total<span class="token punctuation">;</span>        <span class="token keyword">let</span> probe_cpu <span class="token operator">=</span> right_cost<span class="token punctuation">.</span>rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>cpu_operator_cost<span class="token punctuation">;</span>                <span class="token comment">// Output cost</span>        <span class="token keyword">let</span> output_rows <span class="token operator">=</span> left_cost<span class="token punctuation">.</span>rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> right_cost<span class="token punctuation">.</span>rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> join_selectivity<span class="token punctuation">;</span>        <span class="token keyword">let</span> output_cpu <span class="token operator">=</span> output_rows <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>cpu_tuple_cost<span class="token punctuation">;</span>        <span class="token class-name">Cost</span> <span class="token punctuation">&#123;</span>            startup<span class="token punctuation">:</span> build_cost <span class="token operator">+</span> build_memory <span class="token operator">/</span> <span class="token number">1024.0</span> <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>memory_cost_per_kb<span class="token punctuation">,</span>  <span class="token comment">// build_memory is in bytes; convert to KB</span>            total<span class="token 
punctuation">:</span> build_cost <span class="token operator">+</span> probe_cost <span class="token operator">+</span> probe_cpu <span class="token operator">+</span> output_cpu<span class="token punctuation">,</span>            rows<span class="token punctuation">:</span> output_rows<span class="token punctuation">.</span><span class="token function">ceil</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>            width<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">/// Cost of sort</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">sort_cost</span><span class="token punctuation">(</span>        <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span>        input_cost<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Cost</span><span class="token punctuation">,</span>        num_rows<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>        sort_keys<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">SortKey</span><span class="token punctuation">]</span><span class="token punctuation">,</span>    <span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Cost</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> input_total <span 
class="token operator">=</span> input_cost<span class="token punctuation">.</span>total<span class="token punctuation">;</span>                <span class="token comment">// Check if sort fits in memory</span>        <span class="token keyword">let</span> sort_memory <span class="token operator">=</span> num_rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> <span class="token number">64.0</span><span class="token punctuation">;</span>  <span class="token comment">// Estimate 64 bytes per row</span>        <span class="token keyword">let</span> work_mem <span class="token operator">=</span> <span class="token number">4.0</span> <span class="token operator">*</span> <span class="token number">1024.0</span> <span class="token operator">*</span> <span class="token number">1024.0</span><span class="token punctuation">;</span>  <span class="token comment">// 4 MB work memory, in bytes</span>        <span class="token comment">// u64 has no log2(): convert to f64 first, clamping to 1 so log2(0) cannot yield NaN</span>        <span class="token keyword">let</span> log_rows <span class="token operator">=</span> <span class="token punctuation">(</span>num_rows<span class="token punctuation">.</span><span class="token function">max</span><span class="token punctuation">(</span><span class="token number">1</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">log2</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">let</span> sort_cpu <span class="token operator">=</span> <span class="token keyword">if</span> sort_memory <span class="token operator">&lt;=</span> work_mem <span class="token punctuation">&#123;</span>            <span class="token comment">// In-memory sort: O(n log n)</span>            num_rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> log_rows <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>cpu_operator_cost        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token comment">// External sort: 2 passes</span>            <span class="token punctuation">(</span>num_rows <span 
class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> log_rows <span class="token operator">*</span> <span class="token number">2.0</span><span class="token punctuation">)</span> <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>cpu_operator_cost                <span class="token operator">+</span> sort_memory <span class="token operator">/</span> <span class="token number">1024.0</span> <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>memory_cost_per_kb  <span class="token comment">// sort_memory is in bytes; convert to KB</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> cpu_per_key <span class="token operator">=</span> sort_keys<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>cpu_operator_cost<span class="token punctuation">;</span>        <span class="token class-name">Cost</span> <span class="token punctuation">&#123;</span>            startup<span class="token punctuation">:</span> input_total <span class="token operator">+</span> sort_cpu <span class="token operator">+</span> num_rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> cpu_per_key<span class="token punctuation">,</span>            total<span class="token punctuation">:</span> input_total <span class="token operator">+</span> sort_cpu <span class="token operator">+</span> num_rows <span class="token keyword">as</span> <span class="token 
keyword">f64</span> <span class="token operator">*</span> cpu_per_key<span class="token punctuation">,</span>            rows<span class="token punctuation">:</span> num_rows<span class="token punctuation">,</span>            width<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">/// Cost of hash aggregate</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">hash_aggregate_cost</span><span class="token punctuation">(</span>        <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span>        input_cost<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Cost</span><span class="token punctuation">,</span>        num_groups<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>        num_aggregates<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>    <span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Cost</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> input_total <span class="token operator">=</span> input_cost<span class="token punctuation">.</span>total<span class="token punctuation">;</span>                <span class="token comment">// Build hash table of groups</span>        <span class="token keyword">let</span> build_memory <span class="token operator">=</span> num_groups <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> <span class="token 
number">64.0</span><span class="token punctuation">;</span>  <span class="token comment">// Estimate 64 bytes per group</span>        <span class="token keyword">let</span> build_cpu <span class="token operator">=</span> input_cost<span class="token punctuation">.</span>rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>cpu_operator_cost<span class="token punctuation">;</span>                <span class="token comment">// Aggregate computation</span>        <span class="token keyword">let</span> aggregate_cpu <span class="token operator">=</span> num_groups <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> num_aggregates <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>cpu_operator_cost<span class="token punctuation">;</span>        <span class="token class-name">Cost</span> <span class="token punctuation">&#123;</span>            startup<span class="token punctuation">:</span> input_total <span class="token operator">+</span> build_memory <span class="token operator">/</span> <span class="token number">1024.0</span> <span class="token operator">*</span> <span class="token keyword">self</span><span class="token punctuation">.</span>memory_cost_per_kb<span class="token punctuation">,</span>  <span class="token comment">// build_memory is in bytes; convert to KB</span>            total<span class="token punctuation">:</span> input_total <span class="token operator">+</span> build_cpu <span class="token operator">+</span> aggregate_cpu<span class="token punctuation">,</span>            rows<span class="token punctuation">:</span> num_groups<span class="token punctuation">,</span>            width<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token 
keyword">fn</span> <span class="token function-definition function">estimate_rows_after_filter</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> num_rows<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span> filter<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token operator">&amp;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">u64</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">match</span> filter <span class="token punctuation">&#123;</span>            <span class="token class-name">None</span> <span class="token operator">=></span> num_rows<span class="token punctuation">,</span>            <span class="token class-name">Some</span><span class="token punctuation">(</span>expr<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">let</span> selectivity <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">estimate_selectivity</span><span class="token punctuation">(</span>expr<span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token punctuation">(</span>num_rows <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">*</span> selectivity<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">ceil</span><span class="token 
punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u64</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">estimate_selectivity</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> expr<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Expression</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">f64</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Simplified selectivity estimation</span>        <span class="token comment">// In practice, this would use statistics and histograms</span>        <span class="token keyword">match</span> expr <span class="token punctuation">&#123;</span>            <span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span> op<span class="token punctuation">,</span> <span class="token punctuation">..</span> <span class="token punctuation">&#125;</span> <span class="token operator">=></span> <span class="token keyword">match</span> op <span class="token punctuation">&#123;</span>                <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Eq</span> <span class="token operator">=></span> <span class="token number">0.01</span><span class="token punctuation">,</span>  <span class="token comment">// Assume 1% match</span>                <span class="token 
class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Lt</span> <span class="token operator">|</span> <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Gt</span> <span class="token operator">=></span> <span class="token number">0.33</span><span class="token punctuation">,</span>  <span class="token comment">// Assume 1/3 match</span>                <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Lte</span> <span class="token operator">|</span> <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Gte</span> <span class="token operator">=></span> <span class="token number">0.5</span><span class="token punctuation">,</span>  <span class="token comment">// Assume 1/2 match</span>                <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">And</span> <span class="token operator">=></span> <span class="token number">0.1</span><span class="token punctuation">,</span>                <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Or</span> <span class="token operator">=></span> <span class="token number">0.5</span><span class="token punctuation">,</span>                _ <span class="token operator">=></span> <span class="token number">0.5</span><span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>            _ <span class="token operator">=></span> <span class="token number">0.5</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span 
class="token function-definition function">estimate_index_selectivity</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> condition<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">IndexCondition</span><span class="token punctuation">,</span> index<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">IndexStatistics</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">f64</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">match</span> condition <span class="token punctuation">&#123;</span>            <span class="token class-name">IndexCondition</span><span class="token punctuation">::</span><span class="token class-name">Eq</span><span class="token punctuation">(</span>_<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token comment">// Equality: 1 / distinct_keys</span>                <span class="token number">1.0</span> <span class="token operator">/</span> index<span class="token punctuation">.</span>distinct_keys<span class="token punctuation">.</span><span class="token function">max</span><span class="token punctuation">(</span><span class="token number">1</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">IndexCondition</span><span class="token punctuation">::</span><span class="token class-name">Range</span> <span class="token punctuation">&#123;</span> <span class="token punctuation">..</span> <span class="token punctuation">&#125;</span> <span 
class="token operator">=></span> <span class="token punctuation">&#123;</span>
                <span class="token comment">// Range: estimate 10% of index</span>
                <span class="token number">0.1</span>
            <span class="token punctuation">&#125;</span>
            <span class="token class-name">IndexCondition</span><span class="token punctuation">::</span><span class="token class-name">InList</span><span class="token punctuation">(</span>values<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>
                <span class="token comment">// IN list: |values| / distinct_keys, capped at 1.0 so a long list</span>
                <span class="token comment">// can never select more rows than the index holds</span>
                <span class="token punctuation">(</span>values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span> <span class="token operator">/</span> index<span class="token punctuation">.</span>distinct_keys<span class="token punctuation">.</span><span class="token function">max</span><span class="token punctuation">(</span><span class="token number">1</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">min</span><span class="token punctuation">(</span><span class="token number">1.0</span><span class="token punctuation">)</span>
            <span class="token punctuation">&#125;</span>
        <span class="token punctuation">&#125;</span>
    <span class="token punctuation">&#125;</span>
<span class="token punctuation">&#125;</span>

<span class="token attribute attr-name">#[derive(Debug, Clone)]</span>
<span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Cost</span> <span class="token punctuation">&#123;</span>
    <span class="token keyword">pub</span> startup<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span>    <span class="token comment">// Cost to return first row</span>   
 <span class="token keyword">pub</span> total<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span>      <span class="token comment">// Cost to return all rows</span>
    <span class="token keyword">pub</span> rows<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>       <span class="token comment">// Estimated output rows</span>
    <span class="token keyword">pub</span> width<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>    <span class="token comment">// Estimated row width in bytes</span>
<span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="4-Join-Ordering-with-Dynamic-Programming">4 Join Ordering with Dynamic Programming</h2><h3 id="The-Join-Ordering-Problem">The Join Ordering Problem</h3><p><strong>For n tables, there are n! left-deep join orders (and more once bushy trees are allowed):</strong></p><pre class="language-none"><code class="language-none">3 tables: 3! &#x3D; 6 orders
5 tables: 5! &#x3D; 120 orders
10 tables: 10! &#x3D; 3,628,800 orders</code></pre><p><strong>Brute-force enumeration quickly becomes impractical.</strong> Dynamic programming avoids the blow-up by computing the best plan for each subset of tables once and reusing it when building larger subsets.</p><hr /><h3 id="DP-Join-Ordering-Algorithm">DP Join Ordering Algorithm</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/optimizer/join_order.rs</span>
<span class="token keyword">use</span> <span class="token namespace">std<span class="token punctuation">::</span>collections<span class="token punctuation">::</span></span><span class="token class-name">HashMap</span><span class="token punctuation">;</span>
<span class="token keyword">use</span> <span class="token namespace">std<span class="token punctuation">::</span>sync<span class="token punctuation">::</span></span><span class="token class-name">Arc</span><span class="token punctuation">;</span>

<span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">JoinOrderOptimizer</span> <span class="token punctuation">&#123;</span>
    cost_model<span class="token punctuation">:</span> <span class="token class-name">CostModel</span><span class="token punctuation">,</span>
    statistics<span class="token punctuation">:</span> <span class="token class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">StatisticsCatalog</span><span class="token operator">></span><span class="token punctuation">,</span>
<span class="token punctuation">&#125;</span>

<span class="token keyword">impl</span> <span class="token class-name">JoinOrderOptimizer</span> <span class="token punctuation">&#123;</span>
    <span class="token comment">/// Find the best join order using dynamic programming</span>
    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">optimize</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> tables<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">String</span><span class="token punctuation">]</span><span 
class="token punctuation">,</span> conditions<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">JoinCondition</span><span class="token punctuation">]</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">PhysicalPlan</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> n <span class="token operator">=</span> tables<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// dp[i] = best plan for subset represented by bitmask i</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> dp<span class="token punctuation">:</span> <span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token keyword">u64</span><span class="token punctuation">,</span> <span class="token class-name">PhysicalPlan</span><span class="token operator">></span> <span class="token operator">=</span> <span class="token class-name">HashMap</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> costs<span class="token punctuation">:</span> <span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token keyword">u64</span><span class="token punctuation">,</span> <span class="token keyword">f64</span><span class="token operator">></span> <span class="token operator">=</span> <span class="token class-name">HashMap</span><span class="token 
punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// Base case: single table scans</span>        <span class="token keyword">for</span> <span class="token punctuation">(</span>i<span class="token punctuation">,</span> table<span class="token punctuation">)</span> <span class="token keyword">in</span> tables<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">enumerate</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> mask <span class="token operator">=</span> <span class="token number">1u64</span> <span class="token operator">&lt;&lt;</span> i<span class="token punctuation">;</span>            <span class="token keyword">let</span> plan <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">create_scan_plan</span><span class="token punctuation">(</span>table<span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token keyword">let</span> cost <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">estimate_cost</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>plan<span class="token punctuation">)</span><span class="token punctuation">;</span>                        dp<span class="token punctuation">.</span><span class="token function">insert</span><span class="token punctuation">(</span>mask<span class="token 
punctuation">,</span> plan<span class="token punctuation">)</span><span class="token punctuation">;</span>            costs<span class="token punctuation">.</span><span class="token function">insert</span><span class="token punctuation">(</span>mask<span class="token punctuation">,</span> cost<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token comment">// Build up larger subsets</span>        <span class="token keyword">for</span> size <span class="token keyword">in</span> <span class="token number">2</span><span class="token punctuation">..=</span>n <span class="token punctuation">&#123;</span>            <span class="token keyword">for</span> subset <span class="token keyword">in</span> <span class="token keyword">Self</span><span class="token punctuation">::</span><span class="token function">subsets_of_size</span><span class="token punctuation">(</span>n<span class="token punctuation">,</span> size<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">let</span> subset_mask <span class="token operator">=</span> <span class="token keyword">Self</span><span class="token punctuation">::</span><span class="token function">subset_to_mask</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>subset<span class="token punctuation">)</span><span class="token punctuation">;</span>                                <span class="token comment">// Try all ways to split this subset</span>                <span class="token keyword">let</span> <span class="token keyword">mut</span> best_plan<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">PhysicalPlan</span><span class="token operator">></span> <span class="token operator">=</span> <span class="token 
class-name">None</span><span class="token punctuation">;</span>                <span class="token keyword">let</span> <span class="token keyword">mut</span> best_cost <span class="token operator">=</span> <span class="token keyword">f64</span><span class="token punctuation">::</span><span class="token constant">INFINITY</span><span class="token punctuation">;</span>                                <span class="token keyword">for</span> split <span class="token keyword">in</span> <span class="token keyword">Self</span><span class="token punctuation">::</span><span class="token function">split_subset</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>subset<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                    <span class="token keyword">let</span> left_mask <span class="token operator">=</span> <span class="token keyword">Self</span><span class="token punctuation">::</span><span class="token function">subset_to_mask</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>split<span class="token number">.0</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                    <span class="token keyword">let</span> right_mask <span class="token operator">=</span> <span class="token keyword">Self</span><span class="token punctuation">::</span><span class="token function">subset_to_mask</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>split<span class="token number">.1</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                                        <span class="token keyword">if</span> <span class="token keyword">let</span> <span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span>left_plan<span class="token punctuation">)</span><span class="token punctuation">,</span> <span 
class="token class-name">Some</span><span class="token punctuation">(</span>right_plan<span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token operator">=</span>                         <span class="token punctuation">(</span>dp<span class="token punctuation">.</span><span class="token function">get</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>left_mask<span class="token punctuation">)</span><span class="token punctuation">,</span> dp<span class="token punctuation">.</span><span class="token function">get</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>right_mask<span class="token punctuation">)</span><span class="token punctuation">)</span>                     <span class="token punctuation">&#123;</span>                        <span class="token comment">// Try different join algorithms</span>                        <span class="token keyword">for</span> join_plan <span class="token keyword">in</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">create_join_plans</span><span class="token punctuation">(</span>                            left_plan<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                            right_plan<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                            conditions<span class="token punctuation">,</span>                        <span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                            <span class="token keyword">let</span> cost <span class="token operator">=</span> <span 
class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">estimate_cost</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>join_plan<span class="token punctuation">)</span><span class="token punctuation">;</span>                            <span class="token keyword">if</span> cost <span class="token operator">&lt;</span> best_cost <span class="token punctuation">&#123;</span>                                best_cost <span class="token operator">=</span> cost<span class="token punctuation">;</span>                                best_plan <span class="token operator">=</span> <span class="token class-name">Some</span><span class="token punctuation">(</span>join_plan<span class="token punctuation">)</span><span class="token punctuation">;</span>                            <span class="token punctuation">&#125;</span>                        <span class="token punctuation">&#125;</span>                    <span class="token punctuation">&#125;</span>                <span class="token punctuation">&#125;</span>                                <span class="token keyword">if</span> <span class="token keyword">let</span> <span class="token class-name">Some</span><span class="token punctuation">(</span>plan<span class="token punctuation">)</span> <span class="token operator">=</span> best_plan <span class="token punctuation">&#123;</span>                    dp<span class="token punctuation">.</span><span class="token function">insert</span><span class="token punctuation">(</span>subset_mask<span class="token punctuation">,</span> plan<span class="token punctuation">)</span><span class="token punctuation">;</span>                    costs<span class="token punctuation">.</span><span class="token function">insert</span><span class="token punctuation">(</span>subset_mask<span class="token punctuation">,</span> best_cost<span class="token punctuation">)</span><span class="token punctuation">;</span> 
               <span class="token punctuation">&#125;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>                <span class="token comment">// Return the best plan for all tables</span>        <span class="token keyword">let</span> all_mask <span class="token operator">=</span> <span class="token punctuation">(</span><span class="token number">1u64</span> <span class="token operator">&lt;&lt;</span> n<span class="token punctuation">)</span> <span class="token operator">-</span> <span class="token number">1</span><span class="token punctuation">;</span>        dp<span class="token punctuation">.</span><span class="token function">remove</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>all_mask<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">unwrap</span><span class="token punctuation">(</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">create_scan_plan</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> table<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">PhysicalPlan</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Check if we have useful indexes</span>        <span class="token keyword">let</span> stats <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>statistics<span class="token punctuation">.</span><span class="token function">get_table</span><span 
class="token punctuation">(</span>table<span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">if</span> <span class="token keyword">let</span> <span class="token class-name">Some</span><span class="token punctuation">(</span>index<span class="token punctuation">)</span> <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">find_useful_index</span><span class="token punctuation">(</span>table<span class="token punctuation">,</span> <span class="token operator">&amp;</span>stats<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">PhysicalPlan</span><span class="token punctuation">::</span><span class="token class-name">IndexScan</span> <span class="token punctuation">&#123;</span>                table<span class="token punctuation">:</span> table<span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                alias<span class="token punctuation">:</span> <span class="token class-name">None</span><span class="token punctuation">,</span>                index<span class="token punctuation">:</span> index<span class="token punctuation">.</span>name<span class="token punctuation">,</span>                columns<span class="token punctuation">:</span> <span class="token macro property">vec!</span><span class="token punctuation">[</span><span class="token punctuation">]</span><span class="token punctuation">,</span>  <span class="token comment">// All columns</span>                condition<span class="token punctuation">:</span> <span class="token class-name">IndexCondition</span><span class="token punctuation">::</span><span class="token class-name">Range</span> <span class="token 
punctuation">&#123;</span> low<span class="token punctuation">:</span> <span class="token class-name">None</span><span class="token punctuation">,</span> high<span class="token punctuation">:</span> <span class="token class-name">None</span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">PhysicalPlan</span><span class="token punctuation">::</span><span class="token class-name">SeqScan</span> <span class="token punctuation">&#123;</span>                table<span class="token punctuation">:</span> table<span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                alias<span class="token punctuation">:</span> <span class="token class-name">None</span><span class="token punctuation">,</span>                columns<span class="token punctuation">:</span> <span class="token macro property">vec!</span><span class="token punctuation">[</span><span class="token punctuation">]</span><span class="token punctuation">,</span>                filter<span class="token punctuation">:</span> <span class="token class-name">None</span><span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">create_join_plans</span><span class="token punctuation">(</span>        <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span>        left<span class="token 
punctuation">:</span> <span class="token class-name">PhysicalPlan</span><span class="token punctuation">,</span>        right<span class="token punctuation">:</span> <span class="token class-name">PhysicalPlan</span><span class="token punctuation">,</span>        conditions<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">JoinCondition</span><span class="token punctuation">]</span><span class="token punctuation">,</span>    <span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">PhysicalPlan</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> plans <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// Get join condition</span>        <span class="token keyword">let</span> condition <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">find_join_condition</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>left<span class="token punctuation">,</span> <span class="token operator">&amp;</span>right<span class="token punctuation">,</span> conditions<span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// Nested loop join (always possible)</span>        plans<span class="token punctuation">.</span><span class="token function">push</span><span 
class="token punctuation">(</span><span class="token class-name">PhysicalPlan</span><span class="token punctuation">::</span><span class="token class-name">NestedLoopJoin</span> <span class="token punctuation">&#123;</span>
            left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>left<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
            right<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>right<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
            condition<span class="token punctuation">:</span> condition<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
            join_type<span class="token punctuation">:</span> <span class="token class-name">JoinType</span><span class="token punctuation">::</span><span class="token class-name">Inner</span><span class="token punctuation">,</span>
        <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">;</span>

        <span class="token comment">// Hash join (if equi-join)</span>
        <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span
class="token function">is_equi_join</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>condition<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>
            <span class="token comment">// Clone `right` and `condition`: the merge-join check below still needs them</span>
            plans<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token class-name">PhysicalPlan</span><span class="token punctuation">::</span><span class="token class-name">HashJoin</span> <span class="token punctuation">&#123;</span>
                left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>left<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
                right<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>right<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
                condition<span class="token punctuation">:</span> condition<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">unwrap</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
                join_type<span class="token punctuation">:</span> <span class="token class-name">JoinType</span><span class="token punctuation">::</span><span class="token class-name">Inner</span><span class="token punctuation">,</span>
            <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token
punctuation">&#125;</span>

        <span class="token comment">// Merge join (if inputs can be sorted on join keys)</span>
        <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">can_merge_join</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>left<span class="token punctuation">,</span> <span class="token operator">&amp;</span>right<span class="token punctuation">,</span> <span class="token operator">&amp;</span>condition<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>
            plans<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token class-name">PhysicalPlan</span><span class="token punctuation">::</span><span class="token class-name">MergeJoin</span> <span class="token punctuation">&#123;</span>
                left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>left<span class="token punctuation">)</span><span class="token punctuation">,</span>
                right<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>right<span class="token punctuation">)</span><span class="token punctuation">,</span>
                condition<span class="token punctuation">:</span> condition<span class="token punctuation">.</span><span class="token function">unwrap</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
                join_type<span class="token punctuation">:</span> <span class="token class-name">JoinType</span><span class="token
punctuation">::</span><span class="token class-name">Inner</span><span class="token punctuation">,</span>
            <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token punctuation">&#125;</span>

        plans
    <span class="token punctuation">&#125;</span>

    <span class="token keyword">fn</span> <span class="token function-definition function">subsets_of_size</span><span class="token punctuation">(</span>n<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span> size<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">usize</span><span class="token operator">>></span> <span class="token punctuation">&#123;</span>
        <span class="token comment">// Generate all subsets of &#123;0, 1, ..., n-1&#125; with given size</span>
        <span class="token keyword">let</span> <span class="token keyword">mut</span> result <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token keyword">Self</span><span class="token punctuation">::</span><span class="token function">generate_subsets</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">,</span> n<span class="token punctuation">,</span> size<span class="token punctuation">,</span> <span class="token operator">&amp;</span><span class="token
keyword">mut</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> result<span class="token punctuation">)</span><span class="token punctuation">;</span>
        result
    <span class="token punctuation">&#125;</span>

    <span class="token keyword">fn</span> <span class="token function-definition function">generate_subsets</span><span class="token punctuation">(</span>
        start<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>
        n<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>
        size<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span>
        current<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">usize</span><span class="token operator">></span><span class="token punctuation">,</span>
        result<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">usize</span><span class="token operator">>></span><span class="token punctuation">,</span>
    <span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>
        <span class="token keyword">if</span> current<span class="token punctuation">.</span><span class="token
function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">==</span> size <span class="token punctuation">&#123;</span>
            result<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span>current<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
            <span class="token keyword">return</span><span class="token punctuation">;</span>
        <span class="token punctuation">&#125;</span>

        <span class="token keyword">for</span> i <span class="token keyword">in</span> start<span class="token punctuation">..</span>n <span class="token punctuation">&#123;</span>
            current<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span>i<span class="token punctuation">)</span><span class="token punctuation">;</span>
            <span class="token keyword">Self</span><span class="token punctuation">::</span><span class="token function">generate_subsets</span><span class="token punctuation">(</span>i <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">,</span> n<span class="token punctuation">,</span> size<span class="token punctuation">,</span> current<span class="token punctuation">,</span> result<span class="token punctuation">)</span><span class="token punctuation">;</span>
            current<span class="token punctuation">.</span><span class="token function">pop</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token punctuation">&#125;</span>
    <span class="token punctuation">&#125;</span>

    <span class="token
keyword">fn</span> <span class="token function-definition function">split_subset</span><span class="token punctuation">(</span>subset<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token keyword">usize</span><span class="token punctuation">]</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">usize</span><span class="token operator">></span><span class="token punctuation">,</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">usize</span><span class="token operator">></span><span class="token punctuation">)</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>
        <span class="token comment">// Generate all non-empty proper splits of the subset</span>
        <span class="token keyword">let</span> n <span class="token operator">=</span> subset<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token keyword">let</span> <span class="token keyword">mut</span> splits <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>

        <span class="token comment">// Use bitmask to generate all splits</span>
        <span class="token keyword">for</span> mask <span class="token keyword">in</span> <span class="token number">1</span><span
class="token punctuation">..</span><span class="token punctuation">(</span><span class="token number">1</span> <span class="token operator">&lt;&lt;</span> <span class="token punctuation">(</span>n <span class="token operator">-</span> <span class="token number">1</span><span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>
            <span class="token keyword">let</span> <span class="token keyword">mut</span> left <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
            <span class="token keyword">let</span> <span class="token keyword">mut</span> right <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>

            <span class="token keyword">for</span> i <span class="token keyword">in</span> <span class="token number">0</span><span class="token punctuation">..</span>n <span class="token punctuation">&#123;</span>
                <span class="token keyword">if</span> i <span class="token operator">==</span> n <span class="token operator">-</span> <span class="token number">1</span> <span class="token punctuation">&#123;</span>
                    right<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span>subset<span class="token punctuation">[</span>i<span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
                <span class="token punctuation">&#125;</span> <span class="token
keyword">else</span> <span class="token keyword">if</span> mask <span class="token operator">&amp;</span> <span class="token punctuation">(</span><span class="token number">1</span> <span class="token operator">&lt;&lt;</span> i<span class="token punctuation">)</span> <span class="token operator">!=</span> <span class="token number">0</span> <span class="token punctuation">&#123;</span>
                    left<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span>subset<span class="token punctuation">[</span>i<span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
                <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>
                    right<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span>subset<span class="token punctuation">[</span>i<span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
                <span class="token punctuation">&#125;</span>
            <span class="token punctuation">&#125;</span>

            <span class="token keyword">if</span> <span class="token operator">!</span>left<span class="token punctuation">.</span><span class="token function">is_empty</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">&amp;&amp;</span> <span class="token operator">!</span>right<span class="token punctuation">.</span><span class="token function">is_empty</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>
                splits<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span
class="token punctuation">(</span>left<span class="token punctuation">,</span> right<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
            <span class="token punctuation">&#125;</span>
        <span class="token punctuation">&#125;</span>

        splits
    <span class="token punctuation">&#125;</span>

    <span class="token keyword">fn</span> <span class="token function-definition function">subset_to_mask</span><span class="token punctuation">(</span>subset<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token keyword">usize</span><span class="token punctuation">]</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">u64</span> <span class="token punctuation">&#123;</span>
        <span class="token keyword">let</span> <span class="token keyword">mut</span> mask <span class="token operator">=</span> <span class="token number">0u64</span><span class="token punctuation">;</span>
        <span class="token keyword">for</span> <span class="token operator">&amp;</span>i <span class="token keyword">in</span> subset <span class="token punctuation">&#123;</span>
            mask <span class="token operator">|=</span> <span class="token number">1u64</span> <span class="token operator">&lt;&lt;</span> i<span class="token punctuation">;</span>
        <span class="token punctuation">&#125;</span>
        mask
    <span class="token punctuation">&#125;</span>
<span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="Join-Ordering-Example">Join Ordering Example</h3><pre class="language-sql" data-language="sql"><code class="language-sql"><span class="token keyword">SELECT</span> <span class="token operator">*</span>
<span class="token keyword">FROM</span> users u
<span class="token keyword">JOIN</span> orders o <span class="token
keyword">ON</span> u<span class="token punctuation">.</span>id <span class="token operator">=</span> o<span class="token punctuation">.</span>user_id
<span class="token keyword">JOIN</span> products p <span class="token keyword">ON</span> o<span class="token punctuation">.</span>product_id <span class="token operator">=</span> p<span class="token punctuation">.</span>id
<span class="token keyword">WHERE</span> u<span class="token punctuation">.</span>balance <span class="token operator">></span> <span class="token number">100</span></code></pre><p><strong>Dynamic Programming Progress:</strong></p><pre class="language-none"><code class="language-none">Iteration 1 (single tables):
  &#123;users&#125;: SeqScan cost&#x3D;100, rows&#x3D;10000
  &#123;orders&#125;: IndexScan cost&#x3D;50, rows&#x3D;50000
  &#123;products&#125;: SeqScan cost&#x3D;10, rows&#x3D;1000

Iteration 2 (two tables):
  &#123;users, orders&#125;:
    - users ⋈ orders (hash): cost&#x3D;600, rows&#x3D;5000
    - orders ⋈ users (nested): cost&#x3D;800, rows&#x3D;5000
    → Best: HashJoin cost&#x3D;600

  &#123;orders, products&#125;:
    - orders ⋈ products (hash): cost&#x3D;200, rows&#x3D;10000
    → Best: HashJoin cost&#x3D;200

Iteration 3 (three tables):
  &#123;users, orders, products&#125;:
    - &#123;users, orders&#125; ⋈ products: cost&#x3D;800, rows&#x3D;1000
    - &#123;orders, products&#125; ⋈ users: cost&#x3D;700, rows&#x3D;1000 ← Best!
    - users ⋈ &#123;orders, products&#125;: cost&#x3D;900, rows&#x3D;1000
    → Best: (orders ⋈ products) ⋈ users</code></pre><p><strong>Final Plan:</strong></p><pre class="language-none"><code class="language-none">HashJoin (u.id &#x3D; o.user_id)
  ├─ HashJoin (o.product_id &#x3D; p.id)
  │   ├─ IndexScan (orders)
  │   └─ SeqScan (products)
  └─ SeqScan (users) [balance &gt; 100]</code></pre><hr /><h2 id="5-Index-Selection">5 Index Selection</h2><h3 id="Choosing-the-Right-Index">Choosing the Right Index</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/optimizer/index_selector.rs</span>
<span class="token keyword">use</span> <span class="token namespace">std<span class="token punctuation">::</span>sync<span class="token punctuation">::</span></span><span class="token class-name">Arc</span><span class="token punctuation">;</span>

<span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">IndexSelector</span> <span class="token punctuation">&#123;</span>
    statistics<span class="token punctuation">:</span> <span class="token class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">StatisticsCatalog</span><span class="token operator">></span><span class="token punctuation">,</span>
    cost_model<span class="token punctuation">:</span> <span class="token class-name">CostModel</span><span class="token punctuation">,</span>
<span class="token punctuation">&#125;</span>

<span class="token keyword">impl</span> <span class="token class-name">IndexSelector</span> <span class="token punctuation">&#123;</span>
    <span class="token comment">/// Find the best index for a query</span>
    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">select_index</span><span class="token punctuation">(</span>
        <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span>
        table<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token
keyword">str</span><span class="token punctuation">,</span>
        predicates<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Expression</span><span class="token punctuation">]</span><span class="token punctuation">,</span>
    <span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">IndexSelection</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>
        <span class="token keyword">let</span> indexes <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>statistics<span class="token punctuation">.</span><span class="token function">get_table_indexes</span><span class="token punctuation">(</span>table<span class="token punctuation">)</span><span class="token punctuation">;</span>

        <span class="token keyword">let</span> <span class="token keyword">mut</span> best_index<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">IndexSelection</span><span class="token operator">></span> <span class="token operator">=</span> <span class="token class-name">None</span><span class="token punctuation">;</span>
        <span class="token keyword">let</span> <span class="token keyword">mut</span> best_score <span class="token operator">=</span> <span class="token number">0.0</span><span class="token punctuation">;</span>

        <span class="token keyword">for</span> index <span class="token keyword">in</span> indexes <span class="token punctuation">&#123;</span>
            <span class="token keyword">let</span> score <span class="token operator">=</span> <span class="token keyword">self</span><span class="token
punctuation">.</span><span class="token function">score_index</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>index<span class="token punctuation">,</span> predicates<span class="token punctuation">)</span><span class="token punctuation">;</span>
            <span class="token keyword">if</span> score <span class="token operator">></span> best_score <span class="token punctuation">&#123;</span>
                best_score <span class="token operator">=</span> score<span class="token punctuation">;</span>
                best_index <span class="token operator">=</span> <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">IndexSelection</span> <span class="token punctuation">&#123;</span>
                    index<span class="token punctuation">:</span> index<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
                    score<span class="token punctuation">,</span>
                    usable_predicates<span class="token punctuation">:</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">find_usable_predicates</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>index<span class="token punctuation">,</span> predicates<span class="token punctuation">)</span><span class="token punctuation">,</span>
                <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
            <span class="token punctuation">&#125;</span>
        <span class="token punctuation">&#125;</span>

        best_index
    <span class="token punctuation">&#125;</span>

    <span class="token keyword">fn</span> <span class="token function-definition function">score_index</span><span
class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> index<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">IndexStatistics</span><span class="token punctuation">,</span> predicates<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Expression</span><span class="token punctuation">]</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">f64</span> <span class="token punctuation">&#123;</span>
        <span class="token keyword">let</span> <span class="token keyword">mut</span> score <span class="token operator">=</span> <span class="token number">0.0</span><span class="token punctuation">;</span>

        <span class="token comment">// Check if index columns are used in predicates</span>
        <span class="token keyword">for</span> <span class="token punctuation">(</span>i<span class="token punctuation">,</span> col<span class="token punctuation">)</span> <span class="token keyword">in</span> index<span class="token punctuation">.</span>columns<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">enumerate</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>
            <span class="token keyword">for</span> predicate <span class="token keyword">in</span> predicates <span class="token punctuation">&#123;</span>
                <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token
function">predicate_uses_column</span><span class="token punctuation">(</span>predicate<span class="token punctuation">,</span> col<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>
                    <span class="token comment">// Earlier columns in index are more valuable</span>
                    <span class="token keyword">let</span> position_weight <span class="token operator">=</span> <span class="token number">1.0</span> <span class="token operator">/</span> <span class="token punctuation">(</span>i <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">f64</span><span class="token punctuation">;</span>
                    score <span class="token operator">+=</span> position_weight <span class="token operator">*</span> <span class="token number">100.0</span><span class="token punctuation">;</span>

                    <span class="token comment">// Equality is more valuable than range</span>
                    <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">is_equality_predicate</span><span class="token punctuation">(</span>predicate<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>
                        score <span class="token operator">*=</span> <span class="token number">2.0</span><span class="token punctuation">;</span>
                    <span class="token punctuation">&#125;</span>
                <span class="token punctuation">&#125;</span>
            <span class="token punctuation">&#125;</span>
        <span class="token punctuation">&#125;</span>

        <span class="token comment">// Bonus for covering indexes (all columns in index)</span>
        <span class="token keyword">if</span> <span class="token keyword">self</span><span
class="token punctuation">.</span><span class="token function">is_covering_index</span><span class="token punctuation">(</span>index<span class="token punctuation">,</span> predicates<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>
            score <span class="token operator">*=</span> <span class="token number">1.5</span><span class="token punctuation">;</span>
        <span class="token punctuation">&#125;</span>

        <span class="token comment">// Bonus for unique indexes</span>
        <span class="token keyword">if</span> index<span class="token punctuation">.</span>is_unique <span class="token punctuation">&#123;</span>
            score <span class="token operator">*=</span> <span class="token number">1.3</span><span class="token punctuation">;</span>
        <span class="token punctuation">&#125;</span>

        score
    <span class="token punctuation">&#125;</span>

    <span class="token keyword">fn</span> <span class="token function-definition function">find_usable_predicates</span><span class="token punctuation">(</span>
        <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span>
        index<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">IndexStatistics</span><span class="token punctuation">,</span>
        predicates<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Expression</span><span class="token punctuation">]</span><span class="token punctuation">,</span>
    <span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>
        predicates
          <span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span>            <span class="token punctuation">.</span><span class="token function">filter</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>p<span class="token closure-punctuation punctuation">|</span></span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">predicate_uses_index_column</span><span class="token punctuation">(</span>p<span class="token punctuation">,</span> index<span class="token punctuation">)</span><span class="token punctuation">)</span>            <span class="token punctuation">.</span><span class="token function">cloned</span><span class="token punctuation">(</span><span class="token punctuation">)</span>            <span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">(</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">predicate_uses_column</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> predicate<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Expression</span><span class="token punctuation">,</span> column<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">bool</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">match</span> predicate <span 
class="token punctuation">&#123;</span>            <span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span> left<span class="token punctuation">,</span> right<span class="token punctuation">,</span> <span class="token punctuation">..</span> <span class="token punctuation">&#125;</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expr_references_column</span><span class="token punctuation">(</span>left<span class="token punctuation">,</span> column<span class="token punctuation">)</span> <span class="token operator">||</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expr_references_column</span><span class="token punctuation">(</span>right<span class="token punctuation">,</span> column<span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">Identifier</span><span class="token punctuation">(</span>ident<span class="token punctuation">)</span> <span class="token operator">=></span> ident<span class="token punctuation">.</span>value <span class="token operator">==</span> column<span class="token punctuation">,</span>            <span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">CompoundIdentifier</span><span class="token punctuation">(</span>idents<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                idents<span class="token punctuation">.</span><span class="token 
function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">any</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>i<span class="token closure-punctuation punctuation">|</span></span> i<span class="token punctuation">.</span>value <span class="token operator">==</span> column<span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            _ <span class="token operator">=></span> <span class="token boolean">false</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">is_equality_predicate</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> predicate<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Expression</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">bool</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">match</span> predicate <span class="token punctuation">&#123;</span>            <span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span> op<span class="token punctuation">,</span> <span class="token punctuation">..</span> <span class="token punctuation">&#125;</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token macro property">matches!</span><span class="token 
punctuation">(</span>op<span class="token punctuation">,</span> <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Eq</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            _ <span class="token operator">=></span> <span class="token boolean">false</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">is_covering_index</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> index<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">IndexStatistics</span><span class="token punctuation">,</span> predicates<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Expression</span><span class="token punctuation">]</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">bool</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Check if all columns referenced in predicates are in the index</span>        <span class="token keyword">let</span> referenced_columns <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">extract_referenced_columns</span><span class="token punctuation">(</span>predicates<span class="token punctuation">)</span><span class="token punctuation">;</span>        referenced_columns<span class="token punctuation">.</span><span class="token function">iter</span><span class="token 
punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">all</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>col<span class="token closure-punctuation punctuation">|</span></span> index<span class="token punctuation">.</span>columns<span class="token punctuation">.</span><span class="token function">contains</span><span class="token punctuation">(</span>col<span class="token punctuation">)</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">IndexSelection</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> index<span class="token punctuation">:</span> <span class="token class-name">IndexStatistics</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> score<span class="token punctuation">:</span> <span class="token keyword">f64</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> usable_predicates<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="6-Complete-Optimization-Pipeline">6 Complete Optimization Pipeline</h2><h3 id="From-AST-to-Physical-Plan">From AST to Physical Plan</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/optimizer/optimizer.rs</span><span class="token 
keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">QueryOptimizer</span> <span class="token punctuation">&#123;</span>    catalog<span class="token punctuation">:</span> <span class="token class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">Catalog</span><span class="token operator">></span><span class="token punctuation">,</span>    statistics<span class="token punctuation">:</span> <span class="token class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">StatisticsCatalog</span><span class="token operator">></span><span class="token punctuation">,</span>    cost_model<span class="token punctuation">:</span> <span class="token class-name">CostModel</span><span class="token punctuation">,</span>    join_optimizer<span class="token punctuation">:</span> <span class="token class-name">JoinOrderOptimizer</span><span class="token punctuation">,</span>    index_selector<span class="token punctuation">:</span> <span class="token class-name">IndexSelector</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">QueryOptimizer</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">optimize</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> ast<span class="token punctuation">:</span> <span class="token class-name">SelectStatement</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">PhysicalPlan</span><span class="token punctuation">,</span> <span 
class="token class-name">OptimizerError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Phase 1: Create logical plan</span>        <span class="token keyword">let</span> logical_plan <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">create_logical_plan</span><span class="token punctuation">(</span>ast<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// Phase 2: Apply logical optimizations</span>        <span class="token keyword">let</span> optimized_logical <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">apply_logical_optimizations</span><span class="token punctuation">(</span>logical_plan<span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// Phase 3: Generate physical plans</span>        <span class="token keyword">let</span> physical_plans <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">generate_physical_plans</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>optimized_logical<span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// Phase 4: Choose best plan based on cost</span>        <span class="token keyword">let</span> best_plan <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">choose_best_plan</span><span class="token punctuation">(</span>physical_plans<span class="token punctuation">)</span><span class="token 
operator">?</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span>best_plan<span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">create_logical_plan</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> ast<span class="token punctuation">:</span> <span class="token class-name">SelectStatement</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">LogicalPlan</span><span class="token punctuation">,</span> <span class="token class-name">OptimizerError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Start with FROM clause</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> plan <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">plan_table_with_joins</span><span class="token punctuation">(</span>ast<span class="token punctuation">.</span>from<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// Add WHERE filter</span>        <span class="token keyword">if</span> <span class="token keyword">let</span> <span class="token class-name">Some</span><span class="token punctuation">(</span>where_clause<span class="token punctuation">)</span> <span class="token operator">=</span> ast<span class="token punctuation">.</span>where_clause <span class="token punctuation">&#123;</span>            
plan <span class="token operator">=</span> <span class="token class-name">LogicalPlan</span><span class="token punctuation">::</span><span class="token class-name">Filter</span> <span class="token punctuation">&#123;</span>                input<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>plan<span class="token punctuation">)</span><span class="token punctuation">,</span>                predicate<span class="token punctuation">:</span> where_clause<span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token comment">// Add GROUP BY / aggregates</span>        <span class="token keyword">if</span> <span class="token operator">!</span>ast<span class="token punctuation">.</span>group_by<span class="token punctuation">.</span><span class="token function">is_empty</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">||</span> ast<span class="token punctuation">.</span>having<span class="token punctuation">.</span><span class="token function">is_some</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            plan <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">plan_aggregate</span><span class="token punctuation">(</span>plan<span class="token punctuation">,</span> ast<span class="token punctuation">.</span>group_by<span class="token punctuation">,</span> ast<span class="token punctuation">.</span>having<span class="token punctuation">)</span><span class="token 
operator">?</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token comment">// Add SELECT projection</span>        plan <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">plan_projection</span><span class="token punctuation">(</span>plan<span class="token punctuation">,</span> ast<span class="token punctuation">.</span>projections<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// Add DISTINCT</span>        <span class="token keyword">if</span> ast<span class="token punctuation">.</span>distinct <span class="token punctuation">&#123;</span>            plan <span class="token operator">=</span> <span class="token class-name">LogicalPlan</span><span class="token punctuation">::</span><span class="token class-name">Distinct</span> <span class="token punctuation">&#123;</span>                input<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>plan<span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token comment">// Add ORDER BY</span>        <span class="token keyword">if</span> <span class="token operator">!</span>ast<span class="token punctuation">.</span>order_by<span class="token punctuation">.</span><span class="token function">is_empty</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            plan <span class="token operator">=</span> <span 
class="token class-name">LogicalPlan</span><span class="token punctuation">::</span><span class="token class-name">Sort</span> <span class="token punctuation">&#123;</span>                input<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>plan<span class="token punctuation">)</span><span class="token punctuation">,</span>                order_by<span class="token punctuation">:</span> ast<span class="token punctuation">.</span>order_by<span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token comment">// Add LIMIT/OFFSET</span>        <span class="token keyword">if</span> ast<span class="token punctuation">.</span>limit<span class="token punctuation">.</span><span class="token function">is_some</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">||</span> ast<span class="token punctuation">.</span>offset<span class="token punctuation">.</span><span class="token function">is_some</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            plan <span class="token operator">=</span> <span class="token class-name">LogicalPlan</span><span class="token punctuation">::</span><span class="token class-name">Limit</span> <span class="token punctuation">&#123;</span>                input<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>plan<span class="token punctuation">)</span><span class="token punctuation">,</span>                limit<span class="token 
punctuation">:</span> ast<span class="token punctuation">.</span>limit<span class="token punctuation">.</span><span class="token function">unwrap_or</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">LiteralNumber</span><span class="token punctuation">(</span><span class="token keyword">i64</span><span class="token punctuation">::</span><span class="token constant">MAX</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                offset<span class="token punctuation">:</span> ast<span class="token punctuation">.</span>offset<span class="token punctuation">.</span><span class="token function">unwrap_or</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">LiteralNumber</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span>plan<span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">apply_logical_optimizations</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> plan<span class="token punctuation">:</span> <span class="token class-name">LogicalPlan</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span 
class="token class-name">LogicalPlan</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> optimized <span class="token operator">=</span> plan<span class="token punctuation">;</span>                <span class="token comment">// Predicate pushdown</span>        optimized <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">predicate_pushdown</span><span class="token punctuation">(</span>optimized<span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// Projection pruning</span>        optimized <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">projection_pruning</span><span class="token punctuation">(</span>optimized<span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// Constant folding</span>        optimized <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">constant_folding</span><span class="token punctuation">(</span>optimized<span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// Subquery unnesting</span>        optimized <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">unnest_subqueries</span><span class="token punctuation">(</span>optimized<span class="token punctuation">)</span><span class="token punctuation">;</span>                optimized    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition 
function">generate_physical_plans</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> logical<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">LogicalPlan</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">PhysicalPlan</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> plans <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// Generate all reasonable physical alternatives</span>        <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">generate_plans_recursive</span><span class="token punctuation">(</span>logical<span class="token punctuation">,</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> plans<span class="token punctuation">)</span><span class="token punctuation">;</span>                plans    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">choose_best_plan</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> plans<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token 
class-name">PhysicalPlan</span><span class="token operator">></span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">PhysicalPlan</span><span class="token punctuation">,</span> <span class="token class-name">OptimizerError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">if</span> plans<span class="token punctuation">.</span><span class="token function">is_empty</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">return</span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">OptimizerError</span><span class="token punctuation">::</span><span class="token class-name">NoPlansGenerated</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token keyword">let</span> <span class="token keyword">mut</span> best_plan <span class="token operator">=</span> plans<span class="token punctuation">[</span><span class="token number">0</span><span class="token punctuation">]</span><span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> best_cost <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>cost_model<span class="token punctuation">.</span><span class="token function">estimate_cost</span><span class="token punctuation">(</span><span class="token 
operator">&amp;</span>best_plan<span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">for</span> plan <span class="token keyword">in</span> plans<span class="token punctuation">.</span><span class="token function">into_iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">skip</span><span class="token punctuation">(</span><span class="token number">1</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> cost <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>cost_model<span class="token punctuation">.</span><span class="token function">estimate_cost</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>plan<span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token keyword">if</span> cost <span class="token operator">&lt;</span> best_cost <span class="token punctuation">&#123;</span>                best_cost <span class="token operator">=</span> cost<span class="token punctuation">;</span>                best_plan <span class="token operator">=</span> plan<span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span>best_plan<span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="7-Challenges-Building-in-Rust">7 Challenges Building in Rust</h2><h3 id="Challenge-1-Recursive-Plan-Types">Challenge 1: Recursive Plan Types</h3><p><strong>Problem:</strong> PhysicalPlan is deeply 
recursive, which makes hand-written nested pattern matching hard to read and maintain.</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ Complex nested matching</span><span class="token keyword">match</span> plan <span class="token punctuation">&#123;</span>    <span class="token class-name">PhysicalPlan</span><span class="token punctuation">::</span><span class="token class-name">HashJoin</span> <span class="token punctuation">&#123;</span> left<span class="token punctuation">,</span> right<span class="token punctuation">,</span> <span class="token punctuation">..</span> <span class="token punctuation">&#125;</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">match</span> left<span class="token punctuation">.</span><span class="token function">as_ref</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">PhysicalPlan</span><span class="token punctuation">::</span><span class="token class-name">IndexScan</span> <span class="token punctuation">&#123;</span> <span class="token punctuation">..</span> <span class="token punctuation">&#125;</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token punctuation">...</span> <span class="token punctuation">&#125;</span>            <span class="token class-name">PhysicalPlan</span><span class="token punctuation">::</span><span class="token class-name">SeqScan</span> <span class="token punctuation">&#123;</span> <span class="token punctuation">..</span> <span class="token punctuation">&#125;</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token punctuation">...</span> <span class="token punctuation">&#125;</span>            _ <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span 
class="token punctuation">...</span> <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    _ <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token punctuation">...</span> <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Solution: Visitor pattern</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ✅ Clean traversal</span><span class="token keyword">pub</span> <span class="token keyword">trait</span> <span class="token type-definition class-name">PlanVisitor</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">visit</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">,</span> plan<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">PhysicalPlan</span><span class="token punctuation">)</span><span class="token punctuation">;</span><span class="token punctuation">&#125;</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">collect_scan_tables</span><span class="token punctuation">(</span>plan<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">PhysicalPlan</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token 
keyword">let</span> <span class="token keyword">mut</span> visitor <span class="token operator">=</span> <span class="token class-name">TableCollector</span> <span class="token punctuation">&#123;</span> tables<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>    visitor<span class="token punctuation">.</span><span class="token function">visit</span><span class="token punctuation">(</span>plan<span class="token punctuation">)</span><span class="token punctuation">;</span>    visitor<span class="token punctuation">.</span>tables<span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="Challenge-2-Cost-Type-Precision">Challenge 2: Cost Type Precision</h3><p><strong>Problem:</strong> Costs can be very large or very small.</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ f32 loses precision</span><span class="token keyword">let</span> cost<span class="token punctuation">:</span> <span class="token keyword">f32</span> <span class="token operator">=</span> <span class="token number">1000000.0</span> <span class="token operator">+</span> <span class="token number">0.0001</span><span class="token punctuation">;</span>  <span class="token comment">// Loses 0.0001!</span></code></pre><p><strong>Solution: Use f64</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ✅ Better precision</span><span class="token keyword">let</span> cost<span class="token punctuation">:</span> <span class="token keyword">f64</span> <span class="token operator">=</span> <span class="token number">1000000.0</span> <span class="token operator">+</span> <span class="token 
number">0.0001</span><span class="token punctuation">;</span>  <span class="token comment">// Preserves both</span></code></pre><hr /><h3 id="Challenge-3-Statistics-Lifetime">Challenge 3: Statistics Lifetime</h3><p><strong>Problem:</strong> Statistics need to be shared across optimization.</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ Doesn't work</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Optimizer</span> <span class="token punctuation">&#123;</span>    statistics<span class="token punctuation">:</span> <span class="token class-name">StatisticsCatalog</span><span class="token punctuation">,</span>  <span class="token comment">// Too large to clone</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Solution: Arc for shared ownership</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ✅ Works</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Optimizer</span> <span class="token punctuation">&#123;</span>    statistics<span class="token punctuation">:</span> <span class="token class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">StatisticsCatalog</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="8-How-AI-Accelerated-This">8 How AI Accelerated This</h2><h3 id="What-AI-Got-Right">What AI Got Right</h3><table><thead><tr><th>Task</th><th>AI Contribution</th></tr></thead><tbody><tr><td><strong>Cost model structure</strong></td><td>Good breakdown of CPU/IO/memory costs</td></tr><tr><td><strong>DP join ordering</strong></td><td>Correct bitmask-based subset generation</td></tr><tr><td><strong>Statistics 
design</strong></td><td>Histogram types, MCV lists</td></tr><tr><td><strong>Index scoring</strong></td><td>Position weight, equality bonus</td></tr></tbody></table><hr /><h3 id="What-AI-Got-Wrong">What AI Got Wrong</h3><table><thead><tr><th>Issue</th><th>What Happened</th></tr></thead><tbody><tr><td><strong>Selectivity estimation</strong></td><td>First draft used fixed values, not histograms</td></tr><tr><td><strong>Join cost formula</strong></td><td>Missed that inner is scanned once per outer row</td></tr><tr><td><strong>Sort cost</strong></td><td>Didn’t distinguish in-memory vs. external sort</td></tr><tr><td><strong>Covering index</strong></td><td>Initial design didn’t consider index-only scans</td></tr></tbody></table><p><strong>Pattern:</strong> AI handles structure well. Numerical formulas and edge cases need manual verification.</p><hr /><h3 id="Example-Debugging-Join-Cost">Example: Debugging Join Cost</h3><p><strong>My question to AI:</strong></p><blockquote><p>“Hash join cost seems wrong. 
Building hash table should be startup cost, not total.”</p></blockquote><p><strong>What I learned:</strong></p><ol><li><strong>Startup cost:</strong> Cost incurred before the first row is returned</li><li><strong>Total cost:</strong> Cost to return all rows</li><li>Hash build is startup (must complete before probing)</li><li>Probe cost scales with probe-side input rows; output CPU cost scales with output rows</li></ol><p><strong>Result:</strong> Fixed cost model:</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token class-name">Cost</span> <span class="token punctuation">&#123;</span>    startup<span class="token punctuation">:</span> build_cost <span class="token operator">+</span> build_memory <span class="token operator">*</span> memory_factor<span class="token punctuation">,</span>  <span class="token comment">// Before first row</span>    total<span class="token punctuation">:</span> build_cost <span class="token operator">+</span> build_memory <span class="token operator">*</span> memory_factor <span class="token operator">+</span> probe_cost <span class="token operator">+</span> output_cpu<span class="token punctuation">,</span>  <span class="token comment">// All rows (includes startup)</span>    rows<span class="token punctuation">:</span> output_rows<span class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="Summary-Query-Optimizer-in-One-Diagram">Summary: Query Optimizer in One Diagram</h2><pre class="language-MERMAID_BASE64_608" data-language="MERMAID_BASE64_608"><code 
class="language-MERMAID_BASE64_608">Zmxvd2NoYXJ0IFRECiAgICBzdWJncmFwaCAiSW5wdXQiCiAgICAgICAgQVtTUUwgQVNUXSAtLT4gQltTdGF0aXN0aWNzIENhdGFsb2ddCiAgICBlbmQKICAgIAogICAgc3ViZ3JhcGggIkxvZ2ljYWwgT3B0aW1pemF0aW9uIgogICAgICAgIENbQ3JlYXRlIExvZ2ljYWwgUGxhbl0gLS0+IERbUHJlZGljYXRlIFB1c2hkb3duXQogICAgICAgIEQgLS0+IEVbUHJvamVjdGlvbiBQcnVuaW5nXQogICAgICAgIEUgLS0+IEZbQ29uc3RhbnQgRm9sZGluZ10KICAgICAgICBGIC0tPiBHW1N1YnF1ZXJ5IFVubmVzdGluZ10KICAgIGVuZAogICAgCiAgICBzdWJncmFwaCAiUGh5c2ljYWwgT3B0aW1pemF0aW9uIgogICAgICAgIEcgLS0+IEhbR2VuZXJhdGUgUGh5c2ljYWwgUGxhbnNdCiAgICAgICAgSCAtLT4gSVtTZXFTY2FuLCBJbmRleFNjYW5dCiAgICAgICAgSCAtLT4gSltOZXN0ZWRMb29wLCBIYXNoLCBNZXJnZSBKb2luXQogICAgICAgIEggLS0+IEtbU29ydCwgSGFzaEFnZ3JlZ2F0ZV0KICAgICAgICBJICYgSiAmIEsgLS0+IExbQ29zdCBFc3RpbWF0aW9uXQogICAgICAgIEwgLS0+IE1bQ2hvb3NlIEJlc3QgUGxhbl0KICAgIGVuZAogICAgCiAgICBzdWJncmFwaCAiU3RhdGlzdGljcyIKICAgICAgICBCIC0tPiBOW1RhYmxlIFN0YXRzOiByb3dfY291bnQsIHBhZ2VzXQogICAgICAgIEIgLS0+IE9bQ29sdW1uIFN0YXRzOiBoaXN0b2dyYW0sIE1DVl0KICAgICAgICBCIC0tPiBQW0luZGV4IFN0YXRzOiBkaXN0aW5jdF9rZXlzXQogICAgZW5kCiAgICAKICAgIHN1YmdyYXBoICJPdXRwdXQiCiAgICAgICAgTSAtLT4gUVtQaHlzaWNhbCBQbGFuXQogICAgZW5kCiAgICAKICAgIHN0eWxlIEEgZmlsbDojZTNmMmZkLHN0cm9rZTojMTk3NmQyCiAgICBzdHlsZSBRIGZpbGw6I2U4ZjVlOSxzdHJva2U6IzM4OGUzYwogICAgc3R5bGUgTCBmaWxsOiNmZmYzZTAsc3Ryb2tlOiNmNTdjMDA&#x3D;</code></pre><p><strong>Key Takeaways:</strong></p><table><thead><tr><th>Concept</th><th>Why It Matters</th></tr></thead><tbody><tr><td><strong>Logical vs. Physical</strong></td><td>Separate WHAT from HOW</td></tr><tr><td><strong>Statistics</strong></td><td>Accurate cost estimation needs data</td></tr><tr><td><strong>Cost model</strong></td><td>CPU + I/O + memory = total cost</td></tr><tr><td><strong>DP join ordering</strong></td><td>Find optimal order without brute force</td></tr><tr><td><strong>Index selection</strong></td><td>Choose best index for predicates</td></tr><tr><td><strong>Startup vs. Total</strong></td><td>First row latency vs. 
throughput</td></tr></tbody></table><hr /><p><strong>Further Reading:</strong></p><ul><li>“Database Management Systems” by Ramakrishnan (Ch. 15: Query Optimization)</li><li>“Readings in Database Systems” (Red Book) - Query Optimization chapter</li><li>PostgreSQL Source: <a href="https://github.com/postgres/postgres/tree/master/src/backend/optimizer"><code>src/backend/optimizer/</code></a></li><li>“Cost-Based Oracle Fundamentals” by Jonathan Lewis</li><li>Vaultgres Repository: <a href="https://github.com/neoalienson/Vaultgres">github.com/neoalienson/Vaultgres</a></li></ul>]]></content>
    
    
    <summary type="html">Part 7 of the Vaultgres journey: building a cost-based query optimizer. Deep dive into statistics collection, cost models, join ordering with dynamic programming, and index selection.</summary>
    
    
    
    <category term="Development" scheme="https://neo01.com/categories/Development/"/>
    
    
    <category term="PostgreSQL" scheme="https://neo01.com/tags/PostgreSQL/"/>
    
    <category term="Database Internals" scheme="https://neo01.com/tags/Database-Internals/"/>
    
    <category term="Rust" scheme="https://neo01.com/tags/Rust/"/>
    
  </entry>
  
  <entry>
    <title>Building a PostgreSQL-Compatible Database in Rust: A Comprehensive SQL Parser (DDL, DML, Queries)</title>
    <link href="https://neo01.com/zh-CN/2026/03/Database-Rust-SQL-Parser/"/>
    <id>https://neo01.com/zh-CN/2026/03/Database-Rust-SQL-Parser/</id>
    <published>2026-03-05T16:00:00.000Z</published>
    <updated>2026-03-14T03:05:43.164Z</updated>
    
    <content type="html"><![CDATA[<p>In <a href="/zh-CN/2026/03/Database-Rust-Wire-Protocol-Result-Set/">Part 5</a>, we built the PostgreSQL wire protocol. Clients can now connect and send queries. But there's a problem.</p><p><strong>We receive a SQL string. Now what?</strong></p><pre class="language-none"><code class="language-none">Client: &quot;SELECT id, name FROM users WHERE balance &gt; 100 ORDER BY name LIMIT 10&quot;Server: ???</code></pre><p>We could use an existing parser (<code>sqlparser-rs</code>, <code>peg</code>, etc.). But building our own teaches us how SQL actually works.</p><p>Today: building a comprehensive SQL parser in Rust, from lexer to AST, covering DDL, DML, and queries.</p><hr /><h2 id="1-为什么构建-SQL-解析器？">1 Why Build a SQL Parser?</h2><h3 id="替代方案">Alternatives</h3><table><thead><tr><th>Approach</th><th>Pros</th><th>Cons</th></tr></thead><tbody><tr><td><strong>sqlparser-rs</strong></td><td>Production-ready, PostgreSQL dialect</td><td>Black box, hard to customize</td></tr><tr><td><strong>peg/lalrpop</strong></td><td>Generator handles the grammar</td><td>Learning curve, complex debugging</td></tr><tr><td><strong>Hand-written</strong></td><td>Full control, educational</td><td>Time-consuming, error-prone</td></tr></tbody></table><p><strong>Vaultgres choice:</strong> A hand-written recursive descent parser.</p><p><strong>Why?</strong></p><table><thead><tr><th>Reason</th><th>Explanation</th></tr></thead><tbody><tr><td><strong>Learning</strong></td><td>Understand SQL grammar in depth</td></tr><tr><td><strong>Control</strong></td><td>Easily add custom extensions</td></tr><tr><td><strong>Error messages</strong></td><td>Better than generator defaults</td></tr><tr><td><strong>Integration</strong></td><td>Direct AST → execution plan</td></tr></tbody></table><hr /><h3 id="解析器架构">Parser Architecture</h3><pre class="language-none"><code class="language-none">┌─────────────────────────────────────────────────────────────┐│                    SQL Parser Pipeline                       │├─────────────────────────────────────────────────────────────┤│                                                              ││  SQL String                                                  ││     │                                                        ││     ▼                                                        ││  ┌─────────────┐                                            ││  │   Lexer     │  Tokenize: &quot;SELECT&quot; → Token::Select        ││  │ (Tokenizer) │  
&quot;123&quot; → Token::Integer(123)               ││  └──────┬──────┘                                            ││         │                                                    ││         ▼                                                    ││  ┌─────────────┐                                            ││  │   Parser    │  Recursive descent:                        ││  │             │  parse_statement() → parse_select() → ...  ││  └──────┬──────┘                                            ││         │                                                    ││         ▼                                                    ││  ┌─────────────┐                                            ││  │     AST     │  Structured representation:                ││  │             │  SelectStatement &#123; projections, from, ...&#125; ││  └─────────────┘                                            ││                                                              │└─────────────────────────────────────────────────────────────┘</code></pre><hr /><h2 id="2-词法分析器：SQL-词法分析">2 Lexer: SQL Lexical Analysis</h2><h3 id="令牌类型">Token Types</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/sql_parser/lexer.rs</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">Token</span> <span class="token punctuation">&#123;</span>    <span class="token comment">// Keywords</span>    <span class="token class-name">Select</span><span class="token punctuation">,</span>    <span class="token class-name">From</span><span class="token punctuation">,</span>    <span class="token class-name">Where</span><span class="token punctuation">,</span>    <span class="token class-name">Insert</span><span class="token punctuation">,</span>    <span class="token class-name">Into</span><span class="token punctuation">,</span>    <span 
class="token class-name">Values</span><span class="token punctuation">,</span>    <span class="token class-name">Update</span><span class="token punctuation">,</span>    <span class="token class-name">Set</span><span class="token punctuation">,</span>    <span class="token class-name">Delete</span><span class="token punctuation">,</span>    <span class="token class-name">Create</span><span class="token punctuation">,</span>    <span class="token class-name">Table</span><span class="token punctuation">,</span>    <span class="token class-name">Index</span><span class="token punctuation">,</span>    <span class="token class-name">On</span><span class="token punctuation">,</span>    <span class="token class-name">As</span><span class="token punctuation">,</span>    <span class="token class-name">And</span><span class="token punctuation">,</span>    <span class="token class-name">Or</span><span class="token punctuation">,</span>    <span class="token class-name">Not</span><span class="token punctuation">,</span>    <span class="token class-name">Null</span><span class="token punctuation">,</span>    <span class="token class-name">True</span><span class="token punctuation">,</span>    <span class="token class-name">False</span><span class="token punctuation">,</span>    <span class="token class-name">Primary</span><span class="token punctuation">,</span>    <span class="token class-name">Key</span><span class="token punctuation">,</span>    <span class="token class-name">References</span><span class="token punctuation">,</span>    <span class="token class-name">Default</span><span class="token punctuation">,</span>    <span class="token class-name">Unique</span><span class="token punctuation">,</span>    <span class="token class-name">Check</span><span class="token punctuation">,</span>    <span class="token class-name">Constraint</span><span class="token punctuation">,</span>    <span class="token class-name">Join</span><span class="token punctuation">,</span>    <span 
class="token class-name">Inner</span><span class="token punctuation">,</span>    <span class="token class-name">Left</span><span class="token punctuation">,</span>    <span class="token class-name">Right</span><span class="token punctuation">,</span>    <span class="token class-name">Outer</span><span class="token punctuation">,</span>    <span class="token class-name">Order</span><span class="token punctuation">,</span>    <span class="token class-name">By</span><span class="token punctuation">,</span>    <span class="token class-name">Asc</span><span class="token punctuation">,</span>    <span class="token class-name">Desc</span><span class="token punctuation">,</span>    <span class="token class-name">Group</span><span class="token punctuation">,</span>    <span class="token class-name">Having</span><span class="token punctuation">,</span>    <span class="token class-name">Limit</span><span class="token punctuation">,</span>    <span class="token class-name">Offset</span><span class="token punctuation">,</span>    <span class="token class-name">Distinct</span><span class="token punctuation">,</span>        <span class="token comment">// Literals</span>    <span class="token class-name">Integer</span><span class="token punctuation">(</span><span class="token keyword">i64</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">Float</span><span class="token punctuation">(</span><span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">String</span><span class="token punctuation">(</span><span class="token class-name">String</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">Identifier</span><span class="token punctuation">(</span><span class="token class-name">String</span><span class="token punctuation">)</span><span class="token 
punctuation">,</span>        <span class="token comment">// Operators</span>    <span class="token class-name">Plus</span><span class="token punctuation">,</span>    <span class="token class-name">Minus</span><span class="token punctuation">,</span>    <span class="token class-name">Star</span><span class="token punctuation">,</span>    <span class="token class-name">Slash</span><span class="token punctuation">,</span>    <span class="token class-name">Eq</span><span class="token punctuation">,</span>    <span class="token class-name">Neq</span><span class="token punctuation">,</span>    <span class="token class-name">Lt</span><span class="token punctuation">,</span>    <span class="token class-name">Lte</span><span class="token punctuation">,</span>    <span class="token class-name">Gt</span><span class="token punctuation">,</span>    <span class="token class-name">Gte</span><span class="token punctuation">,</span>    <span class="token class-name">Arrow</span><span class="token punctuation">,</span>      <span class="token comment">// -></span>    <span class="token class-name">DoubleArrow</span><span class="token punctuation">,</span> <span class="token comment">// ->></span>        <span class="token comment">// Punctuation</span>    <span class="token class-name">Comma</span><span class="token punctuation">,</span>    <span class="token class-name">Semicolon</span><span class="token punctuation">,</span>    <span class="token class-name">LParen</span><span class="token punctuation">,</span>    <span class="token class-name">RParen</span><span class="token punctuation">,</span>    <span class="token class-name">Dot</span><span class="token punctuation">,</span>        <span class="token comment">// Special</span>    <span class="token class-name">Eof</span><span class="token punctuation">,</span>    <span class="token class-name">Unknown</span><span class="token punctuation">(</span><span class="token keyword">char</span><span class="token 
punctuation">)</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="词法分析器实现">词法分析器实现</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/sql_parser/lexer.rs</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Lexer</span> <span class="token punctuation">&#123;</span>    input<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">char</span><span class="token operator">></span><span class="token punctuation">,</span>    pos<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">Lexer</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">new</span><span class="token punctuation">(</span>input<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>            input<span class="token punctuation">:</span> input<span class="token punctuation">.</span><span class="token function">chars</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token 
punctuation">,</span>            pos<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">tokenize</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Token</span><span class="token operator">></span><span class="token punctuation">,</span> <span class="token class-name">LexerError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> tokens <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">while</span> <span class="token keyword">let</span> <span class="token class-name">Some</span><span class="token punctuation">(</span>token<span class="token punctuation">)</span> <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">next_token</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span> <span class="token punctuation">&#123;</span>            
<span class="token keyword">if</span> token <span class="token operator">==</span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Eof</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">break</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>            tokens<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span>token<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                tokens<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Eof</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span>tokens<span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">fn</span> <span class="token function-definition function">next_token</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Token</span><span class="token operator">></span><span class="token punctuation">,</span> <span class="token class-name">LexerError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        
<span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">skip_whitespace</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span>pos <span class="token operator">>=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>input<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">return</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Eof</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token keyword">let</span> ch <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">current_char</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">match</span> ch <span class="token punctuation">&#123;</span>            <span class="token comment">// Single-character tokens</span>            <span class="token char">'+'</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token 
function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Plus</span><span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span>            <span class="token char">'-'</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Minus</span><span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span>            <span class="token char">'*'</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token 
class-name">Star</span><span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span>            <span class="token char">'/'</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Slash</span><span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span>            <span class="token char">','</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Comma</span><span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span>            <span class="token char">';'</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span 
class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Semicolon</span><span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span>            <span class="token char">'('</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">LParen</span><span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span>            <span class="token char">')'</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">RParen</span><span 
class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span>            <span class="token char">'.'</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Dot</span><span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span>                        <span class="token comment">// Multi-character operators</span>            <span class="token char">'='</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Eq</span><span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span>            <span class="token char">'&lt;'</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token 
punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">match</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">current_char</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                    <span class="token char">'='</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Lte</span><span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span>                    <span class="token char">'>'</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Neq</span><span class="token 
punctuation">)</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span>                    _ <span class="token operator">=></span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Lt</span><span class="token punctuation">)</span><span class="token punctuation">)</span>                <span class="token punctuation">&#125;</span>            <span class="token punctuation">&#125;</span>            <span class="token char">'>'</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">match</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">current_char</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                    <span class="token char">'='</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span 
class="token class-name">Gte</span><span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span>                    _ <span class="token operator">=></span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Gt</span><span class="token punctuation">)</span><span class="token punctuation">)</span>                <span class="token punctuation">&#125;</span>            <span class="token punctuation">&#125;</span>            <span class="token char">'!'</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">current_char</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">==</span> <span class="token char">'='</span> <span class="token punctuation">&#123;</span>                    <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                    <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token 
punctuation">::</span><span class="token class-name">Neq</span><span class="token punctuation">)</span><span class="token punctuation">)</span>                <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>                    <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">LexerError</span><span class="token punctuation">::</span><span class="token class-name">UnexpectedChar</span><span class="token punctuation">(</span><span class="token char">'!'</span><span class="token punctuation">)</span><span class="token punctuation">)</span>                <span class="token punctuation">&#125;</span>            <span class="token punctuation">&#125;</span>                        <span class="token comment">// String literals</span>            <span class="token char">'\''</span> <span class="token operator">=></span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">read_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                        <span class="token comment">// Identifiers and keywords</span>            ch <span class="token keyword">if</span> ch<span class="token punctuation">.</span><span class="token function">is_alphabetic</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">||</span> ch <span class="token operator">==</span> <span class="token char">'_'</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">read_identifier</span><span class="token punctuation">(</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>                      
  <span class="token comment">// Numbers</span>            ch <span class="token keyword">if</span> ch<span class="token punctuation">.</span><span class="token function">is_numeric</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">read_number</span><span class="token punctuation">(</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>                        <span class="token comment">// Unknown</span>            ch <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Unknown</span><span class="token punctuation">(</span>ch<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">fn</span> <span class="token function-definition function">read_string</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> 
<span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Token</span><span class="token operator">></span><span class="token punctuation">,</span> <span class="token class-name">LexerError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token comment">// Skip opening quote</span>                <span class="token keyword">let</span> <span class="token keyword">mut</span> value <span class="token operator">=</span> <span class="token class-name">String</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">while</span> <span class="token keyword">self</span><span class="token punctuation">.</span>pos <span class="token operator">&lt;</span> <span class="token keyword">self</span><span class="token punctuation">.</span>input<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">&amp;&amp;</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">current_char</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">!=</span> <span class="token char">'\''</span> <span class="token punctuation">&#123;</span>            value<span class="token punctuation">.</span><span class="token 
function">push</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">current_char</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span>pos <span class="token operator">>=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>input<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">return</span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">LexerError</span><span class="token punctuation">::</span><span class="token class-name">UnterminatedString</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token comment">// Skip closing quote</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token 
punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">String</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">fn</span> <span class="token function-definition function">read_identifier</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Token</span><span class="token operator">></span><span class="token punctuation">,</span> <span class="token class-name">LexerError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> start <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>pos<span class="token punctuation">;</span>                <span class="token keyword">while</span> <span class="token keyword">self</span><span class="token punctuation">.</span>pos <span class="token operator">&lt;</span> <span class="token keyword">self</span><span class="token punctuation">.</span>input<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> ch <span class="token operator">=</span> <span class="token keyword">self</span><span class="token 
punctuation">.</span><span class="token function">current_char</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token keyword">if</span> ch<span class="token punctuation">.</span><span class="token function">is_alphanumeric</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">||</span> ch <span class="token operator">==</span> <span class="token char">'_'</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">break</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>                <span class="token keyword">let</span> value<span class="token punctuation">:</span> <span class="token class-name">String</span> <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>input<span class="token punctuation">[</span>start<span class="token punctuation">..</span><span class="token keyword">self</span><span class="token punctuation">.</span>pos<span class="token punctuation">]</span><span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token 
punctuation">;</span>                <span class="token comment">// Check if it's a keyword</span>        <span class="token keyword">let</span> token <span class="token operator">=</span> <span class="token keyword">match</span> value<span class="token punctuation">.</span><span class="token function">to_uppercase</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">as_str</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token string">"SELECT"</span> <span class="token operator">=></span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Select</span><span class="token punctuation">,</span>            <span class="token string">"FROM"</span> <span class="token operator">=></span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">From</span><span class="token punctuation">,</span>            <span class="token string">"WHERE"</span> <span class="token operator">=></span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Where</span><span class="token punctuation">,</span>            <span class="token string">"INSERT"</span> <span class="token operator">=></span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Insert</span><span class="token punctuation">,</span>            <span class="token string">"UPDATE"</span> <span class="token operator">=></span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Update</span><span class="token punctuation">,</span>            <span class="token string">"DELETE"</span> <span class="token 
operator">=></span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Delete</span><span class="token punctuation">,</span>            <span class="token string">"CREATE"</span> <span class="token operator">=></span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Create</span><span class="token punctuation">,</span>            <span class="token string">"TABLE"</span> <span class="token operator">=></span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Table</span><span class="token punctuation">,</span>            <span class="token string">"INDEX"</span> <span class="token operator">=></span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Index</span><span class="token punctuation">,</span>            <span class="token string">"PRIMARY"</span> <span class="token operator">=></span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Primary</span><span class="token punctuation">,</span>            <span class="token string">"KEY"</span> <span class="token operator">=></span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Key</span><span class="token punctuation">,</span>            <span class="token string">"NULL"</span> <span class="token operator">=></span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Null</span><span class="token punctuation">,</span>            <span class="token string">"TRUE"</span> <span class="token operator">=></span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">True</span><span class="token 
punctuation">,</span>            <span class="token string">"FALSE"</span> <span class="token operator">=></span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">False</span><span class="token punctuation">,</span>            _ <span class="token operator">=></span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Identifier</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span>token<span class="token punctuation">)</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">fn</span> <span class="token function-definition function">read_number</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Token</span><span class="token operator">></span><span class="token punctuation">,</span> <span class="token class-name">LexerError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> start <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>pos<span class="token punctuation">;</span>        <span 
class="token keyword">let</span> <span class="token keyword">mut</span> is_float <span class="token operator">=</span> <span class="token boolean">false</span><span class="token punctuation">;</span>                <span class="token keyword">while</span> <span class="token keyword">self</span><span class="token punctuation">.</span>pos <span class="token operator">&lt;</span> <span class="token keyword">self</span><span class="token punctuation">.</span>input<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> ch <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">current_char</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token keyword">if</span> ch<span class="token punctuation">.</span><span class="token function">is_numeric</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token keyword">if</span> ch <span class="token operator">==</span> <span class="token char">'.'</span> <span class="token operator">&amp;&amp;</span> <span class="token operator">!</span>is_float <span class="token punctuation">&#123;</span>                is_float <span class="token operator">=</span> <span class="token boolean">true</span><span 
class="token punctuation">;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">break</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>                <span class="token keyword">let</span> value<span class="token punctuation">:</span> <span class="token class-name">String</span> <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>input<span class="token punctuation">[</span>start<span class="token punctuation">..</span><span class="token keyword">self</span><span class="token punctuation">.</span>pos<span class="token punctuation">]</span><span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">if</span> is_float <span class="token punctuation">&#123;</span>            <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Float</span><span class="token punctuation">(</span>value<span class="token punctuation">.</span><span class="token 
function">parse</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">)</span>
        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>
            <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Integer</span><span class="token punctuation">(</span>value<span class="token punctuation">.</span><span class="token function">parse</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">)</span>
        <span class="token punctuation">&#125;</span>
    <span class="token punctuation">&#125;</span>

    <span class="token keyword">fn</span> <span class="token function-definition function">skip_whitespace</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>
        <span class="token keyword">while</span> <span class="token keyword">self</span><span class="token punctuation">.</span>pos <span class="token operator">&lt;</span> <span class="token keyword">self</span><span class="token punctuation">.</span>input<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">&amp;&amp;</span> <span 
class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">current_char</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">is_whitespace</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>
            <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token punctuation">&#125;</span>
    <span class="token punctuation">&#125;</span>

    <span class="token keyword">fn</span> <span class="token function-definition function">current_char</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">char</span> <span class="token punctuation">&#123;</span>
        <span class="token keyword">self</span><span class="token punctuation">.</span>input<span class="token punctuation">[</span><span class="token keyword">self</span><span class="token punctuation">.</span>pos<span class="token punctuation">]</span>
    <span class="token punctuation">&#125;</span>

    <span class="token keyword">fn</span> <span class="token function-definition function">advance</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>
        <span class="token keyword">self</span><span class="token punctuation">.</span>pos <span class="token operator">+=</span> <span class="token 
number">1</span><span class="token punctuation">;</span>
    <span class="token punctuation">&#125;</span>
<span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="词法分析器范例">Lexer Example</h3><pre class="language-none"><code class="language-none">Input: &quot;SELECT id, name FROM users WHERE balance &gt; 100&quot;

Tokens:
[
    Select,
    Identifier(&quot;id&quot;),
    Comma,
    Identifier(&quot;name&quot;),
    From,
    Identifier(&quot;users&quot;),
    Where,
    Identifier(&quot;balance&quot;),
    Gt,
    Integer(100),
    Eof
]</code></pre><hr /><h2 id="3-AST：抽象语法树">3 AST: Abstract Syntax Tree</h2><h3 id="语句类型">Statement Types</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/sql_parser/ast.rs</span>
<span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span>
<span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">Statement</span> <span class="token punctuation">&#123;</span>
    <span class="token class-name">Select</span><span class="token punctuation">(</span><span class="token class-name">SelectStatement</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    <span class="token class-name">Insert</span><span class="token punctuation">(</span><span class="token class-name">InsertStatement</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    <span class="token class-name">Update</span><span class="token punctuation">(</span><span class="token class-name">UpdateStatement</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    <span class="token class-name">Delete</span><span class="token punctuation">(</span><span class="token class-name">DeleteStatement</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    <span class="token class-name">CreateTable</span><span class="token 
punctuation">(</span><span class="token class-name">CreateTableStatement</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    <span class="token class-name">CreateIndex</span><span class="token punctuation">(</span><span class="token class-name">CreateIndexStatement</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    <span class="token class-name">DropTable</span><span class="token punctuation">(</span><span class="token class-name">DropTableStatement</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    <span class="token class-name">DropIndex</span><span class="token punctuation">(</span><span class="token class-name">DropIndexStatement</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
<span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="SELECT-语句">SELECT Statement</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span>
<span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">SelectStatement</span> <span class="token punctuation">&#123;</span>
    <span class="token keyword">pub</span> distinct<span class="token punctuation">:</span> <span class="token keyword">bool</span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> projections<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">SelectItem</span><span class="token operator">></span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> from<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">TableWithJoins</span><span 
class="token operator">></span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> where_clause<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> group_by<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> having<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> order_by<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">OrderByExpr</span><span class="token operator">></span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> limit<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> offset<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>
<span class="token punctuation">&#125;</span>

<span class="token attribute attr-name">#[derive(Debug, Clone, 
PartialEq)]</span>
<span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">SelectItem</span> <span class="token punctuation">&#123;</span>
    <span class="token class-name">UnnamedExpr</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    <span class="token class-name">ExprWithAlias</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">,</span> <span class="token class-name">Ident</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    <span class="token class-name">Wildcard</span><span class="token punctuation">,</span>  <span class="token comment">// SELECT *</span>
<span class="token punctuation">&#125;</span>

<span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span>
<span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Ident</span> <span class="token punctuation">&#123;</span>
    <span class="token keyword">pub</span> value<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> quote_style<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token keyword">char</span><span class="token operator">></span><span class="token punctuation">,</span>  <span class="token comment">// "quoted" vs unquoted</span>
<span class="token punctuation">&#125;</span>

<span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span>
<span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">TableWithJoins</span> <span 
class="token punctuation">&#123;</span>
    <span class="token keyword">pub</span> table<span class="token punctuation">:</span> <span class="token class-name">TableFactor</span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> joins<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Join</span><span class="token operator">></span><span class="token punctuation">,</span>
<span class="token punctuation">&#125;</span>

<span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span>
<span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">TableFactor</span> <span class="token punctuation">&#123;</span>
    <span class="token class-name">Table</span> <span class="token punctuation">&#123;</span> name<span class="token punctuation">:</span> <span class="token class-name">ObjectName</span><span class="token punctuation">,</span> alias<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">TableAlias</span><span class="token operator">></span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>
    <span class="token class-name">Subquery</span> <span class="token punctuation">&#123;</span> query<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">SelectStatement</span><span class="token operator">></span><span class="token punctuation">,</span> alias<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">TableAlias</span><span class="token operator">></span> <span class="token punctuation">&#125;</span><span class="token 
punctuation">,</span>
<span class="token punctuation">&#125;</span>

<span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span>
<span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">TableAlias</span> <span class="token punctuation">&#123;</span>
    <span class="token keyword">pub</span> name<span class="token punctuation">:</span> <span class="token class-name">Ident</span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> columns<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Ident</span><span class="token operator">></span><span class="token punctuation">,</span>
<span class="token punctuation">&#125;</span>

<span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span>
<span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Join</span> <span class="token punctuation">&#123;</span>
    <span class="token keyword">pub</span> relation<span class="token punctuation">:</span> <span class="token class-name">TableFactor</span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> join_operator<span class="token punctuation">:</span> <span class="token class-name">JoinOperator</span><span class="token punctuation">,</span>
<span class="token punctuation">&#125;</span>

<span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span>
<span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">JoinOperator</span> <span class="token punctuation">&#123;</span>
    <span class="token class-name">Inner</span><span class="token punctuation">,</span>
    <span class="token class-name">LeftOuter</span><span class="token punctuation">,</span>
    <span class="token 
class-name">RightOuter</span><span class="token punctuation">,</span>
    <span class="token class-name">FullOuter</span><span class="token punctuation">,</span>
    <span class="token class-name">Cross</span><span class="token punctuation">,</span>
<span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="DML-语句">DML Statements</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// INSERT</span>
<span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span>
<span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">InsertStatement</span> <span class="token punctuation">&#123;</span>
    <span class="token keyword">pub</span> table<span class="token punctuation">:</span> <span class="token class-name">ObjectName</span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> columns<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Ident</span><span class="token operator">></span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> values<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">>></span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> returning<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">SelectItem</span><span class="token operator">></span><span class="token punctuation">,</span>
<span class="token punctuation">&#125;</span>

<span class="token comment">// UPDATE</span>
<span class="token attribute 
attr-name">#[derive(Debug, Clone, PartialEq)]</span>
<span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">UpdateStatement</span> <span class="token punctuation">&#123;</span>
    <span class="token keyword">pub</span> table<span class="token punctuation">:</span> <span class="token class-name">ObjectName</span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> assignments<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Assignment</span><span class="token operator">></span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> where_clause<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> returning<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">SelectItem</span><span class="token operator">></span><span class="token punctuation">,</span>
<span class="token punctuation">&#125;</span>

<span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span>
<span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Assignment</span> <span class="token punctuation">&#123;</span>
    <span class="token keyword">pub</span> column<span class="token punctuation">:</span> <span class="token class-name">Ident</span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> value<span class="token punctuation">:</span> <span class="token class-name">Expression</span><span class="token 
punctuation">,</span>
<span class="token punctuation">&#125;</span>

<span class="token comment">// DELETE</span>
<span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span>
<span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">DeleteStatement</span> <span class="token punctuation">&#123;</span>
    <span class="token keyword">pub</span> table<span class="token punctuation">:</span> <span class="token class-name">ObjectName</span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> where_clause<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> returning<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">SelectItem</span><span class="token operator">></span><span class="token punctuation">,</span>
<span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="DDL-语句">DDL Statements</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// CREATE TABLE</span>
<span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span>
<span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">CreateTableStatement</span> <span class="token punctuation">&#123;</span>
    <span class="token keyword">pub</span> name<span class="token punctuation">:</span> <span class="token class-name">ObjectName</span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> columns<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token 
operator">&lt;</span><span class="token class-name">ColumnDef</span><span class="token operator">></span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> constraints<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">TableConstraint</span><span class="token operator">></span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> if_not_exists<span class="token punctuation">:</span> <span class="token keyword">bool</span><span class="token punctuation">,</span>
<span class="token punctuation">&#125;</span>

<span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span>
<span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">ColumnDef</span> <span class="token punctuation">&#123;</span>
    <span class="token keyword">pub</span> name<span class="token punctuation">:</span> <span class="token class-name">Ident</span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> data_type<span class="token punctuation">:</span> <span class="token class-name">DataType</span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> options<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">ColumnOption</span><span class="token operator">></span><span class="token punctuation">,</span>
<span class="token punctuation">&#125;</span>

<span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span>
<span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">DataType</span> <span class="token punctuation">&#123;</span>
    <span class="token class-name">Boolean</span><span class="token punctuation">,</span>
    <span class="token class-name">SmallInt</span><span class="token punctuation">,</span>
    <span class="token class-name">Integer</span><span class="token punctuation">,</span>
    <span class="token class-name">BigInt</span><span class="token punctuation">,</span>
    <span class="token class-name">Real</span><span class="token punctuation">,</span>
    <span class="token class-name">Double</span><span class="token punctuation">,</span>
    <span class="token class-name">Text</span><span class="token punctuation">,</span>
    <span class="token class-name">Varchar</span><span class="token punctuation">(</span><span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token keyword">u32</span><span class="token operator">></span><span class="token punctuation">)</span><span class="token punctuation">,</span>  <span class="token comment">// None = VARCHAR, Some(n) = VARCHAR(n)</span>
    <span class="token class-name">Timestamp</span><span class="token punctuation">,</span>
    <span class="token class-name">Date</span><span class="token punctuation">,</span>
    <span class="token class-name">Bytea</span><span class="token punctuation">,</span>
<span class="token punctuation">&#125;</span>

<span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span>
<span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">ColumnOption</span> <span class="token punctuation">&#123;</span>
    <span class="token class-name">NotNull</span><span class="token punctuation">,</span>
    <span class="token class-name">Null</span><span class="token punctuation">,</span>
    <span class="token class-name">Default</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    <span class="token class-name">Unique</span><span class="token punctuation">,</span>
    <span 
class="token class-name">PrimaryKey</span><span class="token punctuation">,</span>
    <span class="token class-name">References</span> <span class="token punctuation">&#123;</span> table<span class="token punctuation">:</span> <span class="token class-name">ObjectName</span><span class="token punctuation">,</span> column<span class="token punctuation">:</span> <span class="token class-name">Ident</span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>
<span class="token punctuation">&#125;</span>

<span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span>
<span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">TableConstraint</span> <span class="token punctuation">&#123;</span>
    <span class="token class-name">PrimaryKey</span> <span class="token punctuation">&#123;</span> columns<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Ident</span><span class="token operator">></span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>
    <span class="token class-name">Unique</span> <span class="token punctuation">&#123;</span> columns<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Ident</span><span class="token operator">></span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>
    <span class="token class-name">Check</span> <span class="token punctuation">&#123;</span> expression<span class="token punctuation">:</span> <span class="token class-name">Expression</span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>
<span class="token punctuation">&#125;</span>

<span class="token comment">// CREATE INDEX</span>
<span class="token attribute 
attr-name">#[derive(Debug, Clone, PartialEq)]</span>
<span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">CreateIndexStatement</span> <span class="token punctuation">&#123;</span>
    <span class="token keyword">pub</span> name<span class="token punctuation">:</span> <span class="token class-name">Ident</span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> table<span class="token punctuation">:</span> <span class="token class-name">ObjectName</span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> columns<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">OrderByExpr</span><span class="token operator">></span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> unique<span class="token punctuation">:</span> <span class="token keyword">bool</span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> if_not_exists<span class="token punctuation">:</span> <span class="token keyword">bool</span><span class="token punctuation">,</span>
<span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="表达式：SQL-的核心">Expressions: The Core of SQL</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/sql_parser/ast.rs</span>
<span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span>
<span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">Expression</span> <span class="token punctuation">&#123;</span>
    <span class="token comment">// Identifiers and literals</span>
    <span class="token class-name">Identifier</span><span class="token punctuation">(</span><span class="token class-name">Ident</span><span class="token punctuation">)</span><span class="token 
punctuation">,</span>
    <span class="token class-name">CompoundIdentifier</span><span class="token punctuation">(</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Ident</span><span class="token operator">></span><span class="token punctuation">)</span><span class="token punctuation">,</span>  <span class="token comment">// table.column</span>
    <span class="token class-name">LiteralNumber</span><span class="token punctuation">(</span><span class="token keyword">i64</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    <span class="token class-name">LiteralFloat</span><span class="token punctuation">(</span><span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    <span class="token class-name">LiteralString</span><span class="token punctuation">(</span><span class="token class-name">String</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    <span class="token class-name">LiteralBoolean</span><span class="token punctuation">(</span><span class="token keyword">bool</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    <span class="token class-name">Null</span><span class="token punctuation">,</span>

    <span class="token comment">// Operators</span>
    <span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span>
        left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>
        op<span class="token punctuation">:</span> <span class="token class-name">BinaryOperator</span><span class="token punctuation">,</span>
        right<span class="token punctuation">:</span> <span class="token 
class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>
    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>
    <span class="token class-name">UnaryOp</span> <span class="token punctuation">&#123;</span>
        op<span class="token punctuation">:</span> <span class="token class-name">UnaryOperator</span><span class="token punctuation">,</span>
        expr<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>
    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>

    <span class="token comment">// Function calls</span>
    <span class="token class-name">Function</span> <span class="token punctuation">&#123;</span>
        name<span class="token punctuation">:</span> <span class="token class-name">Ident</span><span class="token punctuation">,</span>
        args<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">FunctionArg</span><span class="token operator">></span><span class="token punctuation">,</span>
        distinct<span class="token punctuation">:</span> <span class="token keyword">bool</span><span class="token punctuation">,</span>
    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>

    <span class="token comment">// Subqueries</span>
    <span class="token class-name">Subquery</span><span class="token punctuation">(</span><span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">SelectStatement</span><span class="token operator">></span><span class="token 
punctuation">)</span><span class="token punctuation">,</span>        <span class="token comment">// CASE expressions</span>    <span class="token class-name">Case</span> <span class="token punctuation">&#123;</span>        operand<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">>></span><span class="token punctuation">,</span>        conditions<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">WhenClause</span><span class="token operator">></span><span class="token punctuation">,</span>        else_result<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">>></span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">// IN, BETWEEN, LIKE</span>    <span class="token class-name">InList</span> <span class="token punctuation">&#123;</span>        expr<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>        list<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>        negated<span class="token punctuation">:</span> 
<span class="token keyword">bool</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>    <span class="token class-name">InSubquery</span> <span class="token punctuation">&#123;</span>        expr<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>        subquery<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">SelectStatement</span><span class="token operator">></span><span class="token punctuation">,</span>        negated<span class="token punctuation">:</span> <span class="token keyword">bool</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>    <span class="token class-name">Between</span> <span class="token punctuation">&#123;</span>        expr<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>        low<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>        high<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>        negated<span class="token punctuation">:</span> <span class="token keyword">bool</span><span 
class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>    <span class="token class-name">Like</span> <span class="token punctuation">&#123;</span>        expr<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>        pattern<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>        negated<span class="token punctuation">:</span> <span class="token keyword">bool</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">// CAST</span>    <span class="token class-name">Cast</span> <span class="token punctuation">&#123;</span>        expr<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>        data_type<span class="token punctuation">:</span> <span class="token class-name">DataType</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">// Parenthesized expressions</span>    <span class="token class-name">Nested</span><span class="token punctuation">(</span><span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">)</span><span class="token 
punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">BinaryOperator</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">Plus</span><span class="token punctuation">,</span>    <span class="token class-name">Minus</span><span class="token punctuation">,</span>    <span class="token class-name">Multiply</span><span class="token punctuation">,</span>    <span class="token class-name">Divide</span><span class="token punctuation">,</span>    <span class="token class-name">Eq</span><span class="token punctuation">,</span>    <span class="token class-name">Neq</span><span class="token punctuation">,</span>    <span class="token class-name">Lt</span><span class="token punctuation">,</span>    <span class="token class-name">Lte</span><span class="token punctuation">,</span>    <span class="token class-name">Gt</span><span class="token punctuation">,</span>    <span class="token class-name">Gte</span><span class="token punctuation">,</span>    <span class="token class-name">And</span><span class="token punctuation">,</span>    <span class="token class-name">Or</span><span class="token punctuation">,</span>    <span class="token class-name">Like</span><span class="token punctuation">,</span>    <span class="token class-name">NotLike</span><span class="token punctuation">,</span>    <span class="token class-name">Concat</span><span class="token punctuation">,</span>  <span class="token comment">// ||</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">UnaryOperator</span> <span class="token punctuation">&#123;</span>    <span 
class="token class-name">Plus</span><span class="token punctuation">,</span>    <span class="token class-name">Minus</span><span class="token punctuation">,</span>    <span class="token class-name">Not</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">WhenClause</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> condition<span class="token punctuation">:</span> <span class="token class-name">Expression</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> result<span class="token punctuation">:</span> <span class="token class-name">Expression</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">FunctionArg</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">Named</span> <span class="token punctuation">&#123;</span> name<span class="token punctuation">:</span> <span class="token class-name">Ident</span><span class="token punctuation">,</span> arg<span class="token punctuation">:</span> <span class="token class-name">Expression</span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>    <span class="token class-name">Unnamed</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">)</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="4-解析器：递归下降">4 Parser: Recursive Descent</h2><h3 id="解析器结构">Parser Structure</h3><pre 
class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/sql_parser/parser.rs</span><span class="token keyword">use</span> <span class="token keyword">crate</span><span class="token module-declaration namespace"><span class="token punctuation">::</span>sql_parser<span class="token punctuation">::</span>lexer<span class="token punctuation">::</span></span><span class="token punctuation">&#123;</span><span class="token class-name">Lexer</span><span class="token punctuation">,</span> <span class="token class-name">Token</span><span class="token punctuation">&#125;</span><span class="token punctuation">;</span><span class="token keyword">use</span> <span class="token keyword">crate</span><span class="token module-declaration namespace"><span class="token punctuation">::</span>sql_parser<span class="token punctuation">::</span>ast<span class="token punctuation">::</span></span><span class="token operator">*</span><span class="token punctuation">;</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Parser</span> <span class="token punctuation">&#123;</span>    tokens<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Token</span><span class="token operator">></span><span class="token punctuation">,</span>    pos<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">Parser</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">new</span><span class="token punctuation">(</span>tokens<span class="token punctuation">:</span> <span class="token 
class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Token</span><span class="token operator">></span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span> tokens<span class="token punctuation">,</span> pos<span class="token punctuation">:</span> <span class="token number">0</span> <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">parse_statement</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Statement</span><span class="token punctuation">,</span> <span class="token class-name">ParserError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">match</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">peek_token</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Select</span> <span class="token operator">=></span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_select</span><span class="token punctuation">(</span><span class="token 
punctuation">)</span><span class="token punctuation">.</span><span class="token function">map</span><span class="token punctuation">(</span><span class="token class-name">Statement</span><span class="token punctuation">::</span><span class="token class-name">Select</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Insert</span> <span class="token operator">=></span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_insert</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">map</span><span class="token punctuation">(</span><span class="token class-name">Statement</span><span class="token punctuation">::</span><span class="token class-name">Insert</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Update</span> <span class="token operator">=></span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_update</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">map</span><span class="token punctuation">(</span><span class="token class-name">Statement</span><span class="token punctuation">::</span><span class="token class-name">Update</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Delete</span> <span class="token operator">=></span> <span class="token 
keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_delete</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">map</span><span class="token punctuation">(</span><span class="token class-name">Statement</span><span class="token punctuation">::</span><span class="token class-name">Delete</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Create</span> <span class="token operator">=></span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_create</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">map</span><span class="token punctuation">(</span><span class="token class-name">Statement</span><span class="token punctuation">::</span><span class="token class-name">Create</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            _ <span class="token operator">=></span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">ParserError</span><span class="token punctuation">::</span><span class="token class-name">UnexpectedToken</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">peek_token</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token 
punctuation">)</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">fn</span> <span class="token function-definition function">parse_select</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">SelectStatement</span><span class="token punctuation">,</span> <span class="token class-name">ParserError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Select</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// DISTINCT</span>        <span class="token keyword">let</span> distinct <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Distinct</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// Projections</span>        <span class="token keyword">let</span> projections <span class="token operator">=</span> <span class="token keyword">self</span><span 
class="token punctuation">.</span><span class="token function">parse_projection_list</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// FROM clause</span>        <span class="token keyword">let</span> from <span class="token operator">=</span> <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">From</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_table_with_joins</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">None</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>                <span class="token comment">// WHERE clause</span>        <span class="token keyword">let</span> where_clause <span class="token operator">=</span> <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Where</span><span 
class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_expression</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">None</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>                <span class="token comment">// GROUP BY</span>        <span class="token keyword">let</span> group_by <span class="token operator">=</span> <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Group</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">By</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>            <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_comma_separated</span><span class="token punctuation">(</span><span class="token class-name">Parser</span><span class="token 
punctuation">::</span>parse_expression<span class="token punctuation">)</span><span class="token operator">?</span>        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>                <span class="token comment">// HAVING</span>        <span class="token keyword">let</span> having <span class="token operator">=</span> <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Having</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_expression</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">None</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>                <span class="token comment">// ORDER BY</span>        <span class="token keyword">let</span> order_by <span class="token operator">=</span> <span class="token keyword">if</span> <span 
class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Order</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">By</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>            <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_order_by_list</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span>        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>                <span class="token comment">// LIMIT</span>        <span class="token keyword">let</span> limit <span class="token operator">=</span> <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Limit</span><span 
class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_expression</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">None</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>                <span class="token comment">// OFFSET</span>        <span class="token keyword">let</span> offset <span class="token operator">=</span> <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Offset</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_expression</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">None</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>            
    <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">SelectStatement</span> <span class="token punctuation">&#123;</span>            distinct<span class="token punctuation">,</span>            projections<span class="token punctuation">,</span>            from<span class="token punctuation">,</span>            where_clause<span class="token punctuation">,</span>            group_by<span class="token punctuation">,</span>            having<span class="token punctuation">,</span>            order_by<span class="token punctuation">,</span>            limit<span class="token punctuation">,</span>            offset<span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="带优先级的表达式解析">Expression Parsing with Precedence</h3><p><strong>Challenge:</strong> <code>1 + 2 * 3</code> should parse as <code>1 + (2 * 3)</code>, not <code>(1 + 2) * 3</code>.</p><p><strong>Solution:</strong> Pratt parsing (precedence climbing).</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/sql_parser/parser.rs</span><span class="token keyword">impl</span> <span class="token class-name">Parser</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">parse_expression</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token punctuation">,</span> <span class="token class-name">ParserError</span><span class="token 
operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_precedence</span><span class="token punctuation">(</span><span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">Lowest</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">fn</span> <span class="token function-definition function">parse_precedence</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">,</span> min_precedence<span class="token punctuation">:</span> <span class="token class-name">Precedence</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token punctuation">,</span> <span class="token class-name">ParserError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Parse left side (prefix)</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> left <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_prefix</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token keyword">loop</span> <span class="token punctuation">&#123;</span>            <span class="token comment">// Get operator precedence</span>            <span class="token keyword">let</span> op_precedence 
<span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">get_operator_precedence</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                        <span class="token comment">// Stop if operator has lower precedence</span>            <span class="token keyword">if</span> op_precedence <span class="token operator">&lt;</span> min_precedence <span class="token punctuation">&#123;</span>                <span class="token keyword">break</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>                        <span class="token comment">// Parse operator and right side</span>            left <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_infix</span><span class="token punctuation">(</span>left<span class="token punctuation">,</span> op_precedence<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span>left<span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">fn</span> <span class="token function-definition function">parse_prefix</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token punctuation">,</span> <span class="token 
class-name">ParserError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">match</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">peek_token</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Identifier</span><span class="token punctuation">(</span>name<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// Check for compound identifier (table.column)</span>                <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Dot</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                    <span class="token keyword">if</span> <span class="token keyword">let</span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Identifier</span><span class="token punctuation">(</span>col<span class="token punctuation">)</span> <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">next_token</span><span 
class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span> <span class="token punctuation">&#123;</span>                        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">CompoundIdentifier</span><span class="token punctuation">(</span><span class="token macro property">vec!</span><span class="token punctuation">[</span>                            <span class="token class-name">Ident</span> <span class="token punctuation">&#123;</span> value<span class="token punctuation">:</span> name<span class="token punctuation">,</span> quote_style<span class="token punctuation">:</span> <span class="token class-name">None</span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>                            <span class="token class-name">Ident</span> <span class="token punctuation">&#123;</span> value<span class="token punctuation">:</span> col<span class="token punctuation">,</span> quote_style<span class="token punctuation">:</span> <span class="token class-name">None</span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>                        <span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">)</span>                    <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>                        <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">ParserError</span><span class="token punctuation">::</span><span class="token class-name">ExpectedIdentifier</span><span class="token punctuation">)</span>                    <span class="token punctuation">&#125;</span>                <span class="token 
punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>                    <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">Identifier</span><span class="token punctuation">(</span><span class="token class-name">Ident</span> <span class="token punctuation">&#123;</span> value<span class="token punctuation">:</span> name<span class="token punctuation">,</span> quote_style<span class="token punctuation">:</span> <span class="token class-name">None</span> <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">)</span>                <span class="token punctuation">&#125;</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Integer</span><span class="token punctuation">(</span>n<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">LiteralNumber</span><span class="token punctuation">(</span>n<span class="token punctuation">)</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span 
class="token class-name">Float</span><span class="token punctuation">(</span>n<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">LiteralFloat</span><span class="token punctuation">(</span>n<span class="token punctuation">)</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">String</span><span class="token punctuation">(</span>s<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">LiteralString</span><span class="token punctuation">(</span>s<span class="token punctuation">)</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">True</span> <span 
class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">LiteralBoolean</span><span class="token punctuation">(</span><span class="token boolean">true</span><span class="token punctuation">)</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">False</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">LiteralBoolean</span><span class="token punctuation">(</span><span class="token boolean">false</span><span class="token punctuation">)</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Null</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token 
keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">Null</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Minus</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">let</span> expr <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_precedence</span><span class="token punctuation">(</span><span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">Unary</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">UnaryOp</span> <span class="token punctuation">&#123;</span>                    op<span class="token punctuation">:</span> <span class="token class-name">UnaryOperator</span><span class="token punctuation">::</span><span class="token 
class-name">Minus</span><span class="token punctuation">,</span>                    expr<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>expr<span class="token punctuation">)</span><span class="token punctuation">,</span>                <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Not</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">let</span> expr <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_precedence</span><span class="token punctuation">(</span><span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">Unary</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">UnaryOp</span> <span class="token punctuation">&#123;</span>                    op<span class="token punctuation">:</span> <span class="token class-name">UnaryOperator</span><span class="token punctuation">::</span><span class="token 
class-name">Not</span><span class="token punctuation">,</span>                    expr<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>expr<span class="token punctuation">)</span><span class="token punctuation">,</span>                <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">LParen</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">let</span> expr <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_expression</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">RParen</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token 
punctuation">::</span><span class="token class-name">Nested</span><span class="token punctuation">(</span><span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>expr<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Star</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">Identifier</span><span class="token punctuation">(</span><span class="token class-name">Ident</span> <span class="token punctuation">&#123;</span> value<span class="token punctuation">:</span> <span class="token string">"*"</span><span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> quote_style<span class="token punctuation">:</span> <span class="token class-name">None</span> <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            _ <span class="token operator">=></span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span 
class="token class-name">ParserError</span><span class="token punctuation">::</span><span class="token class-name">UnexpectedToken</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">peek_token</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">fn</span> <span class="token function-definition function">parse_infix</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">,</span> left<span class="token punctuation">:</span> <span class="token class-name">Expression</span><span class="token punctuation">,</span> precedence<span class="token punctuation">:</span> <span class="token class-name">Precedence</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token punctuation">,</span> <span class="token class-name">ParserError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> op_token <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">next_token</span><span class="token punctuation">(</span><span class="token 
punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token keyword">match</span> op_token <span class="token punctuation">&#123;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Plus</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">let</span> right <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_precedence</span><span class="token punctuation">(</span>precedence<span class="token punctuation">.</span><span class="token function">next</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span>                    left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>left<span class="token punctuation">)</span><span class="token punctuation">,</span>                    op<span class="token punctuation">:</span> <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Plus</span><span class="token punctuation">,</span>                    right<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token 
function">new</span><span class="token punctuation">(</span>right<span class="token punctuation">)</span><span class="token punctuation">,</span>                <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Minus</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">let</span> right <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_precedence</span><span class="token punctuation">(</span>precedence<span class="token punctuation">.</span><span class="token function">next</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span>                    left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>left<span class="token punctuation">)</span><span class="token punctuation">,</span>                    op<span class="token punctuation">:</span> <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Minus</span><span class="token punctuation">,</span>                    right<span class="token punctuation">:</span> <span class="token 
class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>right<span class="token punctuation">)</span><span class="token punctuation">,</span>                <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Star</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">let</span> right <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_precedence</span><span class="token punctuation">(</span>precedence<span class="token punctuation">.</span><span class="token function">next</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span>                    left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>left<span class="token punctuation">)</span><span class="token punctuation">,</span>                    op<span class="token punctuation">:</span> <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Multiply</span><span class="token punctuation">,</span>      
              right<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>right<span class="token punctuation">)</span><span class="token punctuation">,</span>                <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Slash</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">let</span> right <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_precedence</span><span class="token punctuation">(</span>precedence<span class="token punctuation">.</span><span class="token function">next</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span>                    left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>left<span class="token punctuation">)</span><span class="token punctuation">,</span>                    op<span class="token punctuation">:</span> <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span 
class="token class-name">Divide</span><span class="token punctuation">,</span>                    right<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>right<span class="token punctuation">)</span><span class="token punctuation">,</span>                <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Eq</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">let</span> right <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_precedence</span><span class="token punctuation">(</span>precedence<span class="token punctuation">.</span><span class="token function">next</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span>                    left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>left<span class="token punctuation">)</span><span class="token punctuation">,</span>                    op<span class="token punctuation">:</span> <span class="token 
class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Eq</span><span class="token punctuation">,</span>                    right<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>right<span class="token punctuation">)</span><span class="token punctuation">,</span>                <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">And</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">let</span> right <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_precedence</span><span class="token punctuation">(</span>precedence<span class="token punctuation">.</span><span class="token function">next</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span>                    left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>left<span class="token punctuation">)</span><span class="token punctuation">,</span>          
          op<span class="token punctuation">:</span> <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">And</span><span class="token punctuation">,</span>                    right<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>right<span class="token punctuation">)</span><span class="token punctuation">,</span>                <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Or</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">let</span> right <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_precedence</span><span class="token punctuation">(</span>precedence<span class="token punctuation">.</span><span class="token function">next</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span>                    left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>left<span 
class="token punctuation">)</span><span class="token punctuation">,</span>                    op<span class="token punctuation">:</span> <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Or</span><span class="token punctuation">,</span>                    right<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>right<span class="token punctuation">)</span><span class="token punctuation">,</span>                <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token comment">// ... more operators</span>            _ <span class="token operator">=></span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">ParserError</span><span class="token punctuation">::</span><span class="token class-name">UnexpectedToken</span><span class="token punctuation">(</span>op_token<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span><span class="token comment">// Precedence levels</span><span class="token attribute attr-name">#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">Precedence</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">Lowest</span><span class="token punctuation">,</span>    <span class="token class-name">Or</span><span class="token punctuation">,</span>           <span class="token comment">// 
OR</span>    <span class="token class-name">And</span><span class="token punctuation">,</span>          <span class="token comment">// AND</span>    <span class="token class-name">Comparison</span><span class="token punctuation">,</span>   <span class="token comment">// =, &lt;, >, &lt;=, >=, &lt;></span>    <span class="token class-name">Concat</span><span class="token punctuation">,</span>       <span class="token comment">// ||</span>    <span class="token class-name">AddSub</span><span class="token punctuation">,</span>       <span class="token comment">// +, -</span>    <span class="token class-name">MulDiv</span><span class="token punctuation">,</span>       <span class="token comment">// *, /</span>    <span class="token class-name">Unary</span><span class="token punctuation">,</span>        <span class="token comment">// NOT, -</span>    <span class="token class-name">Exponent</span><span class="token punctuation">,</span>     <span class="token comment">// ^</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">Precedence</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">next</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Precedence</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">match</span> <span class="token keyword">self</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">Lowest</span> <span class="token operator">=></span> <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token 
class-name">Or</span><span class="token punctuation">,</span>            <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">Or</span> <span class="token operator">=></span> <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">And</span><span class="token punctuation">,</span>            <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">And</span> <span class="token operator">=></span> <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">Comparison</span><span class="token punctuation">,</span>            <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">Comparison</span> <span class="token operator">=></span> <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">Concat</span><span class="token punctuation">,</span>            <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">Concat</span> <span class="token operator">=></span> <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">AddSub</span><span class="token punctuation">,</span>            <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">AddSub</span> <span class="token operator">=></span> <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">MulDiv</span><span class="token punctuation">,</span>            <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">MulDiv</span> 
<span class="token operator">=></span> <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">Unary</span><span class="token punctuation">,</span>            <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">Unary</span> <span class="token operator">=></span> <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">Exponent</span><span class="token punctuation">,</span>            <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">Exponent</span> <span class="token operator">=></span> <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">Exponent</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="解析-DDL：CREATE-TABLE">Parsing DDL: CREATE TABLE</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/sql_parser/parser.rs</span><span class="token keyword">impl</span> <span class="token class-name">Parser</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">parse_create</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Statement</span><span class="token punctuation">,</span> <span class="token class-name">ParserError</span><span 
class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Create</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Table</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_create_table</span><span class="token punctuation">(</span><span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Index</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_create_index</span><span class="token punctuation">(</span><span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span 
class="token punctuation">&#123;</span>            <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">ParserError</span><span class="token punctuation">::</span><span class="token class-name">ExpectedTableOrIndex</span><span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">fn</span> <span class="token function-definition function">parse_create_table</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Statement</span><span class="token punctuation">,</span> <span class="token class-name">ParserError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> if_not_exists <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">If</span><span class="token punctuation">)</span> <span class="token operator">&amp;&amp;</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Not</span><span class="token punctuation">)</span><span class="token operator">?</span><span 
class="token punctuation">;</span>            <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Exists</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>            <span class="token boolean">true</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>                <span class="token keyword">let</span> name <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_object_name</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">LParen</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token keyword">let</span> <span class="token keyword">mut</span> columns <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> constraints <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token 
punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">loop</span> <span class="token punctuation">&#123;</span>            <span class="token comment">// Check for constraint</span>            <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Constraint</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">let</span> constraint <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_table_constraint</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                constraints<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span>constraint<span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Primary</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                <span class="token 
keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Key</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">LParen</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token keyword">let</span> cols <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_comma_separated_identifiers</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">RParen</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                constraints<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token class-name">TableConstraint</span><span class="token punctuation">::</span><span class="token class-name">PrimaryKey</span> <span class="token punctuation">&#123;</span> columns<span 
class="token punctuation">:</span> cols <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>                <span class="token comment">// Column definition</span>                <span class="token keyword">let</span> column <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_column_def</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                columns<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span>column<span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>                        <span class="token keyword">if</span> <span class="token operator">!</span><span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Comma</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">break</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token 
punctuation">::</span><span class="token class-name">RParen</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Statement</span><span class="token punctuation">::</span><span class="token class-name">CreateTable</span><span class="token punctuation">(</span><span class="token class-name">CreateTableStatement</span> <span class="token punctuation">&#123;</span>            name<span class="token punctuation">,</span>            columns<span class="token punctuation">,</span>            constraints<span class="token punctuation">,</span>            if_not_exists<span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">fn</span> <span class="token function-definition function">parse_column_def</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">ColumnDef</span><span class="token punctuation">,</span> <span class="token class-name">ParserError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> name <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_identifier</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span 
class="token punctuation">;</span>        <span class="token keyword">let</span> data_type <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_data_type</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token keyword">let</span> <span class="token keyword">mut</span> options <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">loop</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Not</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Null</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                options<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token class-name">ColumnOption</span><span class="token punctuation">::</span><span 
class="token class-name">NotNull</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Null</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                options<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token class-name">ColumnOption</span><span class="token punctuation">::</span><span class="token class-name">Null</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Default</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">let</span> expr <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_expression</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                options<span class="token punctuation">.</span><span class="token 
function">push</span><span class="token punctuation">(</span><span class="token class-name">ColumnOption</span><span class="token punctuation">::</span><span class="token class-name">Default</span><span class="token punctuation">(</span>expr<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Primary</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Key</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                options<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token class-name">ColumnOption</span><span class="token punctuation">::</span><span class="token class-name">PrimaryKey</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token 
class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Unique</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                options<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token class-name">ColumnOption</span><span class="token punctuation">::</span><span class="token class-name">Unique</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">References</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">let</span> table <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_object_name</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">LParen</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token keyword">let</span> column <span class="token operator">=</span> <span 
class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_identifier</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">RParen</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                options<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token class-name">ColumnOption</span><span class="token punctuation">::</span><span class="token class-name">References</span> <span class="token punctuation">&#123;</span> table<span class="token punctuation">,</span> column <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">break</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">ColumnDef</span> <span class="token punctuation">&#123;</span>            name<span class="token punctuation">,</span>            data_type<span class="token punctuation">,</span>            options<span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span 
class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">fn</span> <span class="token function-definition function">parse_data_type</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">DataType</span><span class="token punctuation">,</span> <span class="token class-name">ParserError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">match</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">next_token</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Identifier</span><span class="token punctuation">(</span>name<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">match</span> name<span class="token punctuation">.</span><span class="token function">to_uppercase</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">as_str</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                    <span class="token string">"BOOLEAN"</span> <span class="token operator">|</span> <span class="token string">"BOOL"</span> <span class="token 
operator">=></span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">DataType</span><span class="token punctuation">::</span><span class="token class-name">Boolean</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                    <span class="token string">"SMALLINT"</span> <span class="token operator">|</span> <span class="token string">"INT2"</span> <span class="token operator">=></span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">DataType</span><span class="token punctuation">::</span><span class="token class-name">SmallInt</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                    <span class="token string">"INTEGER"</span> <span class="token operator">|</span> <span class="token string">"INT"</span> <span class="token operator">|</span> <span class="token string">"INT4"</span> <span class="token operator">=></span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">DataType</span><span class="token punctuation">::</span><span class="token class-name">Integer</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                    <span class="token string">"BIGINT"</span> <span class="token operator">|</span> <span class="token string">"INT8"</span> <span class="token operator">=></span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">DataType</span><span class="token punctuation">::</span><span class="token class-name">BigInt</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                    <span class="token string">"REAL"</span> <span class="token operator">|</span> <span class="token string">"FLOAT4"</span> <span class="token operator">=></span> <span 
class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">DataType</span><span class="token punctuation">::</span><span class="token class-name">Real</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                    <span class="token string">"DOUBLE"</span> <span class="token operator">|</span> <span class="token string">"FLOAT8"</span> <span class="token operator">=></span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">DataType</span><span class="token punctuation">::</span><span class="token class-name">Double</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                    <span class="token string">"TEXT"</span> <span class="token operator">=></span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">DataType</span><span class="token punctuation">::</span><span class="token class-name">Text</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                    <span class="token string">"VARCHAR"</span> <span class="token operator">|</span> <span class="token string">"CHARACTER VARYING"</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                        <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">LParen</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                            <span class="token keyword">let</span> size <span class="token operator">=</span> <span class="token keyword">self</span><span class="token 
punctuation">.</span><span class="token function">parse_integer_literal</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                            <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">RParen</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                            <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">DataType</span><span class="token punctuation">::</span><span class="token class-name">Varchar</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span>size<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">)</span>                        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>                            <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">DataType</span><span class="token punctuation">::</span><span class="token class-name">Varchar</span><span class="token punctuation">(</span><span class="token class-name">None</span><span class="token punctuation">)</span><span class="token punctuation">)</span>                        <span class="token punctuation">&#125;</span>                    <span class="token punctuation">&#125;</span>                    <span class="token string">"TIMESTAMP"</span> <span class="token operator">=></span> <span class="token 
class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">DataType</span><span class="token punctuation">::</span><span class="token class-name">Timestamp</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                    <span class="token string">"DATE"</span> <span class="token operator">=></span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">DataType</span><span class="token punctuation">::</span><span class="token class-name">Date</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                    <span class="token string">"BYTEA"</span> <span class="token operator">=></span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">DataType</span><span class="token punctuation">::</span><span class="token class-name">Bytea</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                    _ <span class="token operator">=></span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">ParserError</span><span class="token punctuation">::</span><span class="token class-name">UnknownDataType</span><span class="token punctuation">(</span>name<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                <span class="token punctuation">&#125;</span>            <span class="token punctuation">&#125;</span>            _ <span class="token operator">=></span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">ParserError</span><span class="token punctuation">::</span><span class="token class-name">ExpectedDataType</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span 
class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="5-完整解析范例">5 Complete Parsing Example</h2><h3 id="解析复杂查询">Parsing a Complex Query</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// Example usage</span><span class="token keyword">let</span> sql <span class="token operator">=</span> <span class="token string">r#"    SELECT         u.id,        u.name,        COUNT(o.id) as order_count,        SUM(o.amount) as total_amount    FROM users u    LEFT JOIN orders o ON u.id = o.user_id    WHERE u.balance > 100 AND u.created_at > '2026-01-01'    GROUP BY u.id, u.name    HAVING COUNT(o.id) > 5    ORDER BY total_amount DESC    LIMIT 10    OFFSET 5"#</span><span class="token punctuation">;</span><span class="token keyword">let</span> <span class="token keyword">mut</span> lexer <span class="token operator">=</span> <span class="token class-name">Lexer</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>sql<span class="token punctuation">)</span><span class="token punctuation">;</span><span class="token keyword">let</span> tokens <span class="token operator">=</span> lexer<span class="token punctuation">.</span><span class="token function">tokenize</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span><span class="token keyword">let</span> <span class="token keyword">mut</span> parser <span class="token operator">=</span> <span class="token class-name">Parser</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>tokens<span class="token punctuation">)</span><span class="token punctuation">;</span><span class="token keyword">let</span> ast <span class="token operator">=</span> parser<span 
class="token punctuation">.</span><span class="token function">parse_statement</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span><span class="token comment">// ast is now a SelectStatement with:</span><span class="token comment">// - 4 projections (id, name, COUNT, SUM)</span><span class="token comment">// - FROM users with LEFT JOIN orders</span><span class="token comment">// - WHERE clause with AND</span><span class="token comment">// - GROUP BY 2 columns</span><span class="token comment">// - HAVING clause</span><span class="token comment">// - ORDER BY with DESC</span><span class="token comment">// - LIMIT and OFFSET</span></code></pre><p><strong>Resulting AST (simplified):</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token class-name">SelectStatement</span> <span class="token punctuation">&#123;</span>    distinct<span class="token punctuation">:</span> <span class="token boolean">false</span><span class="token punctuation">,</span>    projections<span class="token punctuation">:</span> <span class="token punctuation">[</span>        <span class="token class-name">Identifier</span><span class="token punctuation">(</span><span class="token string">"u.id"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token class-name">Identifier</span><span class="token punctuation">(</span><span class="token string">"u.name"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token class-name">ExprWithAlias</span><span class="token punctuation">(</span><span class="token class-name">Function</span><span class="token punctuation">(</span><span class="token string">"COUNT"</span><span class="token punctuation">,</span> <span class="token punctuation">[</span><span class="token class-name">Identifier</span><span 
class="token punctuation">(</span><span class="token string">"o.id"</span><span class="token punctuation">)</span><span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token string">"order_count"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token class-name">ExprWithAlias</span><span class="token punctuation">(</span><span class="token class-name">Function</span><span class="token punctuation">(</span><span class="token string">"SUM"</span><span class="token punctuation">,</span> <span class="token punctuation">[</span><span class="token class-name">Identifier</span><span class="token punctuation">(</span><span class="token string">"o.amount"</span><span class="token punctuation">)</span><span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token string">"total_amount"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token punctuation">]</span><span class="token punctuation">,</span>    from<span class="token punctuation">:</span> <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">TableWithJoins</span> <span class="token punctuation">&#123;</span>        table<span class="token punctuation">:</span> <span class="token class-name">Table</span> <span class="token punctuation">&#123;</span> name<span class="token punctuation">:</span> <span class="token string">"users"</span><span class="token punctuation">,</span> alias<span class="token punctuation">:</span> <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token string">"u"</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        joins<span class="token 
punctuation">:</span> <span class="token punctuation">[</span>            <span class="token class-name">Join</span> <span class="token punctuation">&#123;</span>                relation<span class="token punctuation">:</span> <span class="token class-name">Table</span> <span class="token punctuation">&#123;</span> name<span class="token punctuation">:</span> <span class="token string">"orders"</span><span class="token punctuation">,</span> alias<span class="token punctuation">:</span> <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token string">"o"</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>                join_operator<span class="token punctuation">:</span> <span class="token class-name">LeftOuter</span><span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">]</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    where_clause<span class="token punctuation">:</span> <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span>        left<span class="token punctuation">:</span> <span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span> <span class="token class-name">Identifier</span><span class="token punctuation">(</span><span class="token string">"u.balance"</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Gt</span><span class="token punctuation">,</span> <span class="token class-name">LiteralNumber</span><span class="token punctuation">(</span><span class="token number">100</span><span class="token 
punctuation">)</span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        op<span class="token punctuation">:</span> <span class="token class-name">And</span><span class="token punctuation">,</span>        right<span class="token punctuation">:</span> <span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span> <span class="token class-name">Identifier</span><span class="token punctuation">(</span><span class="token string">"u.created_at"</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Gt</span><span class="token punctuation">,</span> <span class="token class-name">LiteralString</span><span class="token punctuation">(</span><span class="token string">"2026-01-01"</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    group_by<span class="token punctuation">:</span> <span class="token punctuation">[</span><span class="token class-name">Identifier</span><span class="token punctuation">(</span><span class="token string">"u.id"</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Identifier</span><span class="token punctuation">(</span><span class="token string">"u.name"</span><span class="token punctuation">)</span><span class="token punctuation">]</span><span class="token punctuation">,</span>    having<span class="token punctuation">:</span> <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span>        <span class="token class-name">Function</span><span class="token punctuation">(</span><span class="token string">"COUNT"</span><span 
class="token punctuation">,</span> <span class="token punctuation">[</span><span class="token class-name">Identifier</span><span class="token punctuation">(</span><span class="token string">"o.id"</span><span class="token punctuation">)</span><span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Gt</span><span class="token punctuation">,</span> <span class="token class-name">LiteralNumber</span><span class="token punctuation">(</span><span class="token number">5</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    order_by<span class="token punctuation">:</span> <span class="token punctuation">[</span><span class="token class-name">OrderByExpr</span> <span class="token punctuation">&#123;</span> expr<span class="token punctuation">:</span> <span class="token class-name">Identifier</span><span class="token punctuation">(</span><span class="token string">"total_amount"</span><span class="token punctuation">)</span><span class="token punctuation">,</span> asc<span class="token punctuation">:</span> <span class="token boolean">false</span> <span class="token punctuation">&#125;</span><span class="token punctuation">]</span><span class="token punctuation">,</span>    limit<span class="token punctuation">:</span> <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">LiteralNumber</span><span class="token punctuation">(</span><span class="token number">10</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    offset<span class="token punctuation">:</span> <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">LiteralNumber</span><span class="token 
punctuation">(</span><span class="token number">5</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="6-错误处理和恢复">6 Error Handling and Recovery</h2><h3 id="解析器错误">Parser Errors</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/sql_parser/error.rs</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">ParserError</span> <span class="token punctuation">&#123;</span>    <span class="token comment">// Lexer errors</span>    <span class="token class-name">LexerError</span><span class="token punctuation">(</span><span class="token class-name">LexerError</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token comment">// Syntax errors</span>    <span class="token class-name">UnexpectedToken</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">ExpectedToken</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">ExpectedIdentifier</span><span class="token punctuation">,</span>    <span class="token class-name">ExpectedDataType</span><span class="token punctuation">,</span>    <span class="token class-name">ExpectedTableOrIndex</span><span class="token punctuation">,</span>        <span class="token comment">// Semantic errors (detected during parsing)</span>    <span class="token class-name">UnknownDataType</span><span class="token punctuation">(</span><span class="token 
class-name">String</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">UnknownFunction</span><span class="token punctuation">(</span><span class="token class-name">String</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">AmbiguousColumn</span><span class="token punctuation">(</span><span class="token class-name">String</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token comment">// Recovery</span>    <span class="token class-name">UnexpectedEof</span><span class="token punctuation">,</span>    <span class="token class-name">UnmatchedParenthesis</span><span class="token punctuation">,</span>    <span class="token class-name">SyntaxError</span> <span class="token punctuation">&#123;</span> message<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span> position<span class="token punctuation">:</span> <span class="token keyword">usize</span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">LexerError</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">UnterminatedString</span><span class="token punctuation">,</span>    <span class="token class-name">InvalidNumber</span><span class="token punctuation">(</span><span class="token class-name">String</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">UnexpectedChar</span><span class="token punctuation">(</span><span class="token keyword">char</span><span class="token punctuation">)</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">From</span><span class="token operator">&lt;</span><span class="token class-name">LexerError</span><span class="token operator">></span> <span class="token keyword">for</span> <span class="token class-name">ParserError</span> <span 
class="token punctuation">&#123;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">from</span><span class="token punctuation">(</span>err<span class="token punctuation">:</span> <span class="token class-name">LexerError</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>        <span class="token class-name">ParserError</span><span class="token punctuation">::</span><span class="token class-name">LexerError</span><span class="token punctuation">(</span>err<span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="错误恢复策略">Error Recovery Strategy</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token keyword">impl</span> <span class="token class-name">Parser</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">parse_statement_with_recovery</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Statement</span><span class="token punctuation">,</span> <span class="token class-name">ParserError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> start_pos <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>pos<span class="token punctuation">;</span>                <span class="token keyword">match</span> <span class="token 
keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_statement</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Ok</span><span class="token punctuation">(</span>stmt<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token class-name">Ok</span><span class="token punctuation">(</span>stmt<span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token class-name">Err</span><span class="token punctuation">(</span>err<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token comment">// Try to recover: skip to next semicolon or EOF</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">recover_to_statement_boundary</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                                <span class="token comment">// Return error with context (ParserError only derives Debug, so format with &#123;:?&#125;)</span>                <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">ParserError</span><span class="token punctuation">::</span><span class="token class-name">SyntaxError</span> <span class="token punctuation">&#123;</span>                    message<span class="token punctuation">:</span> <span class="token macro property">format!</span><span class="token punctuation">(</span><span class="token string">"Parse error: &#123;:?&#125;"</span><span class="token punctuation">,</span> err<span class="token punctuation">)</span><span class="token punctuation">,</span>                    position<span class="token punctuation">:</span> start_pos<span class="token 
punctuation">,</span>                <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">fn</span> <span class="token function-definition function">recover_to_statement_boundary</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">while</span> <span class="token keyword">self</span><span class="token punctuation">.</span>pos <span class="token operator">&lt;</span> <span class="token keyword">self</span><span class="token punctuation">.</span>tokens<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">match</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">peek_token</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Semicolon</span> <span class="token operator">|</span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Eof</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                    <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token 
punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                    <span class="token keyword">return</span><span class="token punctuation">;</span>                <span class="token punctuation">&#125;</span>                _ <span class="token operator">=></span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="7-用-Rust-构建的挑战">7 Challenges of Building in Rust</h2><h3 id="挑战-1：递归类型">Challenge 1: Recursive Types</h3><p><strong>Problem:</strong> The AST has recursive types (an Expression contains other Expressions).</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ Doesn't compile - infinite size</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">Expression</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span>        left<span class="token punctuation">:</span> <span class="token class-name">Expression</span><span class="token punctuation">,</span>  <span class="token comment">// How big is this?</span>        op<span class="token punctuation">:</span> <span class="token class-name">BinaryOperator</span><span class="token punctuation">,</span>        right<span class="token punctuation">:</span> <span class="token class-name">Expression</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>    <span class="token comment">// ...</span><span class="token 
punctuation">&#125;</span></code></pre><p><strong>Solution: Box for indirection</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ✅ Works - known size (pointer size)</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">Expression</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span>        left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>        op<span class="token punctuation">:</span> <span class="token class-name">BinaryOperator</span><span class="token punctuation">,</span>        right<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>    <span class="token comment">// ...</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="挑战-2：生命周期注解">Challenge 2: Lifetime Annotations</h3><p><strong>Problem:</strong> Tokens are borrowed from the input, and the parser needs to hold references to them.</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ Doesn't compile - lifetime issues</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Parser</span> <span class="token punctuation">&#123;</span>    tokens<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Token</span><span class="token 
punctuation">]</span><span class="token punctuation">,</span>  <span class="token comment">// Borrowed slice</span>    pos<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Solution: owned tokens</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ✅ Works - owns its data</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Parser</span> <span class="token punctuation">&#123;</span>    tokens<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Token</span><span class="token operator">></span><span class="token punctuation">,</span>  <span class="token comment">// Owned vector</span>    pos<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Trade-off:</strong> An extra allocation, but much simpler lifetimes.</p><hr /><h3 id="挑战-3：错误类型复杂性">Challenge 3: Error Type Complexity</h3><p><strong>Problem:</strong> There are many error variants, and pattern matching on all of them at every call site is unwieldy.</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ Unwieldy</span><span class="token keyword">match</span> err <span class="token punctuation">&#123;</span>    <span class="token class-name">ParserError</span><span class="token punctuation">::</span><span class="token class-name">UnexpectedToken</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Select</span><span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token punctuation">...</span> <span class="token 
punctuation">&#125;</span>    <span class="token class-name">ParserError</span><span class="token punctuation">::</span><span class="token class-name">UnexpectedToken</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">From</span><span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token punctuation">...</span> <span class="token punctuation">&#125;</span>    <span class="token class-name">ParserError</span><span class="token punctuation">::</span><span class="token class-name">ExpectedIdentifier</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token punctuation">...</span> <span class="token punctuation">&#125;</span>    <span class="token comment">// ... 50 more cases</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Solution: the Display trait plus context</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ✅ Clean</span><span class="token keyword">impl</span> <span class="token class-name">Display</span> <span class="token keyword">for</span> <span class="token class-name">ParserError</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">fmt</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> f<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token namespace">std<span class="token punctuation">::</span>fmt<span class="token punctuation">::</span></span><span class="token class-name">Formatter</span><span class="token punctuation">)</span> <span class="token 
punctuation">-></span> <span class="token namespace">std<span class="token punctuation">::</span>fmt<span class="token punctuation">::</span></span><span class="token class-name">Result</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">match</span> <span class="token keyword">self</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">ParserError</span><span class="token punctuation">::</span><span class="token class-name">UnexpectedToken</span><span class="token punctuation">(</span>token<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token macro property">write!</span><span class="token punctuation">(</span>f<span class="token punctuation">,</span> <span class="token string">"Unexpected token: &#123;&#125;"</span><span class="token punctuation">,</span> token<span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">ParserError</span><span class="token punctuation">::</span><span class="token class-name">ExpectedIdentifier</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token macro property">write!</span><span class="token punctuation">(</span>f<span class="token punctuation">,</span> <span class="token string">"Expected identifier"</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token comment">// ...</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span><span class="token comment">// Usage</span><span class="token keyword">let</span> result <span class="token operator">=</span> parser<span class="token punctuation">.</span><span class="token 
function">parse_statement</span><span class="token punctuation">(</span><span class="token punctuation">)</span>    <span class="token punctuation">.</span><span class="token function">map_err</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>e<span class="token closure-punctuation punctuation">|</span></span> <span class="token macro property">eprintln!</span><span class="token punctuation">(</span><span class="token string">"Parse error at position &#123;&#125;: &#123;&#125;"</span><span class="token punctuation">,</span> parser<span class="token punctuation">.</span>pos<span class="token punctuation">,</span> e<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span></code></pre><hr /><h2 id="8-AI-如何加速这项工作">8 How AI Accelerated This Work</h2><h3 id="AI-做对了什么">What AI Got Right</h3><table><thead><tr><th>Task</th><th>AI contribution</th></tr></thead><tbody><tr><td><strong>Lexer structure</strong></td><td>Character-by-character tokenization pattern</td></tr><tr><td><strong>Precedence levels</strong></td><td>Correct operator precedence ordering</td></tr><tr><td><strong>AST design</strong></td><td>Comprehensive expression variants</td></tr><tr><td><strong>Error types</strong></td><td>Good categorization of error cases</td></tr></tbody></table><hr /><h3 id="AI-做错了什么">What AI Got Wrong</h3><table><thead><tr><th>Issue</th><th>What happened</th></tr></thead><tbody><tr><td><strong>Compound identifiers</strong></td><td>The first draft didn't handle <code>table.column</code></td></tr><tr><td><strong>JOIN parsing</strong></td><td>Missed the distinction between ON and USING clauses</td></tr>
<tr><td><strong>CASE expressions</strong></td><td>Produced incomplete WHEN/THEN handling</td></tr><tr><td><strong>Precedence climbing</strong></td><td>Suggested recursive descent without precedence (wrong for expressions)</td></tr></tbody></table><p><strong>Pattern:</strong> AI handles the common cases well. Edge cases (compound identifiers, JOIN variants) needed manual refinement.</p><hr /><h3 id="范例：调试表达式解析">Example: Debugging Expression Parsing</h3><p><strong>My question to the AI:</strong></p><blockquote><p>“<code>1 + 2 * 3</code> parses as <code>(1 + 2) * 3</code>. Why?”</p></blockquote><p><strong>What I learned:</strong></p><ol><li>Naive recursive descent does not handle precedence</li><li>Pratt parsing or precedence climbing is needed</li><li>Every operator needs a precedence level</li></ol><p><strong>Result:</strong> I implemented precedence-based parsing:</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token keyword">fn</span> <span class="token function-definition function">parse_expression</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token punctuation">,</span> <span class="token class-name">ParserError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_precedence</span><span class="token punctuation">(</span><span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">Lowest</span><span class="token punctuation">)</span><span class="token punctuation">&#125;</span><span class="token keyword">fn</span> <span class="token function-definition function">parse_precedence</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">,</span> 
min_precedence<span class="token punctuation">:</span> <span class="token class-name">Precedence</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token punctuation">,</span> <span class="token class-name">ParserError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token keyword">let</span> <span class="token keyword">mut</span> left <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_prefix</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token keyword">loop</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> op_precedence <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">get_operator_precedence</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">if</span> op_precedence <span class="token operator">&lt;</span> min_precedence <span class="token punctuation">&#123;</span>            <span class="token keyword">break</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        left <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_infix</span><span class="token punctuation">(</span>left<span class="token punctuation">,</span> op_precedence<span class="token punctuation">.</span><span class="token 
function">next</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token punctuation">&#125;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span>left<span class="token punctuation">)</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="总结：SQL-解析器一张图">Summary: The SQL Parser in One Diagram</h2><pre class="language-MERMAID_BASE64_609" data-language="MERMAID_BASE64_609"><code class="language-MERMAID_BASE64_609">Zmxvd2NoYXJ0IFRECiAgICBzdWJncmFwaCAiTGV4ZXIiCiAgICAgICAgQVtTUUwgU3RyaW5nXSAtLT4gQltDaGFyYWN0ZXIgU3RyZWFtXQogICAgICAgIEIgLS0+IENbVG9rZW4gU3RyZWFtXQogICAgZW5kCiAgICAKICAgIHN1YmdyYXBoICJQYXJzZXIiCiAgICAgICAgQyAtLT4gRFtwYXJzZV9zdGF0ZW1lbnRdCiAgICAgICAgRCAtLT4gRXtTdGF0ZW1lbnQgVHlwZT99CiAgICAgICAgRSAtLT58U0VMRUNUfCBGW3BhcnNlX3NlbGVjdF0KICAgICAgICBFIC0tPnxJTlNFUlR8IEdbcGFyc2VfaW5zZXJ0XQogICAgICAgIEUgLS0+fFVQREFURXwgSFtwYXJzZV91cGRhdGVdCiAgICAgICAgRSAtLT58REVMRVRFfCBJW3BhcnNlX2RlbGV0ZV0KICAgICAgICBFIC0tPnxDUkVBVEV8IEpbcGFyc2VfY3JlYXRlXQogICAgICAgIAogICAgICAgIEYgLS0+IEtbcGFyc2VfZXhwcmVzc2lvbl0KICAgICAgICBLIC0tPiBMW1ByYXR0IFBhcnNpbmddCiAgICAgICAgTCAtLT4gTVtFeHByZXNzaW9uIEFTVF0KICAgIGVuZAogICAgCiAgICBzdWJncmFwaCAiQVNUIgogICAgICAgIE0gLS0+IE5bU2VsZWN0U3RhdGVtZW50XQogICAgICAgIE4gLS0+IE9bUHJvamVjdGlvbnNdCiAgICAgICAgTiAtLT4gUFtGcm9tL0pvaW5zXQogICAgICAgIE4gLS0+IFFbV2hlcmVdCiAgICAgICAgTiAtLT4gUltHcm91cCBCeS9IYXZpbmddCiAgICAgICAgTiAtLT4gU1tPcmRlciBCeS9MaW1pdF0KICAgIGVuZAogICAgCiAgICBzdHlsZSBBIGZpbGw6I2UzZjJmZCxzdHJva2U6IzE5NzZkMgogICAgc3R5bGUgQyBmaWxsOiNlOGY1ZTksc3Ryb2tlOiMzODhlM2MKICAgIHN0eWxlIE4gZmlsbDojZmZmM2UwLHN0cm9rZTojZjU3YzAw</code></pre><p><strong>Key takeaways:</strong></p><table><thead><tr><th>Concept</th><th>Why it matters</th></tr></thead><tbody><tr><td><strong>Lexer</strong></td><td>Tokenizes SQL into meaningful units</td></tr>
<tr><td><strong>Recursive descent</strong></td><td>Top-down parsing, one function per grammar rule</td></tr><tr><td><strong>Pratt parsing</strong></td><td>Handles operator precedence correctly</td></tr><tr><td><strong>AST design</strong></td><td>Structured representation for query planning</td></tr><tr><td><strong>Error recovery</strong></td><td>Keep parsing after an error to produce better messages</td></tr><tr><td><strong>Box for recursion</strong></td><td>Rust requires types of known size</td></tr></tbody></table><hr /><p><strong>Further reading:</strong></p><ul><li>“Crafting Interpreters” by Robert Nystrom (free online) - an excellent parser tutorial</li><li>“Programming Language Pragmatics” by Michael L. Scott - compiler design fundamentals</li><li>PostgreSQL Source: <a href="https://github.com/postgres/postgres/tree/master/src/backend/parser"><code>src/backend/parser/</code></a></li><li>sqlparser-rs: <a href="https://github.com/sqlparser-rs/sqlparser-rs">github.com/sqlparser-rs/sqlparser-rs</a></li><li>Vaultgres Repository: <a href="https://github.com/neoalienson/Vaultgres">github.com/neoalienson/Vaultgres</a></li></ul>]]></content>
    
    
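    <summary type="html">Vaultgres journey, part 6: building a SQL parser from scratch. A deep dive into tokenization, recursive descent parsing, AST design for DDL/DML/queries, and operator precedence handling.</summary>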
    
    
    
    <category term="Development" scheme="https://neo01.com/categories/Development/"/>
    
    
    <category term="PostgreSQL" scheme="https://neo01.com/tags/PostgreSQL/"/>
    
    <category term="Database Internals" scheme="https://neo01.com/tags/Database-Internals/"/>
    
    <category term="Rust" scheme="https://neo01.com/tags/Rust/"/>
    
  </entry>
  
  <entry>
    <title>Building a PostgreSQL-Compatible Database in Rust: A Comprehensive SQL Parser (DDL, DML, Queries)</title>
    <link href="https://neo01.com/zh-TW/2026/03/Database-Rust-SQL-Parser/"/>
    <id>https://neo01.com/zh-TW/2026/03/Database-Rust-SQL-Parser/</id>
    <published>2026-03-05T16:00:00.000Z</published>
    <updated>2026-03-14T03:05:46.382Z</updated>
    
    <content type="html"><![CDATA[<p>In <a href="/zh-TW/2026/03/Database-Rust-Wire-Protocol-Result-Set/">Part 5</a>, we built the PostgreSQL wire protocol. Clients can now connect and send queries. But there is a problem.</p><p><strong>We receive a SQL string. Now what?</strong></p><pre class="language-none"><code class="language-none">Client: &quot;SELECT id, name FROM users WHERE balance &gt; 100 ORDER BY name LIMIT 10&quot;
Server: ???</code></pre><p>We could use an existing parser (<code>sqlparser-rs</code>, <code>peg</code>, etc.). But building our own teaches us how SQL actually works.</p><p>Today: building a comprehensive SQL parser in Rust, from lexer to AST, covering DDL, DML, and queries.</p><hr /><h2 id="1-為什麼建構-SQL-解析器？">1 Why Build a SQL Parser?</h2><h3 id="替代方案">The Alternatives</h3><table><thead><tr><th>Approach</th><th>Pros</th><th>Cons</th></tr></thead><tbody><tr><td><strong>sqlparser-rs</strong></td><td>Production-ready, PostgreSQL dialect</td><td>Black box, hard to customize</td></tr><tr><td><strong>peg/lalrpop</strong></td><td>The generator handles the grammar</td><td>Learning curve, complex debugging</td></tr><tr><td><strong>Hand-written</strong></td><td>Full control, educational</td><td>Time-consuming, error-prone</td></tr></tbody></table><p><strong>Vaultgres choice:</strong> a hand-written recursive descent parser.</p><p><strong>Why?</strong></p><table><thead><tr><th>Reason</th><th>Explanation</th></tr></thead><tbody><tr><td><strong>Learning</strong></td><td>Deep understanding of SQL grammar</td></tr><tr><td><strong>Control</strong></td><td>Easy to add custom extensions</td></tr><tr><td><strong>Error messages</strong></td><td>Better than generator defaults</td></tr><tr><td><strong>Integration</strong></td><td>Direct AST → execution plan</td></tr></tbody></table><hr /><h3 id="解析器架構">Parser Architecture</h3><pre class="language-none"><code class="language-none">┌─────────────────────────────────────────────────────────────┐
│                    SQL Parser Pipeline                       │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  SQL String                                                  │
│     │                                                        │
│     ▼                                                        │
│  ┌─────────────┐                                            │
│  │   Lexer     │  Tokenize: &quot;SELECT&quot; → Token::Select        │
│  │ (Tokenizer) │  &quot;123&quot; → Token::Integer(123)               │
│  └──────┬──────┘                                            │
│         │                                                    │
│         ▼                                                    │
│  ┌─────────────┐                                            │
│  │   Parser    │  Recursive descent:                        │
│  │             │  parse_statement() → parse_select() → ...  │
│  └──────┬──────┘                                            │
│         │                                                    │
│         ▼                                                    │
│  ┌─────────────┐                                            │
│  │     AST     │  Structured representation:                │
│  │             │  SelectStatement &#123; projections, from, ...&#125; │
│  └─────────────┘                                            │
│                                                              │
└─────────────────────────────────────────────────────────────┘</code></pre><hr /><h2 id="2-詞法分析器：SQL-詞法分析">2 The Lexer: Tokenizing SQL</h2><h3 id="令牌類型">Token Types</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/sql_parser/lexer.rs</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">Token</span> <span class="token punctuation">&#123;</span>    <span class="token comment">// Keywords</span>    <span class="token class-name">Select</span><span class="token punctuation">,</span>    <span class="token class-name">From</span><span class="token punctuation">,</span>    <span class="token class-name">Where</span><span class="token punctuation">,</span>    <span class="token class-name">Insert</span><span class="token punctuation">,</span>    <span class="token class-name">Into</span><span class="token punctuation">,</span>    <span 
class="token class-name">Values</span><span class="token punctuation">,</span>    <span class="token class-name">Update</span><span class="token punctuation">,</span>    <span class="token class-name">Set</span><span class="token punctuation">,</span>    <span class="token class-name">Delete</span><span class="token punctuation">,</span>    <span class="token class-name">Create</span><span class="token punctuation">,</span>    <span class="token class-name">Table</span><span class="token punctuation">,</span>    <span class="token class-name">Index</span><span class="token punctuation">,</span>    <span class="token class-name">On</span><span class="token punctuation">,</span>    <span class="token class-name">As</span><span class="token punctuation">,</span>    <span class="token class-name">And</span><span class="token punctuation">,</span>    <span class="token class-name">Or</span><span class="token punctuation">,</span>    <span class="token class-name">Not</span><span class="token punctuation">,</span>    <span class="token class-name">Null</span><span class="token punctuation">,</span>    <span class="token class-name">True</span><span class="token punctuation">,</span>    <span class="token class-name">False</span><span class="token punctuation">,</span>    <span class="token class-name">Primary</span><span class="token punctuation">,</span>    <span class="token class-name">Key</span><span class="token punctuation">,</span>    <span class="token class-name">References</span><span class="token punctuation">,</span>    <span class="token class-name">Default</span><span class="token punctuation">,</span>    <span class="token class-name">Unique</span><span class="token punctuation">,</span>    <span class="token class-name">Check</span><span class="token punctuation">,</span>    <span class="token class-name">Constraint</span><span class="token punctuation">,</span>    <span class="token class-name">Join</span><span class="token punctuation">,</span>    <span 
class="token class-name">Inner</span><span class="token punctuation">,</span>    <span class="token class-name">Left</span><span class="token punctuation">,</span>    <span class="token class-name">Right</span><span class="token punctuation">,</span>    <span class="token class-name">Outer</span><span class="token punctuation">,</span>    <span class="token class-name">Order</span><span class="token punctuation">,</span>    <span class="token class-name">By</span><span class="token punctuation">,</span>    <span class="token class-name">Asc</span><span class="token punctuation">,</span>    <span class="token class-name">Desc</span><span class="token punctuation">,</span>    <span class="token class-name">Group</span><span class="token punctuation">,</span>    <span class="token class-name">Having</span><span class="token punctuation">,</span>    <span class="token class-name">Limit</span><span class="token punctuation">,</span>    <span class="token class-name">Offset</span><span class="token punctuation">,</span>    <span class="token class-name">Distinct</span><span class="token punctuation">,</span>        <span class="token comment">// Literals</span>    <span class="token class-name">Integer</span><span class="token punctuation">(</span><span class="token keyword">i64</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">Float</span><span class="token punctuation">(</span><span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">String</span><span class="token punctuation">(</span><span class="token class-name">String</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">Identifier</span><span class="token punctuation">(</span><span class="token class-name">String</span><span class="token punctuation">)</span><span class="token 
punctuation">,</span>        <span class="token comment">// Operators</span>    <span class="token class-name">Plus</span><span class="token punctuation">,</span>    <span class="token class-name">Minus</span><span class="token punctuation">,</span>    <span class="token class-name">Star</span><span class="token punctuation">,</span>    <span class="token class-name">Slash</span><span class="token punctuation">,</span>    <span class="token class-name">Eq</span><span class="token punctuation">,</span>    <span class="token class-name">Neq</span><span class="token punctuation">,</span>    <span class="token class-name">Lt</span><span class="token punctuation">,</span>    <span class="token class-name">Lte</span><span class="token punctuation">,</span>    <span class="token class-name">Gt</span><span class="token punctuation">,</span>    <span class="token class-name">Gte</span><span class="token punctuation">,</span>    <span class="token class-name">Arrow</span><span class="token punctuation">,</span>      <span class="token comment">// -></span>    <span class="token class-name">DoubleArrow</span><span class="token punctuation">,</span> <span class="token comment">// ->></span>        <span class="token comment">// Punctuation</span>    <span class="token class-name">Comma</span><span class="token punctuation">,</span>    <span class="token class-name">Semicolon</span><span class="token punctuation">,</span>    <span class="token class-name">LParen</span><span class="token punctuation">,</span>    <span class="token class-name">RParen</span><span class="token punctuation">,</span>    <span class="token class-name">Dot</span><span class="token punctuation">,</span>        <span class="token comment">// Special</span>    <span class="token class-name">Eof</span><span class="token punctuation">,</span>    <span class="token class-name">Unknown</span><span class="token punctuation">(</span><span class="token keyword">char</span><span class="token 
punctuation">)</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="詞法分析器實作">Lexer Implementation</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/sql_parser/lexer.rs</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Lexer</span> <span class="token punctuation">&#123;</span>    input<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">char</span><span class="token operator">></span><span class="token punctuation">,</span>    pos<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">Lexer</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">new</span><span class="token punctuation">(</span>input<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>            input<span class="token punctuation">:</span> input<span class="token punctuation">.</span><span class="token function">chars</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token 
punctuation">,</span>            pos<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">tokenize</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Token</span><span class="token operator">></span><span class="token punctuation">,</span> <span class="token class-name">LexerError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> tokens <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">while</span> <span class="token keyword">let</span> <span class="token class-name">Some</span><span class="token punctuation">(</span>token<span class="token punctuation">)</span> <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">next_token</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span> <span class="token punctuation">&#123;</span>            
<span class="token keyword">if</span> token <span class="token operator">==</span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Eof</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">break</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>            tokens<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span>token<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                tokens<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Eof</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span>tokens<span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">fn</span> <span class="token function-definition function">next_token</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Token</span><span class="token operator">></span><span class="token punctuation">,</span> <span class="token class-name">LexerError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        
<span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">skip_whitespace</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span>pos <span class="token operator">>=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>input<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">return</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Eof</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token keyword">let</span> ch <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">current_char</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">match</span> ch <span class="token punctuation">&#123;</span>            <span class="token comment">// Single-character tokens</span>            <span class="token char">'+'</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token 
function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Plus</span><span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span>            <span class="token char">'-'</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Minus</span><span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span>            <span class="token char">'*'</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token 
class-name">Star</span><span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span>            <span class="token char">'/'</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Slash</span><span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span>            <span class="token char">','</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Comma</span><span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span>            <span class="token char">';'</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span 
class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Semicolon</span><span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span>            <span class="token char">'('</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">LParen</span><span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span>            <span class="token char">')'</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">RParen</span><span 
class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span>            <span class="token char">'.'</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Dot</span><span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span>                        <span class="token comment">// Multi-character operators</span>            <span class="token char">'='</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Eq</span><span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span>            <span class="token char">'&lt;'</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token 
punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">match</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">current_char</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                    <span class="token char">'='</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Lte</span><span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span>                    <span class="token char">'>'</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Neq</span><span class="token 
punctuation">)</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span>                    _ <span class="token operator">=></span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Lt</span><span class="token punctuation">)</span><span class="token punctuation">)</span>                <span class="token punctuation">&#125;</span>            <span class="token punctuation">&#125;</span>            <span class="token char">'>'</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">match</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">current_char</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                    <span class="token char">'='</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span 
class="token class-name">Gte</span><span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span>                    _ <span class="token operator">=></span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Gt</span><span class="token punctuation">)</span><span class="token punctuation">)</span>                <span class="token punctuation">&#125;</span>            <span class="token punctuation">&#125;</span>            <span class="token char">'!'</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">current_char</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">==</span> <span class="token char">'='</span> <span class="token punctuation">&#123;</span>                    <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                    <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token 
punctuation">::</span><span class="token class-name">Neq</span><span class="token punctuation">)</span><span class="token punctuation">)</span>                <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>                    <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">LexerError</span><span class="token punctuation">::</span><span class="token class-name">UnexpectedChar</span><span class="token punctuation">(</span><span class="token char">'!'</span><span class="token punctuation">)</span><span class="token punctuation">)</span>                <span class="token punctuation">&#125;</span>            <span class="token punctuation">&#125;</span>                        <span class="token comment">// String literals</span>            <span class="token char">'\''</span> <span class="token operator">=></span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">read_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                        <span class="token comment">// Identifiers and keywords</span>            ch <span class="token keyword">if</span> ch<span class="token punctuation">.</span><span class="token function">is_alphabetic</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">||</span> ch <span class="token operator">==</span> <span class="token char">'_'</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">read_identifier</span><span class="token punctuation">(</span><span class="token punctuation">)</span>            <span class="token 
punctuation">&#125;</span>                        <span class="token comment">// Numbers</span>            ch <span class="token keyword">if</span> ch<span class="token punctuation">.</span><span class="token function">is_numeric</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">read_number</span><span class="token punctuation">(</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>                        <span class="token comment">// Unknown</span>            ch <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Unknown</span><span class="token punctuation">(</span>ch<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">fn</span> <span class="token function-definition function">read_string</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token 
keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Token</span><span class="token operator">></span><span class="token punctuation">,</span> <span class="token class-name">LexerError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token comment">// Skip opening quote</span>                <span class="token keyword">let</span> <span class="token keyword">mut</span> value <span class="token operator">=</span> <span class="token class-name">String</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">while</span> <span class="token keyword">self</span><span class="token punctuation">.</span>pos <span class="token operator">&lt;</span> <span class="token keyword">self</span><span class="token punctuation">.</span>input<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">&amp;&amp;</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">current_char</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">!=</span> <span class="token char">'\''</span> <span class="token 
punctuation">&#123;</span>            value<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">current_char</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span>pos <span class="token operator">>=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>input<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">return</span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">LexerError</span><span class="token punctuation">::</span><span class="token class-name">UnterminatedString</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token comment">// Skip closing quote</span>        <span class="token 
class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">String</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">fn</span> <span class="token function-definition function">read_identifier</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Token</span><span class="token operator">></span><span class="token punctuation">,</span> <span class="token class-name">LexerError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> start <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>pos<span class="token punctuation">;</span>                <span class="token keyword">while</span> <span class="token keyword">self</span><span class="token punctuation">.</span>pos <span class="token operator">&lt;</span> <span class="token keyword">self</span><span class="token punctuation">.</span>input<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> 
ch <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">current_char</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token keyword">if</span> ch<span class="token punctuation">.</span><span class="token function">is_alphanumeric</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">||</span> ch <span class="token operator">==</span> <span class="token char">'_'</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">break</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>                <span class="token keyword">let</span> value<span class="token punctuation">:</span> <span class="token class-name">String</span> <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>input<span class="token punctuation">[</span>start<span class="token punctuation">..</span><span class="token keyword">self</span><span class="token punctuation">.</span>pos<span class="token punctuation">]</span><span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token 
function">collect</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// Check if it's a keyword</span>        <span class="token keyword">let</span> token <span class="token operator">=</span> <span class="token keyword">match</span> value<span class="token punctuation">.</span><span class="token function">to_uppercase</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">as_str</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token string">"SELECT"</span> <span class="token operator">=></span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Select</span><span class="token punctuation">,</span>            <span class="token string">"FROM"</span> <span class="token operator">=></span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">From</span><span class="token punctuation">,</span>            <span class="token string">"WHERE"</span> <span class="token operator">=></span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Where</span><span class="token punctuation">,</span>            <span class="token string">"INSERT"</span> <span class="token operator">=></span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Insert</span><span class="token punctuation">,</span>            <span class="token string">"UPDATE"</span> <span class="token operator">=></span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token 
class-name">Update</span><span class="token punctuation">,</span>            <span class="token string">"DELETE"</span> <span class="token operator">=></span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Delete</span><span class="token punctuation">,</span>            <span class="token string">"CREATE"</span> <span class="token operator">=></span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Create</span><span class="token punctuation">,</span>            <span class="token string">"TABLE"</span> <span class="token operator">=></span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Table</span><span class="token punctuation">,</span>            <span class="token string">"INDEX"</span> <span class="token operator">=></span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Index</span><span class="token punctuation">,</span>            <span class="token string">"PRIMARY"</span> <span class="token operator">=></span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Primary</span><span class="token punctuation">,</span>            <span class="token string">"KEY"</span> <span class="token operator">=></span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Key</span><span class="token punctuation">,</span>            <span class="token string">"NULL"</span> <span class="token operator">=></span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Null</span><span class="token punctuation">,</span>            <span class="token string">"TRUE"</span> <span class="token operator">=></span> <span class="token 
class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">True</span><span class="token punctuation">,</span>            <span class="token string">"FALSE"</span> <span class="token operator">=></span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">False</span><span class="token punctuation">,</span>            _ <span class="token operator">=></span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Identifier</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span>token<span class="token punctuation">)</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">fn</span> <span class="token function-definition function">read_number</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Token</span><span class="token operator">></span><span class="token punctuation">,</span> <span class="token class-name">LexerError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> start <span class="token operator">=</span> <span 
class="token keyword">self</span><span class="token punctuation">.</span>pos<span class="token punctuation">;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> is_float <span class="token operator">=</span> <span class="token boolean">false</span><span class="token punctuation">;</span>                <span class="token keyword">while</span> <span class="token keyword">self</span><span class="token punctuation">.</span>pos <span class="token operator">&lt;</span> <span class="token keyword">self</span><span class="token punctuation">.</span>input<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> ch <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">current_char</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token keyword">if</span> ch<span class="token punctuation">.</span><span class="token function">is_numeric</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token keyword">if</span> ch <span class="token operator">==</span> <span class="token char">'.'</span> <span class="token operator">&amp;&amp;</span> <span class="token operator">!</span>is_float <span class="token 
punctuation">&#123;</span>                is_float <span class="token operator">=</span> <span class="token boolean">true</span><span class="token punctuation">;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">break</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>                <span class="token keyword">let</span> value<span class="token punctuation">:</span> <span class="token class-name">String</span> <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>input<span class="token punctuation">[</span>start<span class="token punctuation">..</span><span class="token keyword">self</span><span class="token punctuation">.</span>pos<span class="token punctuation">]</span><span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">if</span> is_float <span class="token punctuation">&#123;</span>            <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token 
class-name">Float</span><span class="token punctuation">(</span>value<span class="token punctuation">.</span><span class="token function">parse</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Integer</span><span class="token punctuation">(</span>value<span class="token punctuation">.</span><span class="token function">parse</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">fn</span> <span class="token function-definition function">skip_whitespace</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">while</span> <span class="token keyword">self</span><span class="token punctuation">.</span>pos <span class="token operator">&lt;</span> <span class="token keyword">self</span><span class="token punctuation">.</span>input<span class="token punctuation">.</span><span class="token function">len</span><span 
class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">&amp;&amp;</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">current_char</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">is_whitespace</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">fn</span> <span class="token function-definition function">current_char</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">char</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>input<span class="token punctuation">[</span><span class="token keyword">self</span><span class="token punctuation">.</span>pos<span class="token punctuation">]</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">fn</span> <span class="token function-definition function">advance</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>        <span class="token 
keyword">self</span><span class="token punctuation">.</span>pos <span class="token operator">+=</span> <span class="token number">1</span><span class="token punctuation">;</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="詞法分析器範例">Lexer Example</h3><pre class="language-none"><code class="language-none">Input: &quot;SELECT id, name FROM users WHERE balance &gt; 100&quot;

Tokens:
[
    Select,
    Identifier(&quot;id&quot;),
    Comma,
    Identifier(&quot;name&quot;),
    From,
    Identifier(&quot;users&quot;),
    Where,
    Identifier(&quot;balance&quot;),
    Gt,
    Integer(100),
    Eof
]</code></pre><hr /><h2 id="3-AST：抽象語法樹">3 AST: Abstract Syntax Tree</h2><h3 id="語句類型">Statement Types</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/sql_parser/ast.rs</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">Statement</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">Select</span><span class="token punctuation">(</span><span class="token class-name">SelectStatement</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">Insert</span><span class="token punctuation">(</span><span class="token class-name">InsertStatement</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">Update</span><span class="token punctuation">(</span><span class="token class-name">UpdateStatement</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">Delete</span><span class="token punctuation">(</span><span class="token class-name">DeleteStatement</span><span class="token punctuation">)</span><span 
class="token punctuation">,</span>    <span class="token class-name">CreateTable</span><span class="token punctuation">(</span><span class="token class-name">CreateTableStatement</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">CreateIndex</span><span class="token punctuation">(</span><span class="token class-name">CreateIndexStatement</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">DropTable</span><span class="token punctuation">(</span><span class="token class-name">DropTableStatement</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">DropIndex</span><span class="token punctuation">(</span><span class="token class-name">DropIndexStatement</span><span class="token punctuation">)</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="SELECT-語句">SELECT Statement</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">SelectStatement</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> distinct<span class="token punctuation">:</span> <span class="token keyword">bool</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> projections<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">SelectItem</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> from<span class="token punctuation">:</span> <span class="token 
class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">TableWithJoins</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> where_clause<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> group_by<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> having<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> order_by<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">OrderByExpr</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> limit<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> offset<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span><span 
class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">SelectItem</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">UnnamedExpr</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">ExprWithAlias</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">,</span> <span class="token class-name">Ident</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">Wildcard</span><span class="token punctuation">,</span>  <span class="token comment">// SELECT *</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Ident</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> value<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> quote_style<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token keyword">char</span><span class="token operator">></span><span class="token punctuation">,</span>  <span class="token comment">// "quoted" vs unquoted</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token 
keyword">struct</span> <span class="token type-definition class-name">TableWithJoins</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> table<span class="token punctuation">:</span> <span class="token class-name">TableFactor</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> joins<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Join</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">TableFactor</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">Table</span> <span class="token punctuation">&#123;</span> name<span class="token punctuation">:</span> <span class="token class-name">ObjectName</span><span class="token punctuation">,</span> alias<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">TableAlias</span><span class="token operator">></span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>    <span class="token class-name">Subquery</span> <span class="token punctuation">&#123;</span> query<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">SelectStatement</span><span class="token operator">></span><span class="token punctuation">,</span> alias<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">TableAlias</span><span 
class="token operator">></span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">TableAlias</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> name<span class="token punctuation">:</span> <span class="token class-name">Ident</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> columns<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Ident</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Join</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> relation<span class="token punctuation">:</span> <span class="token class-name">TableFactor</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> join_operator<span class="token punctuation">:</span> <span class="token class-name">JoinOperator</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">JoinOperator</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">Inner</span><span class="token punctuation">,</span>    <span 
class="token class-name">LeftOuter</span><span class="token punctuation">,</span>    <span class="token class-name">RightOuter</span><span class="token punctuation">,</span>    <span class="token class-name">FullOuter</span><span class="token punctuation">,</span>    <span class="token class-name">Cross</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="DML-語句">DML Statements</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// INSERT</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">InsertStatement</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> table<span class="token punctuation">:</span> <span class="token class-name">ObjectName</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> columns<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Ident</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> values<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">>></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> returning<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">SelectItem</span><span class="token operator">></span><span class="token punctuation">,</span><span 
class="token punctuation">&#125;</span><span class="token comment">// UPDATE</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">UpdateStatement</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> table<span class="token punctuation">:</span> <span class="token class-name">ObjectName</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> assignments<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Assignment</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> where_clause<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> returning<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">SelectItem</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Assignment</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> column<span class="token punctuation">:</span> <span class="token class-name">Ident</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> value<span class="token 
punctuation">:</span> <span class="token class-name">Expression</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token comment">// DELETE</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">DeleteStatement</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> table<span class="token punctuation">:</span> <span class="token class-name">ObjectName</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> where_clause<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> returning<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">SelectItem</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="DDL-語句">DDL Statements</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// CREATE TABLE</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">CreateTableStatement</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> name<span class="token punctuation">:</span> <span class="token class-name">ObjectName</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> columns<span 
class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">ColumnDef</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> constraints<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">TableConstraint</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> if_not_exists<span class="token punctuation">:</span> <span class="token keyword">bool</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">ColumnDef</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> name<span class="token punctuation">:</span> <span class="token class-name">Ident</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> data_type<span class="token punctuation">:</span> <span class="token class-name">DataType</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> options<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">ColumnOption</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">DataType</span> <span class="token 
punctuation">&#123;</span>    <span class="token class-name">Boolean</span><span class="token punctuation">,</span>    <span class="token class-name">SmallInt</span><span class="token punctuation">,</span>    <span class="token class-name">Integer</span><span class="token punctuation">,</span>    <span class="token class-name">BigInt</span><span class="token punctuation">,</span>    <span class="token class-name">Real</span><span class="token punctuation">,</span>    <span class="token class-name">Double</span><span class="token punctuation">,</span>    <span class="token class-name">Text</span><span class="token punctuation">,</span>    <span class="token class-name">Varchar</span><span class="token punctuation">(</span><span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token keyword">u32</span><span class="token operator">></span><span class="token punctuation">)</span><span class="token punctuation">,</span>  <span class="token comment">// None = VARCHAR, Some(n) = VARCHAR(n)</span>    <span class="token class-name">Timestamp</span><span class="token punctuation">,</span>    <span class="token class-name">Date</span><span class="token punctuation">,</span>    <span class="token class-name">Bytea</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">ColumnOption</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">NotNull</span><span class="token punctuation">,</span>    <span class="token class-name">Null</span><span class="token punctuation">,</span>    <span class="token class-name">Default</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">)</span><span class="token 
punctuation">,</span>    <span class="token class-name">Unique</span><span class="token punctuation">,</span>    <span class="token class-name">PrimaryKey</span><span class="token punctuation">,</span>    <span class="token class-name">References</span> <span class="token punctuation">&#123;</span> table<span class="token punctuation">:</span> <span class="token class-name">ObjectName</span><span class="token punctuation">,</span> column<span class="token punctuation">:</span> <span class="token class-name">Ident</span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">TableConstraint</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">PrimaryKey</span> <span class="token punctuation">&#123;</span> columns<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Ident</span><span class="token operator">></span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>    <span class="token class-name">Unique</span> <span class="token punctuation">&#123;</span> columns<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Ident</span><span class="token operator">></span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>    <span class="token class-name">Check</span> <span class="token punctuation">&#123;</span> expression<span class="token punctuation">:</span> <span class="token class-name">Expression</span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span><span 
class="token punctuation">&#125;</span><span class="token comment">// CREATE INDEX</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">CreateIndexStatement</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> name<span class="token punctuation">:</span> <span class="token class-name">Ident</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> table<span class="token punctuation">:</span> <span class="token class-name">ObjectName</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> columns<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">OrderByExpr</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> unique<span class="token punctuation">:</span> <span class="token keyword">bool</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> if_not_exists<span class="token punctuation">:</span> <span class="token keyword">bool</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="運算式：SQL-的核心">Expressions: The Core of SQL</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/sql_parser/ast.rs</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">Expression</span> <span class="token punctuation">&#123;</span>    <span class="token comment">// Identifiers and literals</span>    <span class="token class-name">Identifier</span><span class="token 
punctuation">(</span><span class="token class-name">Ident</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">CompoundIdentifier</span><span class="token punctuation">(</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Ident</span><span class="token operator">></span><span class="token punctuation">)</span><span class="token punctuation">,</span>  <span class="token comment">// table.column</span>    <span class="token class-name">LiteralNumber</span><span class="token punctuation">(</span><span class="token keyword">i64</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">LiteralFloat</span><span class="token punctuation">(</span><span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">LiteralString</span><span class="token punctuation">(</span><span class="token class-name">String</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">LiteralBoolean</span><span class="token punctuation">(</span><span class="token keyword">bool</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">Null</span><span class="token punctuation">,</span>        <span class="token comment">// Operators</span>    <span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span>        left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>        op<span class="token punctuation">:</span> <span class="token 
class-name">BinaryOperator</span><span class="token punctuation">,</span>        right<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>    <span class="token class-name">UnaryOp</span> <span class="token punctuation">&#123;</span>        op<span class="token punctuation">:</span> <span class="token class-name">UnaryOperator</span><span class="token punctuation">,</span>        expr<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">// Function calls</span>    <span class="token class-name">Function</span> <span class="token punctuation">&#123;</span>        name<span class="token punctuation">:</span> <span class="token class-name">Ident</span><span class="token punctuation">,</span>        args<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">FunctionArg</span><span class="token operator">></span><span class="token punctuation">,</span>        distinct<span class="token punctuation">:</span> <span class="token keyword">bool</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">// Subqueries</span>    <span class="token class-name">Subquery</span><span class="token punctuation">(</span><span class="token class-name">Box</span><span class="token 
operator">&lt;</span><span class="token class-name">SelectStatement</span><span class="token operator">></span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token comment">// CASE expressions</span>    <span class="token class-name">Case</span> <span class="token punctuation">&#123;</span>        operand<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">>></span><span class="token punctuation">,</span>        conditions<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">WhenClause</span><span class="token operator">></span><span class="token punctuation">,</span>        else_result<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">>></span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">// IN, BETWEEN, LIKE</span>    <span class="token class-name">InList</span> <span class="token punctuation">&#123;</span>        expr<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>        list<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token 
class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>        negated<span class="token punctuation">:</span> <span class="token keyword">bool</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>    <span class="token class-name">InSubquery</span> <span class="token punctuation">&#123;</span>        expr<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>        subquery<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">SelectStatement</span><span class="token operator">></span><span class="token punctuation">,</span>        negated<span class="token punctuation">:</span> <span class="token keyword">bool</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>    <span class="token class-name">Between</span> <span class="token punctuation">&#123;</span>        expr<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>        low<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>        high<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token 
operator">></span><span class="token punctuation">,</span>        negated<span class="token punctuation">:</span> <span class="token keyword">bool</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>    <span class="token class-name">Like</span> <span class="token punctuation">&#123;</span>        expr<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>        pattern<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>        negated<span class="token punctuation">:</span> <span class="token keyword">bool</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">// CAST</span>    <span class="token class-name">Cast</span> <span class="token punctuation">&#123;</span>        expr<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>        data_type<span class="token punctuation">:</span> <span class="token class-name">DataType</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">// Parenthesized expressions</span>    <span class="token class-name">Nested</span><span class="token punctuation">(</span><span class="token class-name">Box</span><span class="token operator">&lt;</span><span 
class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">)</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">BinaryOperator</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">Plus</span><span class="token punctuation">,</span>    <span class="token class-name">Minus</span><span class="token punctuation">,</span>    <span class="token class-name">Multiply</span><span class="token punctuation">,</span>    <span class="token class-name">Divide</span><span class="token punctuation">,</span>    <span class="token class-name">Eq</span><span class="token punctuation">,</span>    <span class="token class-name">Neq</span><span class="token punctuation">,</span>    <span class="token class-name">Lt</span><span class="token punctuation">,</span>    <span class="token class-name">Lte</span><span class="token punctuation">,</span>    <span class="token class-name">Gt</span><span class="token punctuation">,</span>    <span class="token class-name">Gte</span><span class="token punctuation">,</span>    <span class="token class-name">And</span><span class="token punctuation">,</span>    <span class="token class-name">Or</span><span class="token punctuation">,</span>    <span class="token class-name">Like</span><span class="token punctuation">,</span>    <span class="token class-name">NotLike</span><span class="token punctuation">,</span>    <span class="token class-name">Concat</span><span class="token punctuation">,</span>  <span class="token comment">// ||</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token 
keyword">enum</span> <span class="token type-definition class-name">UnaryOperator</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">Plus</span><span class="token punctuation">,</span>    <span class="token class-name">Minus</span><span class="token punctuation">,</span>    <span class="token class-name">Not</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">WhenClause</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> condition<span class="token punctuation">:</span> <span class="token class-name">Expression</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> result<span class="token punctuation">:</span> <span class="token class-name">Expression</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">FunctionArg</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">Named</span> <span class="token punctuation">&#123;</span> name<span class="token punctuation">:</span> <span class="token class-name">Ident</span><span class="token punctuation">,</span> arg<span class="token punctuation">:</span> <span class="token class-name">Expression</span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>    <span class="token class-name">Unnamed</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">)</span><span class="token 
punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="4-解析器：遞歸下降">4 Parser: Recursive Descent</h2><h3 id="解析器結構">Parser Structure</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/sql_parser/parser.rs</span><span class="token keyword">use</span> <span class="token keyword">crate</span><span class="token module-declaration namespace"><span class="token punctuation">::</span>sql_parser<span class="token punctuation">::</span>lexer<span class="token punctuation">::</span></span><span class="token punctuation">&#123;</span><span class="token class-name">Lexer</span><span class="token punctuation">,</span> <span class="token class-name">Token</span><span class="token punctuation">&#125;</span><span class="token punctuation">;</span><span class="token keyword">use</span> <span class="token keyword">crate</span><span class="token module-declaration namespace"><span class="token punctuation">::</span>sql_parser<span class="token punctuation">::</span>ast<span class="token punctuation">::</span></span><span class="token operator">*</span><span class="token punctuation">;</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Parser</span> <span class="token punctuation">&#123;</span>    tokens<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Token</span><span class="token operator">></span><span class="token punctuation">,</span>    pos<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">Parser</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span
class="token function-definition function">new</span><span class="token punctuation">(</span>tokens<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Token</span><span class="token operator">></span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span> tokens<span class="token punctuation">,</span> pos<span class="token punctuation">:</span> <span class="token number">0</span> <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">parse_statement</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Statement</span><span class="token punctuation">,</span> <span class="token class-name">ParserError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">match</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">peek_token</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Select</span> <span class="token operator">=></span> <span class="token 
keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_select</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">map</span><span class="token punctuation">(</span><span class="token class-name">Statement</span><span class="token punctuation">::</span><span class="token class-name">Select</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Insert</span> <span class="token operator">=></span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_insert</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">map</span><span class="token punctuation">(</span><span class="token class-name">Statement</span><span class="token punctuation">::</span><span class="token class-name">Insert</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Update</span> <span class="token operator">=></span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_update</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">map</span><span class="token punctuation">(</span><span class="token class-name">Statement</span><span class="token punctuation">::</span><span class="token class-name">Update</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token 
class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Delete</span> <span class="token operator">=></span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_delete</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">map</span><span class="token punctuation">(</span><span class="token class-name">Statement</span><span class="token punctuation">::</span><span class="token class-name">Delete</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Create</span> <span class="token operator">=></span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_create</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">map</span><span class="token punctuation">(</span><span class="token class-name">Statement</span><span class="token punctuation">::</span><span class="token class-name">Create</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            _ <span class="token operator">=></span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">ParserError</span><span class="token punctuation">::</span><span class="token class-name">UnexpectedToken</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">peek_token</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token 
function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">fn</span> <span class="token function-definition function">parse_select</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">SelectStatement</span><span class="token punctuation">,</span> <span class="token class-name">ParserError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Select</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// DISTINCT</span>        <span class="token keyword">let</span> distinct <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Distinct</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// 
Projections</span>        <span class="token keyword">let</span> projections <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_projection_list</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// FROM clause</span>        <span class="token keyword">let</span> from <span class="token operator">=</span> <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">From</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_table_with_joins</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">None</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>                <span class="token comment">// WHERE clause</span>        <span class="token keyword">let</span> where_clause <span class="token operator">=</span> <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span 
class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Where</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_expression</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">None</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>                <span class="token comment">// GROUP BY</span>        <span class="token keyword">let</span> group_by <span class="token operator">=</span> <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Group</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">By</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>            <span class="token keyword">self</span><span class="token 
punctuation">.</span><span class="token function">parse_comma_separated</span><span class="token punctuation">(</span><span class="token class-name">Parser</span><span class="token punctuation">::</span>parse_expression<span class="token punctuation">)</span><span class="token operator">?</span>        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>                <span class="token comment">// HAVING</span>        <span class="token keyword">let</span> having <span class="token operator">=</span> <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Having</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_expression</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">None</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>                
<span class="token comment">// ORDER BY</span>        <span class="token keyword">let</span> order_by <span class="token operator">=</span> <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Order</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">By</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>            <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_order_by_list</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span>        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>                <span class="token comment">// LIMIT</span>        <span class="token keyword">let</span> limit <span class="token operator">=</span> <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token 
function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Limit</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_expression</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">None</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>                <span class="token comment">// OFFSET</span>        <span class="token keyword">let</span> offset <span class="token operator">=</span> <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Offset</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_expression</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> 
<span class="token punctuation">&#123;</span>            <span class="token class-name">None</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">SelectStatement</span> <span class="token punctuation">&#123;</span>            distinct<span class="token punctuation">,</span>            projections<span class="token punctuation">,</span>            from<span class="token punctuation">,</span>            where_clause<span class="token punctuation">,</span>            group_by<span class="token punctuation">,</span>            having<span class="token punctuation">,</span>            order_by<span class="token punctuation">,</span>            limit<span class="token punctuation">,</span>            offset<span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="帶優先級的運算式解析">帶優先級的運算式解析</h3><p><strong>挑戰：</strong> <code>1 + 2 * 3</code> 應該解析為 <code>1 + (2 * 3)</code>，而不是 <code>(1 + 2) * 3</code>。</p><p><strong>解決方案：</strong> Pratt 解析（優先級爬升）。</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/sql_parser/parser.rs</span><span class="token keyword">impl</span> <span class="token class-name">Parser</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">parse_expression</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span 
class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token punctuation">,</span> <span class="token class-name">ParserError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_precedence</span><span class="token punctuation">(</span><span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">Lowest</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">fn</span> <span class="token function-definition function">parse_precedence</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">,</span> min_precedence<span class="token punctuation">:</span> <span class="token class-name">Precedence</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token punctuation">,</span> <span class="token class-name">ParserError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Parse left side (prefix)</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> left <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_prefix</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token 
keyword">loop</span> <span class="token punctuation">&#123;</span>            <span class="token comment">// Get operator precedence</span>            <span class="token keyword">let</span> op_precedence <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">get_operator_precedence</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                        <span class="token comment">// Stop if operator has lower precedence</span>            <span class="token keyword">if</span> op_precedence <span class="token operator">&lt;</span> min_precedence <span class="token punctuation">&#123;</span>                <span class="token keyword">break</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>                        <span class="token comment">// Parse operator and right side</span>            left <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_infix</span><span class="token punctuation">(</span>left<span class="token punctuation">,</span> op_precedence<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span>left<span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">fn</span> <span class="token function-definition function">parse_prefix</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token 
punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token punctuation">,</span> <span class="token class-name">ParserError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">match</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">peek_token</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Identifier</span><span class="token punctuation">(</span>name<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// Check for compound identifier (table.column)</span>                <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Dot</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                    <span class="token keyword">if</span> <span class="token keyword">let</span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Identifier</span><span class="token 
punctuation">(</span>col<span class="token punctuation">)</span> <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">next_token</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span> <span class="token punctuation">&#123;</span>                        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">CompoundIdentifier</span><span class="token punctuation">(</span><span class="token macro property">vec!</span><span class="token punctuation">[</span>                            <span class="token class-name">Ident</span> <span class="token punctuation">&#123;</span> value<span class="token punctuation">:</span> name<span class="token punctuation">,</span> quote_style<span class="token punctuation">:</span> <span class="token class-name">None</span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>                            <span class="token class-name">Ident</span> <span class="token punctuation">&#123;</span> value<span class="token punctuation">:</span> col<span class="token punctuation">,</span> quote_style<span class="token punctuation">:</span> <span class="token class-name">None</span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>                        <span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">)</span>                    <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>                        <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">ParserError</span><span 
class="token punctuation">::</span><span class="token class-name">ExpectedIdentifier</span><span class="token punctuation">)</span>                    <span class="token punctuation">&#125;</span>                <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>                    <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">Identifier</span><span class="token punctuation">(</span><span class="token class-name">Ident</span> <span class="token punctuation">&#123;</span> value<span class="token punctuation">:</span> name<span class="token punctuation">,</span> quote_style<span class="token punctuation">:</span> <span class="token class-name">None</span> <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">)</span>                <span class="token punctuation">&#125;</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Integer</span><span class="token punctuation">(</span>n<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">LiteralNumber</span><span class="token punctuation">(</span>n<span class="token 
punctuation">)</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Float</span><span class="token punctuation">(</span>n<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">LiteralFloat</span><span class="token punctuation">(</span>n<span class="token punctuation">)</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">String</span><span class="token punctuation">(</span>s<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">LiteralString</span><span class="token punctuation">(</span>s<span class="token punctuation">)</span><span class="token 
punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">True</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">LiteralBoolean</span><span class="token punctuation">(</span><span class="token boolean">true</span><span class="token punctuation">)</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">False</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">LiteralBoolean</span><span class="token punctuation">(</span><span class="token boolean">false</span><span class="token punctuation">)</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token 
class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Null</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">Null</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Minus</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">let</span> expr <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_precedence</span><span class="token punctuation">(</span><span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">Unary</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token 
class-name">UnaryOp</span> <span class="token punctuation">&#123;</span>                    op<span class="token punctuation">:</span> <span class="token class-name">UnaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Minus</span><span class="token punctuation">,</span>                    expr<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>expr<span class="token punctuation">)</span><span class="token punctuation">,</span>                <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Not</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">let</span> expr <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_precedence</span><span class="token punctuation">(</span><span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">Unary</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token 
class-name">UnaryOp</span> <span class="token punctuation">&#123;</span>                    op<span class="token punctuation">:</span> <span class="token class-name">UnaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Not</span><span class="token punctuation">,</span>                    expr<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>expr<span class="token punctuation">)</span><span class="token punctuation">,</span>                <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">LParen</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">let</span> expr <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_expression</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">RParen</span><span class="token 
punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">Nested</span><span class="token punctuation">(</span><span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>expr<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Star</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">Identifier</span><span class="token punctuation">(</span><span class="token class-name">Ident</span> <span class="token punctuation">&#123;</span> value<span class="token punctuation">:</span> <span class="token string">"*"</span><span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> quote_style<span class="token punctuation">:</span> <span class="token class-name">None</span> <span class="token punctuation">&#125;</span><span 
class="token punctuation">)</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            _ <span class="token operator">=></span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">ParserError</span><span class="token punctuation">::</span><span class="token class-name">UnexpectedToken</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">peek_token</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">fn</span> <span class="token function-definition function">parse_infix</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">,</span> left<span class="token punctuation">:</span> <span class="token class-name">Expression</span><span class="token punctuation">,</span> precedence<span class="token punctuation">:</span> <span class="token class-name">Precedence</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token punctuation">,</span> <span class="token class-name">ParserError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token 
keyword">let</span> op_token <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">next_token</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token keyword">match</span> op_token <span class="token punctuation">&#123;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Plus</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">let</span> right <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_precedence</span><span class="token punctuation">(</span>precedence<span class="token punctuation">.</span><span class="token function">next</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span>                    left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>left<span class="token punctuation">)</span><span class="token punctuation">,</span>                    op<span class="token punctuation">:</span> <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span 
class="token class-name">Plus</span><span class="token punctuation">,</span>                    right<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>right<span class="token punctuation">)</span><span class="token punctuation">,</span>                <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Minus</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">let</span> right <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_precedence</span><span class="token punctuation">(</span>precedence<span class="token punctuation">.</span><span class="token function">next</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span>                    left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>left<span class="token punctuation">)</span><span class="token punctuation">,</span>                    op<span class="token punctuation">:</span> <span class="token 
class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Minus</span><span class="token punctuation">,</span>                    right<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>right<span class="token punctuation">)</span><span class="token punctuation">,</span>                <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Star</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">let</span> right <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_precedence</span><span class="token punctuation">(</span>precedence<span class="token punctuation">.</span><span class="token function">next</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span>                    left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>left<span class="token punctuation">)</span><span class="token punctuation">,</span>      
              op<span class="token punctuation">:</span> <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Multiply</span><span class="token punctuation">,</span>                    right<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>right<span class="token punctuation">)</span><span class="token punctuation">,</span>                <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Slash</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">let</span> right <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_precedence</span><span class="token punctuation">(</span>precedence<span class="token punctuation">.</span><span class="token function">next</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span>                    left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>left<span 
class="token punctuation">)</span><span class="token punctuation">,</span>                    op<span class="token punctuation">:</span> <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Divide</span><span class="token punctuation">,</span>                    right<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>right<span class="token punctuation">)</span><span class="token punctuation">,</span>                <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Eq</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">let</span> right <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_precedence</span><span class="token punctuation">(</span>precedence<span class="token punctuation">.</span><span class="token function">next</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span>                    left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span 
class="token function">new</span><span class="token punctuation">(</span>left<span class="token punctuation">)</span><span class="token punctuation">,</span>                    op<span class="token punctuation">:</span> <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Eq</span><span class="token punctuation">,</span>                    right<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>right<span class="token punctuation">)</span><span class="token punctuation">,</span>                <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">And</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">let</span> right <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_precedence</span><span class="token punctuation">(</span>precedence<span class="token punctuation">.</span><span class="token function">next</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span>                    left<span class="token punctuation">:</span> <span 
class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>left<span class="token punctuation">)</span><span class="token punctuation">,</span>                    op<span class="token punctuation">:</span> <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">And</span><span class="token punctuation">,</span>                    right<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>right<span class="token punctuation">)</span><span class="token punctuation">,</span>                <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Or</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">let</span> right <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_precedence</span><span class="token punctuation">(</span>precedence<span class="token punctuation">.</span><span class="token function">next</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">BinaryOp</span> <span class="token 
punctuation">&#123;</span>                    left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>left<span class="token punctuation">)</span><span class="token punctuation">,</span>                    op<span class="token punctuation">:</span> <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Or</span><span class="token punctuation">,</span>                    right<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>right<span class="token punctuation">)</span><span class="token punctuation">,</span>                <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token comment">// ... 
more operators</span>            _ <span class="token operator">=></span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">ParserError</span><span class="token punctuation">::</span><span class="token class-name">UnexpectedToken</span><span class="token punctuation">(</span>op_token<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span><span class="token comment">// Precedence levels</span><span class="token attribute attr-name">#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">Precedence</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">Lowest</span><span class="token punctuation">,</span>    <span class="token class-name">Or</span><span class="token punctuation">,</span>           <span class="token comment">// OR</span>    <span class="token class-name">And</span><span class="token punctuation">,</span>          <span class="token comment">// AND</span>    <span class="token class-name">Comparison</span><span class="token punctuation">,</span>   <span class="token comment">// =, &lt;, >, &lt;=, >=, &lt;></span>    <span class="token class-name">Concat</span><span class="token punctuation">,</span>       <span class="token comment">// ||</span>    <span class="token class-name">AddSub</span><span class="token punctuation">,</span>       <span class="token comment">// +, -</span>    <span class="token class-name">MulDiv</span><span class="token punctuation">,</span>       <span class="token comment">// *, /</span>    <span class="token class-name">Unary</span><span class="token punctuation">,</span>        <span class="token 
comment">// NOT, -</span>    <span class="token class-name">Exponent</span><span class="token punctuation">,</span>     <span class="token comment">// ^</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">Precedence</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">next</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Precedence</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">match</span> <span class="token keyword">self</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">Lowest</span> <span class="token operator">=></span> <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">Or</span><span class="token punctuation">,</span>            <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">Or</span> <span class="token operator">=></span> <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">And</span><span class="token punctuation">,</span>            <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">And</span> <span class="token operator">=></span> <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">Comparison</span><span class="token punctuation">,</span>            <span class="token class-name">Precedence</span><span 
class="token punctuation">::</span><span class="token class-name">Comparison</span> <span class="token operator">=></span> <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">Concat</span><span class="token punctuation">,</span>            <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">Concat</span> <span class="token operator">=></span> <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">AddSub</span><span class="token punctuation">,</span>            <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">AddSub</span> <span class="token operator">=></span> <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">MulDiv</span><span class="token punctuation">,</span>            <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">MulDiv</span> <span class="token operator">=></span> <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">Unary</span><span class="token punctuation">,</span>            <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">Unary</span> <span class="token operator">=></span> <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">Exponent</span><span class="token punctuation">,</span>            <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">Exponent</span> <span class="token operator">=></span> <span class="token class-name">Precedence</span><span class="token 
::</span><span class="token class-name">Exponent">
punctuation">::</span><span class="token class-name">Exponent</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="解析-DDL：CREATE-TABLE">Parsing DDL: CREATE TABLE</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/sql_parser/parser.rs</span><span class="token keyword">impl</span> <span class="token class-name">Parser</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">parse_create</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Statement</span><span class="token punctuation">,</span> <span class="token class-name">ParserError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Create</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token 
class-name">Table</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_create_table</span><span class="token punctuation">(</span><span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Index</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_create_index</span><span class="token punctuation">(</span><span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">ParserError</span><span class="token punctuation">::</span><span class="token class-name">ExpectedTableOrIndex</span><span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">fn</span> <span class="token function-definition function">parse_create_table</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token 
class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Statement</span><span class="token punctuation">,</span> <span class="token class-name">ParserError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> if_not_exists <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">If</span><span class="token punctuation">)</span> <span class="token operator">&amp;&amp;</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Not</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>            <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Exists</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>            <span class="token boolean">true</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>                <span class="token keyword">let</span> name <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token 
function">parse_object_name</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">LParen</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token keyword">let</span> <span class="token keyword">mut</span> columns <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> constraints <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">loop</span> <span class="token punctuation">&#123;</span>            <span class="token comment">// Check for constraint</span>            <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Constraint</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                <span 
class="token keyword">let</span> constraint <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_table_constraint</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                constraints<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span>constraint<span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Primary</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Key</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">LParen</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>               
 <span class="token keyword">let</span> cols <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_comma_separated_identifiers</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">RParen</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                constraints<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token class-name">TableConstraint</span><span class="token punctuation">::</span><span class="token class-name">PrimaryKey</span> <span class="token punctuation">&#123;</span> columns<span class="token punctuation">:</span> cols <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>                <span class="token comment">// Column definition</span>                <span class="token keyword">let</span> column <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_column_def</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                columns<span class="token 
punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span>column<span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>                        <span class="token keyword">if</span> <span class="token operator">!</span><span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Comma</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">break</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">RParen</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Statement</span><span class="token punctuation">::</span><span class="token class-name">CreateTable</span><span class="token punctuation">(</span><span class="token class-name">CreateTableStatement</span> <span class="token punctuation">&#123;</span>            name<span class="token punctuation">,</span>            columns<span class="token punctuation">,</span>            constraints<span class="token punctuation">,</span>            if_not_exists<span class="token punctuation">,</span>        <span 
class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">fn</span> <span class="token function-definition function">parse_column_def</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">ColumnDef</span><span class="token punctuation">,</span> <span class="token class-name">ParserError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> name <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_identifier</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> data_type <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_data_type</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token keyword">let</span> <span class="token keyword">mut</span> options <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span 
class="token keyword">loop</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Not</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Null</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                options<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token class-name">ColumnOption</span><span class="token punctuation">::</span><span class="token class-name">NotNull</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Null</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                options<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token 
class-name">ColumnOption</span><span class="token punctuation">::</span><span class="token class-name">Null</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Default</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">let</span> expr <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_expression</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                options<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token class-name">ColumnOption</span><span class="token punctuation">::</span><span class="token class-name">Default</span><span class="token punctuation">(</span>expr<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Primary</span><span class="token 
punctuation">)</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Key</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                options<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token class-name">ColumnOption</span><span class="token punctuation">::</span><span class="token class-name">PrimaryKey</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Unique</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                options<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token class-name">ColumnOption</span><span class="token punctuation">::</span><span class="token class-name">Unique</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span 
class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">References</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">let</span> table <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_object_name</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">LParen</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token keyword">let</span> column <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_identifier</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">RParen</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                options<span class="token punctuation">.</span><span class="token function">push</span><span 
class="token punctuation">(</span><span class="token class-name">ColumnOption</span><span class="token punctuation">::</span><span class="token class-name">References</span> <span class="token punctuation">&#123;</span> table<span class="token punctuation">,</span> column <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">break</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">ColumnDef</span> <span class="token punctuation">&#123;</span>            name<span class="token punctuation">,</span>            data_type<span class="token punctuation">,</span>            options<span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">fn</span> <span class="token function-definition function">parse_data_type</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">DataType</span><span class="token punctuation">,</span> <span class="token class-name">ParserError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">match</span> <span class="token 
keyword">self</span><span class="token punctuation">.</span><span class="token function">next_token</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Identifier</span><span class="token punctuation">(</span>name<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">match</span> name<span class="token punctuation">.</span><span class="token function">to_uppercase</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">as_str</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                    <span class="token string">"BOOLEAN"</span> <span class="token operator">|</span> <span class="token string">"BOOL"</span> <span class="token operator">=></span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">DataType</span><span class="token punctuation">::</span><span class="token class-name">Boolean</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                    <span class="token string">"SMALLINT"</span> <span class="token operator">|</span> <span class="token string">"INT2"</span> <span class="token operator">=></span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">DataType</span><span class="token punctuation">::</span><span class="token class-name">SmallInt</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                    <span 
class="token string">"INTEGER"</span> <span class="token operator">|</span> <span class="token string">"INT"</span> <span class="token operator">|</span> <span class="token string">"INT4"</span> <span class="token operator">=></span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">DataType</span><span class="token punctuation">::</span><span class="token class-name">Integer</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                    <span class="token string">"BIGINT"</span> <span class="token operator">|</span> <span class="token string">"INT8"</span> <span class="token operator">=></span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">DataType</span><span class="token punctuation">::</span><span class="token class-name">BigInt</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                    <span class="token string">"REAL"</span> <span class="token operator">|</span> <span class="token string">"FLOAT4"</span> <span class="token operator">=></span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">DataType</span><span class="token punctuation">::</span><span class="token class-name">Real</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                    <span class="token string">"DOUBLE"</span> <span class="token operator">|</span> <span class="token string">"FLOAT8"</span> <span class="token operator">=></span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">DataType</span><span class="token punctuation">::</span><span class="token class-name">Double</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                    <span class="token 
string">"TEXT"</span> <span class="token operator">=></span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">DataType</span><span class="token punctuation">::</span><span class="token class-name">Text</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                    <span class="token string">"VARCHAR"</span> <span class="token operator">|</span> <span class="token string">"CHARACTER VARYING"</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                        <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">LParen</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                            <span class="token keyword">let</span> size <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_integer_literal</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                            <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">RParen</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                            <span class="token class-name">Ok</span><span class="token punctuation">(</span><span 
class="token class-name">DataType</span><span class="token punctuation">::</span><span class="token class-name">Varchar</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span>size<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">)</span>                        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>                            <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">DataType</span><span class="token punctuation">::</span><span class="token class-name">Varchar</span><span class="token punctuation">(</span><span class="token class-name">None</span><span class="token punctuation">)</span><span class="token punctuation">)</span>                        <span class="token punctuation">&#125;</span>                    <span class="token punctuation">&#125;</span>                    <span class="token string">"TIMESTAMP"</span> <span class="token operator">=></span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">DataType</span><span class="token punctuation">::</span><span class="token class-name">Timestamp</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                    <span class="token string">"DATE"</span> <span class="token operator">=></span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">DataType</span><span class="token punctuation">::</span><span class="token class-name">Date</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                    <span class="token string">"BYTEA"</span> <span class="token operator">=></span> <span class="token class-name">Ok</span><span 
class="token punctuation">(</span><span class="token class-name">DataType</span><span class="token punctuation">::</span><span class="token class-name">Bytea</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                    _ <span class="token operator">=></span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">ParserError</span><span class="token punctuation">::</span><span class="token class-name">UnknownDataType</span><span class="token punctuation">(</span>name<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                <span class="token punctuation">&#125;</span>            <span class="token punctuation">&#125;</span>            _ <span class="token operator">=></span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">ParserError</span><span class="token punctuation">::</span><span class="token class-name">ExpectedDataType</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="5-完整解析範例">5 Complete Parsing Example</h2><h3 id="解析複雜查詢">Parsing a Complex Query</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// Example usage</span><span class="token keyword">let</span> sql <span class="token operator">=</span> <span class="token string">r#"    SELECT         u.id,        u.name,        COUNT(o.id) as order_count,        SUM(o.amount) as total_amount    FROM users u    LEFT JOIN orders o ON u.id = o.user_id    WHERE u.balance > 100 AND u.created_at > '2026-01-01'    GROUP BY u.id, u.name    HAVING COUNT(o.id) > 5    ORDER BY total_amount DESC    LIMIT 10    OFFSET 5"#</span><span class="token 
punctuation">;</span><span class="token keyword">let</span> <span class="token keyword">mut</span> lexer <span class="token operator">=</span> <span class="token class-name">Lexer</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>sql<span class="token punctuation">)</span><span class="token punctuation">;</span><span class="token keyword">let</span> tokens <span class="token operator">=</span> lexer<span class="token punctuation">.</span><span class="token function">tokenize</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span><span class="token keyword">let</span> <span class="token keyword">mut</span> parser <span class="token operator">=</span> <span class="token class-name">Parser</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>tokens<span class="token punctuation">)</span><span class="token punctuation">;</span><span class="token keyword">let</span> ast <span class="token operator">=</span> parser<span class="token punctuation">.</span><span class="token function">parse_statement</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span><span class="token comment">// ast is now a SelectStatement with:</span><span class="token comment">// - 4 projections (id, name, COUNT, SUM)</span><span class="token comment">// - FROM users with LEFT JOIN orders</span><span class="token comment">// - WHERE clause with AND</span><span class="token comment">// - GROUP BY 2 columns</span><span class="token comment">// - HAVING clause</span><span class="token comment">// - ORDER BY with DESC</span><span class="token comment">// - LIMIT and OFFSET</span></code></pre><p><strong>Resulting AST (simplified):</strong></p><pre 
class="language-rust" data-language="rust"><code class="language-rust"><span class="token class-name">SelectStatement</span> <span class="token punctuation">&#123;</span>    distinct<span class="token punctuation">:</span> <span class="token boolean">false</span><span class="token punctuation">,</span>    projections<span class="token punctuation">:</span> <span class="token punctuation">[</span>        <span class="token class-name">Identifier</span><span class="token punctuation">(</span><span class="token string">"u.id"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token class-name">Identifier</span><span class="token punctuation">(</span><span class="token string">"u.name"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token class-name">ExprWithAlias</span><span class="token punctuation">(</span><span class="token class-name">Function</span><span class="token punctuation">(</span><span class="token string">"COUNT"</span><span class="token punctuation">,</span> <span class="token punctuation">[</span><span class="token class-name">Identifier</span><span class="token punctuation">(</span><span class="token string">"o.id"</span><span class="token punctuation">)</span><span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token string">"order_count"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token class-name">ExprWithAlias</span><span class="token punctuation">(</span><span class="token class-name">Function</span><span class="token punctuation">(</span><span class="token string">"SUM"</span><span class="token punctuation">,</span> <span class="token punctuation">[</span><span class="token class-name">Identifier</span><span class="token punctuation">(</span><span class="token string">"o.amount"</span><span 
class="token punctuation">)</span><span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token string">"total_amount"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token punctuation">]</span><span class="token punctuation">,</span>    from<span class="token punctuation">:</span> <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">TableWithJoins</span> <span class="token punctuation">&#123;</span>        table<span class="token punctuation">:</span> <span class="token class-name">Table</span> <span class="token punctuation">&#123;</span> name<span class="token punctuation">:</span> <span class="token string">"users"</span><span class="token punctuation">,</span> alias<span class="token punctuation">:</span> <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token string">"u"</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        joins<span class="token punctuation">:</span> <span class="token punctuation">[</span>            <span class="token class-name">Join</span> <span class="token punctuation">&#123;</span>                relation<span class="token punctuation">:</span> <span class="token class-name">Table</span> <span class="token punctuation">&#123;</span> name<span class="token punctuation">:</span> <span class="token string">"orders"</span><span class="token punctuation">,</span> alias<span class="token punctuation">:</span> <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token string">"o"</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>                join_operator<span class="token punctuation">:</span> 
<span class="token class-name">LeftOuter</span><span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">]</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    where_clause<span class="token punctuation">:</span> <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span>        left<span class="token punctuation">:</span> <span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span> <span class="token class-name">Identifier</span><span class="token punctuation">(</span><span class="token string">"u.balance"</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Gt</span><span class="token punctuation">,</span> <span class="token class-name">LiteralNumber</span><span class="token punctuation">(</span><span class="token number">100</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        op<span class="token punctuation">:</span> <span class="token class-name">And</span><span class="token punctuation">,</span>        right<span class="token punctuation">:</span> <span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span> <span class="token class-name">Identifier</span><span class="token punctuation">(</span><span class="token string">"u.created_at"</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Gt</span><span class="token punctuation">,</span> <span class="token class-name">LiteralString</span><span class="token punctuation">(</span><span class="token 
string">"2026-01-01"</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    group_by<span class="token punctuation">:</span> <span class="token punctuation">[</span><span class="token class-name">Identifier</span><span class="token punctuation">(</span><span class="token string">"u.id"</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Identifier</span><span class="token punctuation">(</span><span class="token string">"u.name"</span><span class="token punctuation">)</span><span class="token punctuation">]</span><span class="token punctuation">,</span>    having<span class="token punctuation">:</span> <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span>        <span class="token class-name">Function</span><span class="token punctuation">(</span><span class="token string">"COUNT"</span><span class="token punctuation">,</span> <span class="token punctuation">[</span><span class="token class-name">Identifier</span><span class="token punctuation">(</span><span class="token string">"o.id"</span><span class="token punctuation">)</span><span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Gt</span><span class="token punctuation">,</span> <span class="token class-name">LiteralNumber</span><span class="token punctuation">(</span><span class="token number">5</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    order_by<span class="token punctuation">:</span> <span 
class="token punctuation">[</span><span class="token class-name">OrderByExpr</span> <span class="token punctuation">&#123;</span> expr<span class="token punctuation">:</span> <span class="token class-name">Identifier</span><span class="token punctuation">(</span><span class="token string">"total_amount"</span><span class="token punctuation">)</span><span class="token punctuation">,</span> asc<span class="token punctuation">:</span> <span class="token boolean">false</span> <span class="token punctuation">&#125;</span><span class="token punctuation">]</span><span class="token punctuation">,</span>    limit<span class="token punctuation">:</span> <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">LiteralNumber</span><span class="token punctuation">(</span><span class="token number">10</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    offset<span class="token punctuation">:</span> <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">LiteralNumber</span><span class="token punctuation">(</span><span class="token number">5</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="6-錯誤處理和恢復">6 Error Handling and Recovery</h2><h3 id="解析器錯誤">Parser Errors</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/sql_parser/error.rs</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">ParserError</span> <span class="token punctuation">&#123;</span>    <span class="token comment">// Lexer errors</span>    <span class="token 
class-name">LexerError</span><span class="token punctuation">(</span><span class="token class-name">LexerError</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token comment">// Syntax errors</span>    <span class="token class-name">UnexpectedToken</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">ExpectedToken</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">ExpectedIdentifier</span><span class="token punctuation">,</span>    <span class="token class-name">ExpectedDataType</span><span class="token punctuation">,</span>    <span class="token class-name">ExpectedTableOrIndex</span><span class="token punctuation">,</span>    <span class="token comment">// Wrapped error with position context (returned by parse_statement_with_recovery)</span>    <span class="token class-name">SyntaxError</span> <span class="token punctuation">&#123;</span> message<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span> position<span class="token punctuation">:</span> <span class="token keyword">usize</span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">// Semantic errors (detected during parsing)</span>    <span class="token class-name">UnknownDataType</span><span class="token punctuation">(</span><span class="token class-name">String</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">UnknownFunction</span><span class="token punctuation">(</span><span class="token class-name">String</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">AmbiguousColumn</span><span class="token punctuation">(</span><span class="token class-name">String</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token comment">// Recovery</span>    <span class="token class-name">UnexpectedEof</span><span class="token punctuation">,</span>    <span class="token class-name">UnmatchedParenthesis</span><span class="token 
punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">LexerError</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">UnterminatedString</span><span class="token punctuation">,</span>    <span class="token class-name">InvalidNumber</span><span class="token punctuation">(</span><span class="token class-name">String</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">UnexpectedChar</span><span class="token punctuation">(</span><span class="token keyword">char</span><span class="token punctuation">)</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">From</span><span class="token operator">&lt;</span><span class="token class-name">LexerError</span><span class="token operator">></span> <span class="token keyword">for</span> <span class="token class-name">ParserError</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">from</span><span class="token punctuation">(</span>err<span class="token punctuation">:</span> <span class="token class-name">LexerError</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>        <span class="token class-name">ParserError</span><span class="token punctuation">::</span><span class="token class-name">LexerError</span><span class="token punctuation">(</span>err<span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr 
/><h3 id="錯誤恢復策略">Error Recovery Strategy</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token keyword">impl</span> <span class="token class-name">Parser</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">parse_statement_with_recovery</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Statement</span><span class="token punctuation">,</span> <span class="token class-name">ParserError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> start_pos <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>pos<span class="token punctuation">;</span>                <span class="token keyword">match</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_statement</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Ok</span><span class="token punctuation">(</span>stmt<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token class-name">Ok</span><span class="token punctuation">(</span>stmt<span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token class-name">Err</span><span class="token punctuation">(</span>err<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> 
               <span class="token comment">// Try to recover: skip to next semicolon or EOF</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">recover_to_statement_boundary</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                                <span class="token comment">// Return error with context</span>                <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">ParserError</span><span class="token punctuation">::</span><span class="token class-name">SyntaxError</span> <span class="token punctuation">&#123;</span>                    message<span class="token punctuation">:</span> <span class="token macro property">format!</span><span class="token punctuation">(</span><span class="token string">"Parse error: &#123;&#125;"</span><span class="token punctuation">,</span> err<span class="token punctuation">)</span><span class="token punctuation">,</span>                    position<span class="token punctuation">:</span> start_pos<span class="token punctuation">,</span>                <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">fn</span> <span class="token function-definition function">recover_to_statement_boundary</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">while</span> <span class="token keyword">self</span><span class="token punctuation">.</span>pos <span 
class="token operator">&lt;</span> <span class="token keyword">self</span><span class="token punctuation">.</span>tokens<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">match</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">peek_token</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Semicolon</span> <span class="token operator">|</span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Eof</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                    <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                    <span class="token keyword">return</span><span class="token punctuation">;</span>                <span class="token punctuation">&#125;</span>                _ <span class="token operator">=></span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="7-用-Rust-建構的挑戰">7 Challenges of Building in 
Rust</h2><h3 id="挑戰-1：遞歸類型">Challenge 1: Recursive Types</h3><p><strong>Problem:</strong> The AST has recursive types (an Expression contains other Expressions), so the compiler cannot compute a finite size for it.</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ Doesn't compile - infinite size</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">Expression</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span>        left<span class="token punctuation">:</span> <span class="token class-name">Expression</span><span class="token punctuation">,</span>  <span class="token comment">// How big is this?</span>        op<span class="token punctuation">:</span> <span class="token class-name">BinaryOperator</span><span class="token punctuation">,</span>        right<span class="token punctuation">:</span> <span class="token class-name">Expression</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>    <span class="token comment">// ...</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Solution: Box for indirection</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ✅ Works - known size (pointer size)</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">Expression</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span>        left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>        op<span class="token punctuation">:</span> 
<span class="token class-name">BinaryOperator</span><span class="token punctuation">,</span>        right<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>    <span class="token comment">// ...</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="挑戰-2：生命週期註解">Challenge 2: Lifetime Annotations</h3><p><strong>Problem:</strong> If tokens are borrowed from the input, the parser has to hold references to them.</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ Doesn't compile - missing lifetime specifier</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Parser</span> <span class="token punctuation">&#123;</span>    tokens<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Token</span><span class="token punctuation">]</span><span class="token punctuation">,</span>  <span class="token comment">// Borrowed slice</span>    pos<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Solution: owned tokens</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ✅ Works - owns its data</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Parser</span> <span class="token punctuation">&#123;</span>    tokens<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token 
class-name">Token</span><span class="token operator">></span><span class="token punctuation">,</span>  <span class="token comment">// Owned vector</span>    pos<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Trade-off:</strong> Extra allocation, but simpler lifetimes.</p><hr /><h3 id="挑戰-3：錯誤類型複雜性">Challenge 3: Error Type Complexity</h3><p><strong>Problem:</strong> Many error variants make pattern matching unwieldy.</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ Unwieldy</span><span class="token keyword">match</span> err <span class="token punctuation">&#123;</span>    <span class="token class-name">ParserError</span><span class="token punctuation">::</span><span class="token class-name">UnexpectedToken</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Select</span><span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token punctuation">...</span> <span class="token punctuation">&#125;</span>    <span class="token class-name">ParserError</span><span class="token punctuation">::</span><span class="token class-name">UnexpectedToken</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">From</span><span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token punctuation">...</span> <span class="token punctuation">&#125;</span>    <span class="token class-name">ParserError</span><span class="token punctuation">::</span><span class="token class-name">ExpectedIdentifier</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token 
punctuation">...</span> <span class="token punctuation">&#125;</span>    <span class="token comment">// ... 50 more cases</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Solution: the Display trait and context</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ✅ Clean</span><span class="token keyword">impl</span> <span class="token namespace">std<span class="token punctuation">::</span>fmt<span class="token punctuation">::</span></span><span class="token class-name">Display</span> <span class="token keyword">for</span> <span class="token class-name">ParserError</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">fmt</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> f<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token namespace">std<span class="token punctuation">::</span>fmt<span class="token punctuation">::</span></span><span class="token class-name">Formatter</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token namespace">std<span class="token punctuation">::</span>fmt<span class="token punctuation">::</span></span><span class="token class-name">Result</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">match</span> <span class="token keyword">self</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">ParserError</span><span class="token punctuation">::</span><span class="token class-name">UnexpectedToken</span><span class="token punctuation">(</span>token<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token macro property">write!</span><span class="token punctuation">(</span>f<span 
class="token punctuation">,</span> <span class="token string">"Unexpected token: &#123;&#125;"</span><span class="token punctuation">,</span> token<span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">ParserError</span><span class="token punctuation">::</span><span class="token class-name">ExpectedIdentifier</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token macro property">write!</span><span class="token punctuation">(</span>f<span class="token punctuation">,</span> <span class="token string">"Expected identifier"</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token comment">// ...</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span><span class="token comment">// Usage</span><span class="token keyword">let</span> result <span class="token operator">=</span> parser<span class="token punctuation">.</span><span class="token function">parse_statement</span><span class="token punctuation">(</span><span class="token punctuation">)</span>    <span class="token punctuation">.</span><span class="token function">map_err</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>e<span class="token closure-punctuation punctuation">|</span></span> <span class="token macro property">eprintln!</span><span class="token punctuation">(</span><span class="token string">"Parse error at position &#123;&#125;: &#123;&#125;"</span><span class="token punctuation">,</span> parser<span class="token punctuation">.</span>pos<span class="token punctuation">,</span> e<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token 
punctuation">;</span></code></pre><hr /><h2 id="8-AI-如何加速這項工作">8 How AI Accelerated This Work</h2><h3 id="AI-做對了什麼">What AI Got Right</h3><table><thead><tr><th>Task</th><th>AI Contribution</th></tr></thead><tbody><tr><td><strong>Lexer structure</strong></td><td>Character-by-character lexing pattern</td></tr><tr><td><strong>Precedence levels</strong></td><td>Correct operator precedence ordering</td></tr><tr><td><strong>AST design</strong></td><td>Comprehensive expression variants</td></tr><tr><td><strong>Error types</strong></td><td>Good categorization of error cases</td></tr></tbody></table><hr /><h3 id="AI-做錯了什麼">What AI Got Wrong</h3><table><thead><tr><th>Issue</th><th>What Happened</th></tr></thead><tbody><tr><td><strong>Compound identifiers</strong></td><td>First draft didn't handle <code>table.column</code></td></tr><tr><td><strong>JOIN parsing</strong></td><td>Missed the distinction between ON and USING clauses</td></tr><tr><td><strong>CASE expressions</strong></td><td>Generated incomplete WHEN/THEN handling</td></tr><tr><td><strong>Precedence climbing</strong></td><td>Suggested recursive descent without precedence (wrong for expressions)</td></tr></tbody></table><p><strong>Pattern:</strong> AI handles the common cases well. Edge cases (compound identifiers, JOIN variants) needed manual refinement.</p><hr /><h3 id="範例：除錯運算式解析">Example: Debugging Expression Parsing</h3><p><strong>Question I asked AI:</strong></p><blockquote><p>“<code>1 + 2 * 3</code> parses as <code>(1 + 2) * 3</code>. Why?”</p></blockquote><p><strong>What I learned:</strong></p><ol><li>Naive recursive descent (one flat expression rule) doesn't handle precedence</li><li>Pratt parsing or precedence climbing is needed</li><li>Each operator needs a precedence level</li></ol><p><strong>Result:</strong> Implemented precedence-based parsing:</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token keyword">fn</span> <span class="token function-definition function">parse_expression</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token punctuation">,</span> <span class="token class-name">ParserError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token keyword">self</span><span 
class="token punctuation">.</span><span class="token function">parse_precedence</span><span class="token punctuation">(</span><span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">Lowest</span><span class="token punctuation">)</span><span class="token punctuation">&#125;</span><span class="token keyword">fn</span> <span class="token function-definition function">parse_precedence</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">,</span> min_precedence<span class="token punctuation">:</span> <span class="token class-name">Precedence</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token punctuation">,</span> <span class="token class-name">ParserError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token keyword">let</span> <span class="token keyword">mut</span> left <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_prefix</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token keyword">loop</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> op_precedence <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">get_operator_precedence</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token 
punctuation">;</span>        <span class="token keyword">if</span> op_precedence <span class="token operator">&lt;</span> min_precedence <span class="token punctuation">&#123;</span>            <span class="token keyword">break</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        left <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_infix</span><span class="token punctuation">(</span>left<span class="token punctuation">,</span> op_precedence<span class="token punctuation">.</span><span class="token function">next</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token punctuation">&#125;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span>left<span class="token punctuation">)</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="總結：SQL-解析器一張圖">Summary: The SQL Parser in One Diagram</h2><pre class="language-MERMAID_BASE64_610" data-language="MERMAID_BASE64_610"><code 
class="language-MERMAID_BASE64_610">Zmxvd2NoYXJ0IFRECiAgICBzdWJncmFwaCAiTGV4ZXIiCiAgICAgICAgQVtTUUwgU3RyaW5nXSAtLT4gQltDaGFyYWN0ZXIgU3RyZWFtXQogICAgICAgIEIgLS0+IENbVG9rZW4gU3RyZWFtXQogICAgZW5kCiAgICAKICAgIHN1YmdyYXBoICJQYXJzZXIiCiAgICAgICAgQyAtLT4gRFtwYXJzZV9zdGF0ZW1lbnRdCiAgICAgICAgRCAtLT4gRXtTdGF0ZW1lbnQgVHlwZT99CiAgICAgICAgRSAtLT58U0VMRUNUfCBGW3BhcnNlX3NlbGVjdF0KICAgICAgICBFIC0tPnxJTlNFUlR8IEdbcGFyc2VfaW5zZXJ0XQogICAgICAgIEUgLS0+fFVQREFURXwgSFtwYXJzZV91cGRhdGVdCiAgICAgICAgRSAtLT58REVMRVRFfCBJW3BhcnNlX2RlbGV0ZV0KICAgICAgICBFIC0tPnxDUkVBVEV8IEpbcGFyc2VfY3JlYXRlXQogICAgICAgIAogICAgICAgIEYgLS0+IEtbcGFyc2VfZXhwcmVzc2lvbl0KICAgICAgICBLIC0tPiBMW1ByYXR0IFBhcnNpbmddCiAgICAgICAgTCAtLT4gTVtFeHByZXNzaW9uIEFTVF0KICAgIGVuZAogICAgCiAgICBzdWJncmFwaCAiQVNUIgogICAgICAgIE0gLS0+IE5bU2VsZWN0U3RhdGVtZW50XQogICAgICAgIE4gLS0+IE9bUHJvamVjdGlvbnNdCiAgICAgICAgTiAtLT4gUFtGcm9tL0pvaW5zXQogICAgICAgIE4gLS0+IFFbV2hlcmVdCiAgICAgICAgTiAtLT4gUltHcm91cCBCeS9IYXZpbmddCiAgICAgICAgTiAtLT4gU1tPcmRlciBCeS9MaW1pdF0KICAgIGVuZAogICAgCiAgICBzdHlsZSBBIGZpbGw6I2UzZjJmZCxzdHJva2U6IzE5NzZkMgogICAgc3R5bGUgQyBmaWxsOiNlOGY1ZTksc3Ryb2tlOiMzODhlM2MKICAgIHN0eWxlIE4gZmlsbDojZmZmM2UwLHN0cm9rZTojZjU3YzAw</code></pre><p><strong>Key takeaways:</strong></p><table><thead><tr><th>Concept</th><th>Why It Matters</th></tr></thead><tbody><tr><td><strong>Lexer</strong></td><td>Tokenizes SQL into meaningful units</td></tr><tr><td><strong>Recursive descent</strong></td><td>Top-down parsing, one function per grammar rule</td></tr><tr><td><strong>Pratt parsing</strong></td><td>Handles operator precedence correctly</td></tr><tr><td><strong>AST design</strong></td><td>Structured representation for query planning</td></tr><tr><td><strong>Error recovery</strong></td><td>Continues parsing after an error to produce better messages</td></tr><tr><td><strong>Box for recursion</strong></td><td>Rust needs types of known size</td></tr></tbody></table><hr /><p><strong>Further reading:</strong></p><ul><li>“Crafting Interpreters” by Robert Nystrom (free online) - an excellent parser tutorial</li><li>“Programming Language Pragmatics” by Michael L. Scott - compiler design fundamentals</li><li>PostgreSQL Source: <a 
href="https://github.com/postgres/postgres/tree/master/src/backend/parser"><code>src/backend/parser/</code></a></li><li>sqlparser-rs: <a href="https://github.com/sqlparser-rs/sqlparser-rs">github.com/sqlparser-rs/sqlparser-rs</a></li><li>Vaultgres Repository: <a href="https://github.com/neoalienson/Vaultgres">github.com/neoalienson/Vaultgres</a></li></ul>]]></content>
    
    
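    <summary type="html">Part 6 of the Vaultgres journey: building a SQL parser from scratch. A deep dive into lexing, recursive descent parsing, AST design for DDL/DML/queries, and operator precedence handling.</summary>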
    
    
    
    <category term="Development" scheme="https://neo01.com/categories/Development/"/>
    
    
    <category term="PostgreSQL" scheme="https://neo01.com/tags/PostgreSQL/"/>
    
    <category term="Database Internals" scheme="https://neo01.com/tags/Database-Internals/"/>
    
    <category term="Rust" scheme="https://neo01.com/tags/Rust/"/>
    
  </entry>
  
  <entry>
    <title>Database in Rust: Comprehensive SQL Parser (DDL, DML, Queries)</title>
    <link href="https://neo01.com/2026/03/Database-Rust-SQL-Parser/"/>
    <id>https://neo01.com/2026/03/Database-Rust-SQL-Parser/</id>
    <published>2026-03-05T16:00:00.000Z</published>
    <updated>2026-03-14T03:05:49.572Z</updated>
    
    <content type="html"><![CDATA[<p>In <a href="/2026/03/Database-Rust-Wire-Protocol-Result-Set/">Part 5</a>, we built the PostgreSQL wire protocol. Clients can now connect and send queries. But there’s a problem.</p><p><strong>We receive SQL strings. Now what?</strong></p><pre class="language-none"><code class="language-none">Client: &quot;SELECT id, name FROM users WHERE balance &gt; 100 ORDER BY name LIMIT 10&quot;Server: ???</code></pre><p>We could use an existing parser (<code>sqlparser-rs</code>, <code>peg</code>, etc.). But building our own teaches us how SQL actually works.</p><p>Today: building a comprehensive SQL parser in Rust—from lexer to AST—for DDL, DML, and queries.</p><hr /><h2 id="1-Why-Build-a-SQL-Parser">1 Why Build a SQL Parser?</h2><h3 id="The-Alternatives">The Alternatives</h3><table><thead><tr><th>Approach</th><th>Pros</th><th>Cons</th></tr></thead><tbody><tr><td><strong>sqlparser-rs</strong></td><td>Production-ready, PostgreSQL dialect</td><td>Black box, hard to customize</td></tr><tr><td><strong>peg/lalrpop</strong></td><td>Generator handles grammar</td><td>Learning curve, debug complexity</td></tr><tr><td><strong>Hand-written</strong></td><td>Full control, educational</td><td>Time-consuming, error-prone</td></tr></tbody></table><p><strong>Vaultgres choice:</strong> Hand-written recursive descent parser.</p><p><strong>Why?</strong></p><table><thead><tr><th>Reason</th><th>Explanation</th></tr></thead><tbody><tr><td><strong>Learning</strong></td><td>Understand SQL grammar deeply</td></tr><tr><td><strong>Control</strong></td><td>Add custom extensions easily</td></tr><tr><td><strong>Error messages</strong></td><td>Better than generator defaults</td></tr><tr><td><strong>Integration</strong></td><td>Direct AST → execution plan</td></tr></tbody></table><hr /><h3 id="Parser-Architecture">Parser Architecture</h3><pre class="language-none"><code class="language-none">┌─────────────────────────────────────────────────────────────┐│                   
 SQL Parser Pipeline                       │├─────────────────────────────────────────────────────────────┤│                                                              ││  SQL String                                                  ││     │                                                        ││     ▼                                                        ││  ┌─────────────┐                                            ││  │   Lexer     │  Tokenize: &quot;SELECT&quot; → Token::Select        ││  │ (Tokenizer) │  &quot;123&quot; → Token::Integer(123)               ││  └──────┬──────┘                                            ││         │                                                    ││         ▼                                                    ││  ┌─────────────┐                                            ││  │   Parser    │  Recursive descent:                        ││  │             │  parse_statement() → parse_select() → ...  ││  └──────┬──────┘                                            ││         │                                                    ││         ▼                                                    ││  ┌─────────────┐                                            ││  │     AST     │  Structured representation:                ││  │             │  SelectStatement &#123; projections, from, ...&#125; ││  └─────────────┘                                            ││                                                              │└─────────────────────────────────────────────────────────────┘</code></pre><hr /><h2 id="2-Lexer-Tokenizing-SQL">2 Lexer: Tokenizing SQL</h2><h3 id="Token-Types">Token Types</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/sql_parser/lexer.rs</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition 
class-name">Token</span> <span class="token punctuation">&#123;</span>    <span class="token comment">// Keywords</span>    <span class="token class-name">Select</span><span class="token punctuation">,</span>    <span class="token class-name">From</span><span class="token punctuation">,</span>    <span class="token class-name">Where</span><span class="token punctuation">,</span>    <span class="token class-name">Insert</span><span class="token punctuation">,</span>    <span class="token class-name">Into</span><span class="token punctuation">,</span>    <span class="token class-name">Values</span><span class="token punctuation">,</span>    <span class="token class-name">Update</span><span class="token punctuation">,</span>    <span class="token class-name">Set</span><span class="token punctuation">,</span>    <span class="token class-name">Delete</span><span class="token punctuation">,</span>    <span class="token class-name">Create</span><span class="token punctuation">,</span>    <span class="token class-name">Table</span><span class="token punctuation">,</span>    <span class="token class-name">Index</span><span class="token punctuation">,</span>    <span class="token class-name">On</span><span class="token punctuation">,</span>    <span class="token class-name">As</span><span class="token punctuation">,</span>    <span class="token class-name">And</span><span class="token punctuation">,</span>    <span class="token class-name">Or</span><span class="token punctuation">,</span>    <span class="token class-name">Not</span><span class="token punctuation">,</span>    <span class="token class-name">Null</span><span class="token punctuation">,</span>    <span class="token class-name">True</span><span class="token punctuation">,</span>    <span class="token class-name">False</span><span class="token punctuation">,</span>    <span class="token class-name">Primary</span><span class="token punctuation">,</span>    <span class="token class-name">Key</span><span class="token 
punctuation">,</span>    <span class="token class-name">References</span><span class="token punctuation">,</span>    <span class="token class-name">Default</span><span class="token punctuation">,</span>    <span class="token class-name">Unique</span><span class="token punctuation">,</span>    <span class="token class-name">Check</span><span class="token punctuation">,</span>    <span class="token class-name">Constraint</span><span class="token punctuation">,</span>    <span class="token class-name">Join</span><span class="token punctuation">,</span>    <span class="token class-name">Inner</span><span class="token punctuation">,</span>    <span class="token class-name">Left</span><span class="token punctuation">,</span>    <span class="token class-name">Right</span><span class="token punctuation">,</span>    <span class="token class-name">Outer</span><span class="token punctuation">,</span>    <span class="token class-name">Order</span><span class="token punctuation">,</span>    <span class="token class-name">By</span><span class="token punctuation">,</span>    <span class="token class-name">Asc</span><span class="token punctuation">,</span>    <span class="token class-name">Desc</span><span class="token punctuation">,</span>    <span class="token class-name">Group</span><span class="token punctuation">,</span>    <span class="token class-name">Having</span><span class="token punctuation">,</span>    <span class="token class-name">Limit</span><span class="token punctuation">,</span>    <span class="token class-name">Offset</span><span class="token punctuation">,</span>    <span class="token class-name">Distinct</span><span class="token punctuation">,</span>        <span class="token comment">// Literals</span>    <span class="token class-name">Integer</span><span class="token punctuation">(</span><span class="token keyword">i64</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">Float</span><span 
class="token punctuation">(</span><span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">String</span><span class="token punctuation">(</span><span class="token class-name">String</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">Identifier</span><span class="token punctuation">(</span><span class="token class-name">String</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token comment">// Operators</span>    <span class="token class-name">Plus</span><span class="token punctuation">,</span>    <span class="token class-name">Minus</span><span class="token punctuation">,</span>    <span class="token class-name">Star</span><span class="token punctuation">,</span>    <span class="token class-name">Slash</span><span class="token punctuation">,</span>    <span class="token class-name">Eq</span><span class="token punctuation">,</span>    <span class="token class-name">Neq</span><span class="token punctuation">,</span>    <span class="token class-name">Lt</span><span class="token punctuation">,</span>    <span class="token class-name">Lte</span><span class="token punctuation">,</span>    <span class="token class-name">Gt</span><span class="token punctuation">,</span>    <span class="token class-name">Gte</span><span class="token punctuation">,</span>    <span class="token class-name">Arrow</span><span class="token punctuation">,</span>      <span class="token comment">// -></span>    <span class="token class-name">DoubleArrow</span><span class="token punctuation">,</span> <span class="token comment">// ->></span>        <span class="token comment">// Punctuation</span>    <span class="token class-name">Comma</span><span class="token punctuation">,</span>    <span class="token class-name">Semicolon</span><span class="token punctuation">,</span>    
<span class="token class-name">LParen</span><span class="token punctuation">,</span>    <span class="token class-name">RParen</span><span class="token punctuation">,</span>    <span class="token class-name">Dot</span><span class="token punctuation">,</span>        <span class="token comment">// Special</span>    <span class="token class-name">Eof</span><span class="token punctuation">,</span>    <span class="token class-name">Unknown</span><span class="token punctuation">(</span><span class="token keyword">char</span><span class="token punctuation">)</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="Lexer-Implementation">Lexer Implementation</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/sql_parser/lexer.rs</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Lexer</span> <span class="token punctuation">&#123;</span>    input<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">char</span><span class="token operator">></span><span class="token punctuation">,</span>    pos<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">Lexer</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">new</span><span class="token punctuation">(</span>input<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span 
class="token keyword">Self</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>            input<span class="token punctuation">:</span> input<span class="token punctuation">.</span><span class="token function">chars</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            pos<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">tokenize</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Token</span><span class="token operator">></span><span class="token punctuation">,</span> <span class="token class-name">LexerError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> tokens <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token 
punctuation">;</span>                <span class="token keyword">while</span> <span class="token keyword">let</span> <span class="token class-name">Some</span><span class="token punctuation">(</span>token<span class="token punctuation">)</span> <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">next_token</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">if</span> token <span class="token operator">==</span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Eof</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">break</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>            tokens<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span>token<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                tokens<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Eof</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span>tokens<span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">fn</span> <span class="token function-definition function">next_token</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span 
class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Token</span><span class="token operator">></span><span class="token punctuation">,</span> <span class="token class-name">LexerError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">skip_whitespace</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span>pos <span class="token operator">>=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>input<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">return</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Eof</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token keyword">let</span> ch <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span 
class="token function">current_char</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">match</span> ch <span class="token punctuation">&#123;</span>            <span class="token comment">// Single-character tokens</span>            <span class="token char">'+'</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Plus</span><span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span>            <span class="token char">'-'</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Minus</span><span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span>            <span class="token char">'*'</span> <span class="token 
operator">=></span> <span class="token punctuation">&#123;</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Star</span><span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span>            <span class="token char">'/'</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Slash</span><span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span>            <span class="token char">','</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token 
class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Comma</span><span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span>            <span class="token char">';'</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Semicolon</span><span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span>            <span class="token char">'('</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">LParen</span><span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span>            <span class="token char">')'</span> <span class="token operator">=></span> <span 
class="token punctuation">&#123;</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">RParen</span><span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span>            <span class="token char">'.'</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Dot</span><span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span>                        <span class="token comment">// Multi-character operators</span>            <span class="token char">'='</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token class-name">Ok</span><span 
class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Eq</span><span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span>            <span class="token char">'&lt;'</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">match</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">current_char</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                    <span class="token char">'='</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Lte</span><span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span>                    <span class="token char">'>'</span> <span class="token operator">=></span> <span class="token 
punctuation">&#123;</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Neq</span><span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span>                    _ <span class="token operator">=></span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Lt</span><span class="token punctuation">)</span><span class="token punctuation">)</span>                <span class="token punctuation">&#125;</span>            <span class="token punctuation">&#125;</span>            <span class="token char">'>'</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">match</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">current_char</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                    <span class="token char">'='</span> <span 
class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Gte</span><span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span>                    _ <span class="token operator">=></span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Gt</span><span class="token punctuation">)</span><span class="token punctuation">)</span>                <span class="token punctuation">&#125;</span>            <span class="token punctuation">&#125;</span>            <span class="token char">'!'</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">current_char</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">==</span> <span class="token 
char">'='</span> <span class="token punctuation">&#123;</span>                    <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                    <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Neq</span><span class="token punctuation">)</span><span class="token punctuation">)</span>                <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>                    <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">LexerError</span><span class="token punctuation">::</span><span class="token class-name">UnexpectedChar</span><span class="token punctuation">(</span><span class="token char">'!'</span><span class="token punctuation">)</span><span class="token punctuation">)</span>                <span class="token punctuation">&#125;</span>            <span class="token punctuation">&#125;</span>                        <span class="token comment">// String literals</span>            <span class="token char">'\''</span> <span class="token operator">=></span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">read_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                        <span class="token comment">// Identifiers and keywords</span>            ch <span class="token keyword">if</span> ch<span class="token punctuation">.</span><span class="token 
function">is_alphabetic</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">||</span> ch <span class="token operator">==</span> <span class="token char">'_'</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">read_identifier</span><span class="token punctuation">(</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>                        <span class="token comment">// Numbers</span>            ch <span class="token keyword">if</span> ch<span class="token punctuation">.</span><span class="token function">is_ascii_digit</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">read_number</span><span class="token punctuation">(</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>                        <span class="token comment">// Unknown</span>            ch <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Unknown</span><span 
class="token punctuation">(</span>ch<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">fn</span> <span class="token function-definition function">read_string</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Token</span><span class="token operator">></span><span class="token punctuation">,</span> <span class="token class-name">LexerError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token comment">// Skip opening quote</span>                <span class="token keyword">let</span> <span class="token keyword">mut</span> value <span class="token operator">=</span> <span class="token class-name">String</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">while</span> <span class="token keyword">self</span><span class="token punctuation">.</span>pos <span class="token operator">&lt;</span> <span class="token keyword">self</span><span 
class="token punctuation">.</span>input<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">&amp;&amp;</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">current_char</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">!=</span> <span class="token char">'\''</span> <span class="token punctuation">&#123;</span>            value<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">current_char</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span>pos <span class="token operator">>=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>input<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">return</span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">LexerError</span><span class="token 
punctuation">::</span><span class="token class-name">UnterminatedString</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token comment">// Skip closing quote</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">String</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">fn</span> <span class="token function-definition function">read_identifier</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Token</span><span class="token operator">></span><span class="token punctuation">,</span> <span class="token class-name">LexerError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> start <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>pos<span 
class="token punctuation">;</span>                <span class="token keyword">while</span> <span class="token keyword">self</span><span class="token punctuation">.</span>pos <span class="token operator">&lt;</span> <span class="token keyword">self</span><span class="token punctuation">.</span>input<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> ch <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">current_char</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token keyword">if</span> ch<span class="token punctuation">.</span><span class="token function">is_alphanumeric</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">||</span> ch <span class="token operator">==</span> <span class="token char">'_'</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">break</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>                <span class="token keyword">let</span> value<span class="token punctuation">:</span> <span class="token class-name">String</span> <span 
class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>input<span class="token punctuation">[</span>start<span class="token punctuation">..</span><span class="token keyword">self</span><span class="token punctuation">.</span>pos<span class="token punctuation">]</span><span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// Check if it's a keyword</span>        <span class="token keyword">let</span> token <span class="token operator">=</span> <span class="token keyword">match</span> value<span class="token punctuation">.</span><span class="token function">to_uppercase</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">as_str</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token string">"SELECT"</span> <span class="token operator">=></span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Select</span><span class="token punctuation">,</span>            <span class="token string">"FROM"</span> <span class="token operator">=></span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">From</span><span class="token punctuation">,</span>            <span class="token string">"WHERE"</span> <span class="token operator">=></span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token 
class-name">Where</span><span class="token punctuation">,</span>            <span class="token string">"INSERT"</span> <span class="token operator">=></span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Insert</span><span class="token punctuation">,</span>            <span class="token string">"UPDATE"</span> <span class="token operator">=></span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Update</span><span class="token punctuation">,</span>            <span class="token string">"DELETE"</span> <span class="token operator">=></span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Delete</span><span class="token punctuation">,</span>            <span class="token string">"CREATE"</span> <span class="token operator">=></span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Create</span><span class="token punctuation">,</span>            <span class="token string">"TABLE"</span> <span class="token operator">=></span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Table</span><span class="token punctuation">,</span>            <span class="token string">"INDEX"</span> <span class="token operator">=></span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Index</span><span class="token punctuation">,</span>            <span class="token string">"PRIMARY"</span> <span class="token operator">=></span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Primary</span><span class="token punctuation">,</span>            <span class="token string">"KEY"</span> <span class="token operator">=></span> <span 
class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Key</span><span class="token punctuation">,</span>            <span class="token string">"NULL"</span> <span class="token operator">=></span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Null</span><span class="token punctuation">,</span>            <span class="token string">"TRUE"</span> <span class="token operator">=></span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">True</span><span class="token punctuation">,</span>            <span class="token string">"FALSE"</span> <span class="token operator">=></span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">False</span><span class="token punctuation">,</span>            _ <span class="token operator">=></span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Identifier</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span>token<span class="token punctuation">)</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">fn</span> <span class="token function-definition function">read_number</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token 
punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Token</span><span class="token operator">></span><span class="token punctuation">,</span> <span class="token class-name">LexerError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> start <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>pos<span class="token punctuation">;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> is_float <span class="token operator">=</span> <span class="token boolean">false</span><span class="token punctuation">;</span>                <span class="token keyword">while</span> <span class="token keyword">self</span><span class="token punctuation">.</span>pos <span class="token operator">&lt;</span> <span class="token keyword">self</span><span class="token punctuation">.</span>input<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> ch <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">current_char</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token keyword">if</span> ch<span class="token punctuation">.</span><span class="token function">is_ascii_digit</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                <span class="token 
keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token keyword">if</span> ch <span class="token operator">==</span> <span class="token char">'.'</span> <span class="token operator">&amp;&amp;</span> <span class="token operator">!</span>is_float <span class="token punctuation">&#123;</span>                is_float <span class="token operator">=</span> <span class="token boolean">true</span><span class="token punctuation">;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">break</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>                <span class="token keyword">let</span> value<span class="token punctuation">:</span> <span class="token class-name">String</span> <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>input<span class="token punctuation">[</span>start<span class="token punctuation">..</span><span class="token keyword">self</span><span class="token punctuation">.</span>pos<span class="token punctuation">]</span><span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token 
punctuation">.</span><span class="token function">collect</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">if</span> is_float <span class="token punctuation">&#123;</span>            <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Float</span><span class="token punctuation">(</span>value<span class="token punctuation">.</span><span class="token function">parse</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Integer</span><span class="token punctuation">(</span>value<span class="token punctuation">.</span><span class="token function">parse</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">fn</span> <span class="token function-definition function">skip_whitespace</span><span class="token 
punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">while</span> <span class="token keyword">self</span><span class="token punctuation">.</span>pos <span class="token operator">&lt;</span> <span class="token keyword">self</span><span class="token punctuation">.</span>input<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">&amp;&amp;</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">current_char</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">is_whitespace</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">fn</span> <span class="token function-definition function">current_char</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">char</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>input<span class="token 
punctuation">[</span><span class="token keyword">self</span><span class="token punctuation">.</span>pos<span class="token punctuation">]</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">fn</span> <span class="token function-definition function">advance</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>pos <span class="token operator">+=</span> <span class="token number">1</span><span class="token punctuation">;</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="Lexer-Example">Lexer Example</h3><pre class="language-none"><code class="language-none">Input: &quot;SELECT id, name FROM users WHERE balance &gt; 100&quot;

Tokens:
[
    Select,
    Identifier(&quot;id&quot;),
    Comma,
    Identifier(&quot;name&quot;),
    From,
    Identifier(&quot;users&quot;),
    Where,
    Identifier(&quot;balance&quot;),
    Gt,
    Integer(100),
    Eof
]</code></pre><hr /><h2 id="3-AST-Abstract-Syntax-Tree">3 AST: Abstract Syntax Tree</h2><h3 id="Statement-Types">Statement Types</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/sql_parser/ast.rs</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">Statement</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">Select</span><span class="token punctuation">(</span><span class="token class-name">SelectStatement</span><span class="token punctuation">)</span><span class="token punctuation">,</span> 
   <span class="token class-name">Insert</span><span class="token punctuation">(</span><span class="token class-name">InsertStatement</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">Update</span><span class="token punctuation">(</span><span class="token class-name">UpdateStatement</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">Delete</span><span class="token punctuation">(</span><span class="token class-name">DeleteStatement</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">CreateTable</span><span class="token punctuation">(</span><span class="token class-name">CreateTableStatement</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">CreateIndex</span><span class="token punctuation">(</span><span class="token class-name">CreateIndexStatement</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">DropTable</span><span class="token punctuation">(</span><span class="token class-name">DropTableStatement</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">DropIndex</span><span class="token punctuation">(</span><span class="token class-name">DropIndexStatement</span><span class="token punctuation">)</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="SELECT-Statement">SELECT Statement</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">SelectStatement</span> 
<span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> distinct<span class="token punctuation">:</span> <span class="token keyword">bool</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> projections<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">SelectItem</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> from<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">TableWithJoins</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> where_clause<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> group_by<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> having<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> order_by<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">OrderByExpr</span><span class="token operator">></span><span class="token 
punctuation">,</span>    <span class="token keyword">pub</span> limit<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> offset<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">SelectItem</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">UnnamedExpr</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">ExprWithAlias</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">,</span> <span class="token class-name">Ident</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">Wildcard</span><span class="token punctuation">,</span>  <span class="token comment">// SELECT *</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Ident</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> value<span class="token punctuation">:</span> 
<span class="token class-name">String</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> quote_style<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token keyword">char</span><span class="token operator">></span><span class="token punctuation">,</span>  <span class="token comment">// "quoted" vs unquoted</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">TableWithJoins</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> table<span class="token punctuation">:</span> <span class="token class-name">TableFactor</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> joins<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Join</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">TableFactor</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">Table</span> <span class="token punctuation">&#123;</span> name<span class="token punctuation">:</span> <span class="token class-name">ObjectName</span><span class="token punctuation">,</span> alias<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">TableAlias</span><span class="token operator">></span> <span 
class="token punctuation">&#125;</span><span class="token punctuation">,</span>    <span class="token class-name">Subquery</span> <span class="token punctuation">&#123;</span> query<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">SelectStatement</span><span class="token operator">></span><span class="token punctuation">,</span> alias<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">TableAlias</span><span class="token operator">></span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">TableAlias</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> name<span class="token punctuation">:</span> <span class="token class-name">Ident</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> columns<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Ident</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Join</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> relation<span class="token punctuation">:</span> <span class="token class-name">TableFactor</span><span class="token 
punctuation">,</span>    <span class="token keyword">pub</span> join_operator<span class="token punctuation">:</span> <span class="token class-name">JoinOperator</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">JoinOperator</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">Inner</span><span class="token punctuation">,</span>    <span class="token class-name">LeftOuter</span><span class="token punctuation">,</span>    <span class="token class-name">RightOuter</span><span class="token punctuation">,</span>    <span class="token class-name">FullOuter</span><span class="token punctuation">,</span>    <span class="token class-name">Cross</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="DML-Statements">DML Statements</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// INSERT</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">InsertStatement</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> table<span class="token punctuation">:</span> <span class="token class-name">ObjectName</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> columns<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Ident</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> 
values<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">>></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> returning<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">SelectItem</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token comment">// UPDATE</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">UpdateStatement</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> table<span class="token punctuation">:</span> <span class="token class-name">ObjectName</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> assignments<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Assignment</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> where_clause<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> returning<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token 
class-name">SelectItem</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Assignment</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> column<span class="token punctuation">:</span> <span class="token class-name">Ident</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> value<span class="token punctuation">:</span> <span class="token class-name">Expression</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token comment">// DELETE</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">DeleteStatement</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> table<span class="token punctuation">:</span> <span class="token class-name">ObjectName</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> where_clause<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> returning<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">SelectItem</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 
id="DDL-Statements">DDL Statements</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// CREATE TABLE</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">CreateTableStatement</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> name<span class="token punctuation">:</span> <span class="token class-name">ObjectName</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> columns<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">ColumnDef</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> constraints<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">TableConstraint</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> if_not_exists<span class="token punctuation">:</span> <span class="token keyword">bool</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">ColumnDef</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> name<span class="token punctuation">:</span> <span class="token class-name">Ident</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> data_type<span class="token punctuation">:</span> <span class="token 
class-name">DataType</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> options<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">ColumnOption</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">DataType</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">Boolean</span><span class="token punctuation">,</span>    <span class="token class-name">SmallInt</span><span class="token punctuation">,</span>    <span class="token class-name">Integer</span><span class="token punctuation">,</span>    <span class="token class-name">BigInt</span><span class="token punctuation">,</span>    <span class="token class-name">Real</span><span class="token punctuation">,</span>    <span class="token class-name">Double</span><span class="token punctuation">,</span>    <span class="token class-name">Text</span><span class="token punctuation">,</span>    <span class="token class-name">Varchar</span><span class="token punctuation">(</span><span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token keyword">u32</span><span class="token operator">></span><span class="token punctuation">)</span><span class="token punctuation">,</span>  <span class="token comment">// None = VARCHAR, Some(n) = VARCHAR(n)</span>    <span class="token class-name">Timestamp</span><span class="token punctuation">,</span>    <span class="token class-name">Date</span><span class="token punctuation">,</span>    <span class="token class-name">Bytea</span><span class="token punctuation">,</span><span class="token 
punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">ColumnOption</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">NotNull</span><span class="token punctuation">,</span>    <span class="token class-name">Null</span><span class="token punctuation">,</span>    <span class="token class-name">Default</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">Unique</span><span class="token punctuation">,</span>    <span class="token class-name">PrimaryKey</span><span class="token punctuation">,</span>    <span class="token class-name">References</span> <span class="token punctuation">&#123;</span> table<span class="token punctuation">:</span> <span class="token class-name">ObjectName</span><span class="token punctuation">,</span> column<span class="token punctuation">:</span> <span class="token class-name">Ident</span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">TableConstraint</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">PrimaryKey</span> <span class="token punctuation">&#123;</span> columns<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Ident</span><span class="token operator">></span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>    
<span class="token class-name">Unique</span> <span class="token punctuation">&#123;</span> columns<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Ident</span><span class="token operator">></span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>    <span class="token class-name">Check</span> <span class="token punctuation">&#123;</span> expression<span class="token punctuation">:</span> <span class="token class-name">Expression</span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token comment">// CREATE INDEX</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">CreateIndexStatement</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> name<span class="token punctuation">:</span> <span class="token class-name">Ident</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> table<span class="token punctuation">:</span> <span class="token class-name">ObjectName</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> columns<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">OrderByExpr</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> unique<span class="token punctuation">:</span> <span class="token keyword">bool</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> if_not_exists<span class="token punctuation">:</span> <span class="token keyword">bool</span><span 
class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="Expressions-The-Heart-of-SQL">Expressions: The Heart of SQL</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/sql_parser/ast.rs</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">Expression</span> <span class="token punctuation">&#123;</span>    <span class="token comment">// Identifiers and literals</span>    <span class="token class-name">Identifier</span><span class="token punctuation">(</span><span class="token class-name">Ident</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">CompoundIdentifier</span><span class="token punctuation">(</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Ident</span><span class="token operator">></span><span class="token punctuation">)</span><span class="token punctuation">,</span>  <span class="token comment">// table.column</span>    <span class="token class-name">LiteralNumber</span><span class="token punctuation">(</span><span class="token keyword">i64</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">LiteralFloat</span><span class="token punctuation">(</span><span class="token keyword">f64</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">LiteralString</span><span class="token punctuation">(</span><span class="token class-name">String</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">LiteralBoolean</span><span class="token punctuation">(</span><span class="token 
keyword">bool</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">Null</span><span class="token punctuation">,</span>        <span class="token comment">// Operators</span>    <span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span>        left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>        op<span class="token punctuation">:</span> <span class="token class-name">BinaryOperator</span><span class="token punctuation">,</span>        right<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>    <span class="token class-name">UnaryOp</span> <span class="token punctuation">&#123;</span>        op<span class="token punctuation">:</span> <span class="token class-name">UnaryOperator</span><span class="token punctuation">,</span>        expr<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">// Function calls</span>    <span class="token class-name">Function</span> <span class="token punctuation">&#123;</span>        name<span class="token punctuation">:</span> <span class="token class-name">Ident</span><span class="token punctuation">,</span>        args<span class="token 
punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">FunctionArg</span><span class="token operator">></span><span class="token punctuation">,</span>        distinct<span class="token punctuation">:</span> <span class="token keyword">bool</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">// Subqueries</span>    <span class="token class-name">Subquery</span><span class="token punctuation">(</span><span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">SelectStatement</span><span class="token operator">></span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token comment">// CASE expressions</span>    <span class="token class-name">Case</span> <span class="token punctuation">&#123;</span>        operand<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">>></span><span class="token punctuation">,</span>        conditions<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">WhenClause</span><span class="token operator">></span><span class="token punctuation">,</span>        else_result<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">>></span><span class="token punctuation">,</span>    <span class="token 
punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">// IN, BETWEEN, LIKE</span>    <span class="token class-name">InList</span> <span class="token punctuation">&#123;</span>        expr<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>        list<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>        negated<span class="token punctuation">:</span> <span class="token keyword">bool</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>    <span class="token class-name">InSubquery</span> <span class="token punctuation">&#123;</span>        expr<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>        subquery<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">SelectStatement</span><span class="token operator">></span><span class="token punctuation">,</span>        negated<span class="token punctuation">:</span> <span class="token keyword">bool</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>    <span class="token class-name">Between</span> <span class="token punctuation">&#123;</span>        expr<span class="token punctuation">:</span> <span class="token 
class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>        low<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>        high<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>        negated<span class="token punctuation">:</span> <span class="token keyword">bool</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>    <span class="token class-name">Like</span> <span class="token punctuation">&#123;</span>        expr<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>        pattern<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>        negated<span class="token punctuation">:</span> <span class="token keyword">bool</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">// CAST</span>    <span class="token class-name">Cast</span> <span class="token punctuation">&#123;</span>        expr<span class="token punctuation">:</span> <span class="token 
class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>        data_type<span class="token punctuation">:</span> <span class="token class-name">DataType</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token comment">// Parenthesized expressions</span>    <span class="token class-name">Nested</span><span class="token punctuation">(</span><span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">)</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">BinaryOperator</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">Plus</span><span class="token punctuation">,</span>    <span class="token class-name">Minus</span><span class="token punctuation">,</span>    <span class="token class-name">Multiply</span><span class="token punctuation">,</span>    <span class="token class-name">Divide</span><span class="token punctuation">,</span>    <span class="token class-name">Eq</span><span class="token punctuation">,</span>    <span class="token class-name">Neq</span><span class="token punctuation">,</span>    <span class="token class-name">Lt</span><span class="token punctuation">,</span>    <span class="token class-name">Lte</span><span class="token punctuation">,</span>    <span class="token class-name">Gt</span><span class="token punctuation">,</span>    <span class="token class-name">Gte</span><span class="token 
punctuation">,</span>    <span class="token class-name">And</span><span class="token punctuation">,</span>    <span class="token class-name">Or</span><span class="token punctuation">,</span>    <span class="token class-name">Like</span><span class="token punctuation">,</span>    <span class="token class-name">NotLike</span><span class="token punctuation">,</span>    <span class="token class-name">Concat</span><span class="token punctuation">,</span>  <span class="token comment">// ||</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">UnaryOperator</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">Plus</span><span class="token punctuation">,</span>    <span class="token class-name">Minus</span><span class="token punctuation">,</span>    <span class="token class-name">Not</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">WhenClause</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> condition<span class="token punctuation">:</span> <span class="token class-name">Expression</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> result<span class="token punctuation">:</span> <span class="token class-name">Expression</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition 
class-name">FunctionArg</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">Named</span> <span class="token punctuation">&#123;</span> name<span class="token punctuation">:</span> <span class="token class-name">Ident</span><span class="token punctuation">,</span> arg<span class="token punctuation">:</span> <span class="token class-name">Expression</span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>    <span class="token class-name">Unnamed</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">)</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="4-Parser-Recursive-Descent">4 Parser: Recursive Descent</h2><h3 id="Parser-Structure">Parser Structure</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/sql_parser/parser.rs</span><span class="token keyword">use</span> <span class="token keyword">crate</span><span class="token module-declaration namespace"><span class="token punctuation">::</span>sql_parser<span class="token punctuation">::</span>lexer<span class="token punctuation">::</span></span><span class="token punctuation">&#123;</span><span class="token class-name">Lexer</span><span class="token punctuation">,</span> <span class="token class-name">Token</span><span class="token punctuation">&#125;</span><span class="token punctuation">;</span><span class="token keyword">use</span> <span class="token keyword">crate</span><span class="token module-declaration namespace"><span class="token punctuation">::</span>sql_parser<span class="token punctuation">::</span>ast<span class="token punctuation">::</span></span><span class="token operator">*</span><span class="token punctuation">;</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token 
type-definition class-name">Parser</span> <span class="token punctuation">&#123;</span>    tokens<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Token</span><span class="token operator">></span><span class="token punctuation">,</span>    pos<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">Parser</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">new</span><span class="token punctuation">(</span>tokens<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Token</span><span class="token operator">></span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span> tokens<span class="token punctuation">,</span> pos<span class="token punctuation">:</span> <span class="token number">0</span> <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">parse_statement</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token 
operator">&lt;</span><span class="token class-name">Statement</span><span class="token punctuation">,</span> <span class="token class-name">ParserError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">match</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">peek_token</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Select</span> <span class="token operator">=></span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_select</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">map</span><span class="token punctuation">(</span><span class="token class-name">Statement</span><span class="token punctuation">::</span><span class="token class-name">Select</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Insert</span> <span class="token operator">=></span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_insert</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">map</span><span class="token punctuation">(</span><span class="token class-name">Statement</span><span class="token punctuation">::</span><span class="token class-name">Insert</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            <span 
class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Update</span> <span class="token operator">=></span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_update</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">map</span><span class="token punctuation">(</span><span class="token class-name">Statement</span><span class="token punctuation">::</span><span class="token class-name">Update</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Delete</span> <span class="token operator">=></span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_delete</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">map</span><span class="token punctuation">(</span><span class="token class-name">Statement</span><span class="token punctuation">::</span><span class="token class-name">Delete</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Create</span> <span class="token operator">=></span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_create</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">map</span><span class="token punctuation">(</span><span class="token class-name">Statement</span><span class="token 
punctuation">::</span><span class="token class-name">Create</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            _ <span class="token operator">=></span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">ParserError</span><span class="token punctuation">::</span><span class="token class-name">UnexpectedToken</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">peek_token</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">fn</span> <span class="token function-definition function">parse_select</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">SelectStatement</span><span class="token punctuation">,</span> <span class="token class-name">ParserError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token 
class-name">Select</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// DISTINCT</span>        <span class="token keyword">let</span> distinct <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Distinct</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// Projections</span>        <span class="token keyword">let</span> projections <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_projection_list</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// FROM clause</span>        <span class="token keyword">let</span> from <span class="token operator">=</span> <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">From</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_table_with_joins</span><span class="token punctuation">(</span><span class="token 
punctuation">)</span><span class="token operator">?</span><span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">None</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>                <span class="token comment">// WHERE clause</span>        <span class="token keyword">let</span> where_clause <span class="token operator">=</span> <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Where</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_expression</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">None</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>                <span class="token comment">// GROUP BY</span>        <span class="token keyword">let</span> group_by <span class="token operator">=</span> <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token 
punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Group</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">By</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>            <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_comma_separated</span><span class="token punctuation">(</span><span class="token class-name">Parser</span><span class="token punctuation">::</span>parse_expression<span class="token punctuation">)</span><span class="token operator">?</span>        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>                <span class="token comment">// HAVING</span>        <span class="token keyword">let</span> having <span class="token operator">=</span> <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Having</span><span class="token punctuation">)</span> <span 
class="token punctuation">&#123;</span>            <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_expression</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">None</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>                <span class="token comment">// ORDER BY</span>        <span class="token keyword">let</span> order_by <span class="token operator">=</span> <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Order</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">By</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>            <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_order_by_list</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span>        <span class="token 
punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>                <span class="token comment">// LIMIT</span>        <span class="token keyword">let</span> limit <span class="token operator">=</span> <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Limit</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_expression</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">None</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>                <span class="token comment">// OFFSET</span>        <span class="token keyword">let</span> offset <span class="token operator">=</span> <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token 
punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Offset</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_expression</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">None</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">SelectStatement</span> <span class="token punctuation">&#123;</span>            distinct<span class="token punctuation">,</span>            projections<span class="token punctuation">,</span>            from<span class="token punctuation">,</span>            where_clause<span class="token punctuation">,</span>            group_by<span class="token punctuation">,</span>            having<span class="token punctuation">,</span>            order_by<span class="token punctuation">,</span>            limit<span class="token punctuation">,</span>            offset<span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="Expression-Parsing-with-Precedence">Expression Parsing with Precedence</h3><p><strong>The challenge:</strong> <code>1 + 2 * 3</code> should parse as 
<code>1 + (2 * 3)</code>, not <code>(1 + 2) * 3</code>.</p><p><strong>Solution:</strong> Pratt parsing (top-down operator precedence, a close relative of precedence climbing). Each operator has a precedence level, and the loop keeps consuming operators only while they bind at least as tightly as the current minimum.</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/sql_parser/parser.rs</span><span class="token keyword">impl</span> <span class="token class-name">Parser</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">parse_expression</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token punctuation">,</span> <span class="token class-name">ParserError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_precedence</span><span class="token punctuation">(</span><span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">Lowest</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">fn</span> <span class="token function-definition function">parse_precedence</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">,</span> min_precedence<span class="token punctuation">:</span> <span class="token class-name">Precedence</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token 
class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token punctuation">,</span> <span class="token class-name">ParserError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Parse left side (prefix)</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> left <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_prefix</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token keyword">loop</span> <span class="token punctuation">&#123;</span>            <span class="token comment">// Get operator precedence</span>            <span class="token keyword">let</span> op_precedence <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">get_operator_precedence</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                        <span class="token comment">// Stop if operator has lower precedence</span>            <span class="token keyword">if</span> op_precedence <span class="token operator">&lt;</span> min_precedence <span class="token punctuation">&#123;</span>                <span class="token keyword">break</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>                        <span class="token comment">// Parse operator and right side</span>            left <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_infix</span><span 
class="token punctuation">(</span>left<span class="token punctuation">,</span> op_precedence<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span>left<span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">fn</span> <span class="token function-definition function">parse_prefix</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token punctuation">,</span> <span class="token class-name">ParserError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">match</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">peek_token</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Identifier</span><span class="token punctuation">(</span>name<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                
<span class="token comment">// Check for compound identifier (table.column)</span>                <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Dot</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                    <span class="token keyword">if</span> <span class="token keyword">let</span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Identifier</span><span class="token punctuation">(</span>col<span class="token punctuation">)</span> <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">next_token</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span> <span class="token punctuation">&#123;</span>                        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">CompoundIdentifier</span><span class="token punctuation">(</span><span class="token macro property">vec!</span><span class="token punctuation">[</span>                            <span class="token class-name">Ident</span> <span class="token punctuation">&#123;</span> value<span class="token punctuation">:</span> name<span class="token punctuation">,</span> quote_style<span class="token punctuation">:</span> <span class="token class-name">None</span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>                            <span class="token class-name">Ident</span> <span 
class="token punctuation">&#123;</span> value<span class="token punctuation">:</span> col<span class="token punctuation">,</span> quote_style<span class="token punctuation">:</span> <span class="token class-name">None</span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>                        <span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">)</span>                    <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>                        <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">ParserError</span><span class="token punctuation">::</span><span class="token class-name">ExpectedIdentifier</span><span class="token punctuation">)</span>                    <span class="token punctuation">&#125;</span>                <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>                    <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">Identifier</span><span class="token punctuation">(</span><span class="token class-name">Ident</span> <span class="token punctuation">&#123;</span> value<span class="token punctuation">:</span> name<span class="token punctuation">,</span> quote_style<span class="token punctuation">:</span> <span class="token class-name">None</span> <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">)</span>                <span class="token punctuation">&#125;</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Token</span><span class="token 
punctuation">::</span><span class="token class-name">Integer</span><span class="token punctuation">(</span>n<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">LiteralNumber</span><span class="token punctuation">(</span>n<span class="token punctuation">)</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Float</span><span class="token punctuation">(</span>n<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">LiteralFloat</span><span class="token punctuation">(</span>n<span class="token punctuation">)</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token 
class-name">String</span><span class="token punctuation">(</span>s<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">LiteralString</span><span class="token punctuation">(</span>s<span class="token punctuation">)</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">True</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">LiteralBoolean</span><span class="token punctuation">(</span><span class="token boolean">true</span><span class="token punctuation">)</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">False</span> <span class="token operator">=></span> <span class="token 
punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">LiteralBoolean</span><span class="token punctuation">(</span><span class="token boolean">false</span><span class="token punctuation">)</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Null</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">Null</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Minus</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span 
class="token punctuation">;</span>                <span class="token keyword">let</span> expr <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_precedence</span><span class="token punctuation">(</span><span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">Unary</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">UnaryOp</span> <span class="token punctuation">&#123;</span>                    op<span class="token punctuation">:</span> <span class="token class-name">UnaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Minus</span><span class="token punctuation">,</span>                    expr<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>expr<span class="token punctuation">)</span><span class="token punctuation">,</span>                <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Not</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token 
punctuation">;</span>                <span class="token keyword">let</span> expr <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_precedence</span><span class="token punctuation">(</span><span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">Unary</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">UnaryOp</span> <span class="token punctuation">&#123;</span>                    op<span class="token punctuation">:</span> <span class="token class-name">UnaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Not</span><span class="token punctuation">,</span>                    expr<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>expr<span class="token punctuation">)</span><span class="token punctuation">,</span>                <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">LParen</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token 
punctuation">;</span>                <span class="token keyword">let</span> expr <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_expression</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">RParen</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">Nested</span><span class="token punctuation">(</span><span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>expr<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Star</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span 
class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">Identifier</span><span class="token punctuation">(</span><span class="token class-name">Ident</span> <span class="token punctuation">&#123;</span> value<span class="token punctuation">:</span> <span class="token string">"*"</span><span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> quote_style<span class="token punctuation">:</span> <span class="token class-name">None</span> <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            _ <span class="token operator">=></span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">ParserError</span><span class="token punctuation">::</span><span class="token class-name">UnexpectedToken</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">peek_token</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">fn</span> <span class="token function-definition function">parse_infix</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token 
keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">,</span> left<span class="token punctuation">:</span> <span class="token class-name">Expression</span><span class="token punctuation">,</span> precedence<span class="token punctuation">:</span> <span class="token class-name">Precedence</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token punctuation">,</span> <span class="token class-name">ParserError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> op_token <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">next_token</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token keyword">match</span> op_token <span class="token punctuation">&#123;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Plus</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">let</span> right <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_precedence</span><span class="token punctuation">(</span>precedence<span class="token punctuation">.</span><span class="token function">next</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>            
    <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span>                    left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>left<span class="token punctuation">)</span><span class="token punctuation">,</span>                    op<span class="token punctuation">:</span> <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Plus</span><span class="token punctuation">,</span>                    right<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>right<span class="token punctuation">)</span><span class="token punctuation">,</span>                <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Minus</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">let</span> right <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_precedence</span><span class="token punctuation">(</span>precedence<span class="token punctuation">.</span><span class="token function">next</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span 
class="token operator">?</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span>                    left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>left<span class="token punctuation">)</span><span class="token punctuation">,</span>                    op<span class="token punctuation">:</span> <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Minus</span><span class="token punctuation">,</span>                    right<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>right<span class="token punctuation">)</span><span class="token punctuation">,</span>                <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Star</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">let</span> right <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_precedence</span><span class="token punctuation">(</span>precedence<span class="token punctuation">.</span><span class="token function">next</span><span class="token punctuation">(</span><span 
class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span>                    left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>left<span class="token punctuation">)</span><span class="token punctuation">,</span>                    op<span class="token punctuation">:</span> <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Multiply</span><span class="token punctuation">,</span>                    right<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>right<span class="token punctuation">)</span><span class="token punctuation">,</span>                <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Slash</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">let</span> right <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_precedence</span><span class="token punctuation">(</span>precedence<span class="token punctuation">.</span><span 
class="token function">next</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span>                    left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>left<span class="token punctuation">)</span><span class="token punctuation">,</span>                    op<span class="token punctuation">:</span> <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Divide</span><span class="token punctuation">,</span>                    right<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>right<span class="token punctuation">)</span><span class="token punctuation">,</span>                <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Eq</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">let</span> right <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_precedence</span><span class="token 
punctuation">(</span>precedence<span class="token punctuation">.</span><span class="token function">next</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span>                    left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>left<span class="token punctuation">)</span><span class="token punctuation">,</span>                    op<span class="token punctuation">:</span> <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Eq</span><span class="token punctuation">,</span>                    right<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>right<span class="token punctuation">)</span><span class="token punctuation">,</span>                <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">And</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">let</span> right <span class="token operator">=</span> <span class="token keyword">self</span><span class="token 
punctuation">.</span><span class="token function">parse_precedence</span><span class="token punctuation">(</span>precedence<span class="token punctuation">.</span><span class="token function">next</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span>                    left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>left<span class="token punctuation">)</span><span class="token punctuation">,</span>                    op<span class="token punctuation">:</span> <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">And</span><span class="token punctuation">,</span>                    right<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>right<span class="token punctuation">)</span><span class="token punctuation">,</span>                <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Or</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">let</span> right <span class="token 
operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_precedence</span><span class="token punctuation">(</span>precedence<span class="token punctuation">.</span><span class="token function">next</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Expression</span><span class="token punctuation">::</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span>                    left<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>left<span class="token punctuation">)</span><span class="token punctuation">,</span>                    op<span class="token punctuation">:</span> <span class="token class-name">BinaryOperator</span><span class="token punctuation">::</span><span class="token class-name">Or</span><span class="token punctuation">,</span>                    right<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>right<span class="token punctuation">)</span><span class="token punctuation">,</span>                <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token comment">// ... 
more operators</span>            _ <span class="token operator">=></span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">ParserError</span><span class="token punctuation">::</span><span class="token class-name">UnexpectedToken</span><span class="token punctuation">(</span>op_token<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span><span class="token comment">// Precedence levels</span><span class="token attribute attr-name">#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">Precedence</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">Lowest</span><span class="token punctuation">,</span>    <span class="token class-name">Or</span><span class="token punctuation">,</span>           <span class="token comment">// OR</span>    <span class="token class-name">And</span><span class="token punctuation">,</span>          <span class="token comment">// AND</span>    <span class="token class-name">Comparison</span><span class="token punctuation">,</span>   <span class="token comment">// =, &lt;, >, &lt;=, >=, &lt;></span>    <span class="token class-name">Concat</span><span class="token punctuation">,</span>       <span class="token comment">// ||</span>    <span class="token class-name">AddSub</span><span class="token punctuation">,</span>       <span class="token comment">// +, -</span>    <span class="token class-name">MulDiv</span><span class="token punctuation">,</span>       <span class="token comment">// *, /</span>    <span class="token class-name">Unary</span><span class="token punctuation">,</span>        <span class="token 
// NOT, -</span>">
comment">// NOT, -</span>    <span class="token class-name">Exponent</span><span class="token punctuation">,</span>     <span class="token comment">// ^</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">Precedence</span> <span class="token punctuation">&#123;</span>    <span class="token comment">// Step to the next-higher precedence level; Exponent is the highest and maps to itself</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">next</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Precedence</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">match</span> <span class="token keyword">self</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">Lowest</span> <span class="token operator">=></span> <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">Or</span><span class="token punctuation">,</span>            <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">Or</span> <span class="token operator">=></span> <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">And</span><span class="token punctuation">,</span>            <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">And</span> <span class="token operator">=></span> <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">Comparison</span><span class="token punctuation">,</span>            <span class="token class-name">Precedence</span><span 
class="token punctuation">::</span><span class="token class-name">Comparison</span> <span class="token operator">=></span> <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">Concat</span><span class="token punctuation">,</span>            <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">Concat</span> <span class="token operator">=></span> <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">AddSub</span><span class="token punctuation">,</span>            <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">AddSub</span> <span class="token operator">=></span> <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">MulDiv</span><span class="token punctuation">,</span>            <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">MulDiv</span> <span class="token operator">=></span> <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">Unary</span><span class="token punctuation">,</span>            <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">Unary</span> <span class="token operator">=></span> <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">Exponent</span><span class="token punctuation">,</span>            <span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">Exponent</span> <span class="token operator">=></span> <span class="token class-name">Precedence</span><span class="token 
punctuation">::</span><span class="token class-name">Exponent</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="Parsing-DDL-CREATE-TABLE">Parsing DDL: CREATE TABLE</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/sql_parser/parser.rs</span><span class="token keyword">impl</span> <span class="token class-name">Parser</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">parse_create</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Statement</span><span class="token punctuation">,</span> <span class="token class-name">ParserError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Create</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span 
class="token class-name">Table</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_create_table</span><span class="token punctuation">(</span><span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Index</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_create_index</span><span class="token punctuation">(</span><span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">ParserError</span><span class="token punctuation">::</span><span class="token class-name">ExpectedTableOrIndex</span><span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">fn</span> <span class="token function-definition function">parse_create_table</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token 
Result</span>">
class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Statement</span><span class="token punctuation">,</span> <span class="token class-name">ParserError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Optional IF NOT EXISTS clause: once IF is consumed, NOT and EXISTS must follow</span>        <span class="token keyword">let</span> if_not_exists <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">If</span><span class="token punctuation">)</span> <span class="token operator">&amp;&amp;</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Not</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>            <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Exists</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>            <span class="token boolean">true</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>                <span class="token keyword">let</span> name <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token 
function">parse_object_name</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">LParen</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token keyword">let</span> <span class="token keyword">mut</span> columns <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> constraints <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">loop</span> <span class="token punctuation">&#123;</span>            <span class="token comment">// Check for constraint</span>            <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Constraint</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                <span 
class="token keyword">let</span> constraint <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_table_constraint</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                constraints<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span>constraint<span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Primary</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Key</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">LParen</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>               
 <span class="token keyword">let</span> cols <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_comma_separated_identifiers</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>
                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">RParen</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>
                constraints<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token class-name">TableConstraint</span><span class="token punctuation">::</span><span class="token class-name">PrimaryKey</span> <span class="token punctuation">&#123;</span> columns<span class="token punctuation">:</span> cols <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
            <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>
                <span class="token comment">// Column definition</span>
                <span class="token keyword">let</span> column <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_column_def</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>
                columns<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span>column<span class="token punctuation">)</span><span class="token punctuation">;</span>
            <span class="token punctuation">&#125;</span>

            <span class="token keyword">if</span> <span class="token operator">!</span><span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Comma</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>
                <span class="token keyword">break</span><span class="token punctuation">;</span>
            <span class="token punctuation">&#125;</span>
        <span class="token punctuation">&#125;</span>

        <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">RParen</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>

        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Statement</span><span class="token punctuation">::</span><span class="token class-name">CreateTable</span><span class="token punctuation">(</span><span class="token class-name">CreateTableStatement</span> <span class="token punctuation">&#123;</span>
            name<span class="token punctuation">,</span>
            columns<span class="token punctuation">,</span>
            constraints<span class="token punctuation">,</span>
            if_not_exists<span class="token punctuation">,</span>
        <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">)</span>
    <span class="token punctuation">&#125;</span>

    <span class="token keyword">fn</span> <span class="token function-definition function">parse_column_def</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">ColumnDef</span><span class="token punctuation">,</span> <span class="token class-name">ParserError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>
        <span class="token keyword">let</span> name <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_identifier</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>
        <span class="token keyword">let</span> data_type <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_data_type</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>

        <span class="token keyword">let</span> <span class="token keyword">mut</span> options <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>

        <span 
class="token keyword">loop</span> <span class="token punctuation">&#123;</span>
            <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Not</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>
                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Null</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>
                options<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token class-name">ColumnOption</span><span class="token punctuation">::</span><span class="token class-name">NotNull</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
            <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Null</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>
                options<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token class-name">ColumnOption</span><span class="token punctuation">::</span><span class="token class-name">Null</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
            <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Default</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>
                <span class="token keyword">let</span> expr <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_expression</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>
                options<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token class-name">ColumnOption</span><span class="token punctuation">::</span><span class="token class-name">Default</span><span class="token punctuation">(</span>expr<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
            <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Primary</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>
                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Key</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>
                options<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token class-name">ColumnOption</span><span class="token punctuation">::</span><span class="token class-name">PrimaryKey</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
            <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Unique</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>
                options<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token class-name">ColumnOption</span><span class="token punctuation">::</span><span class="token class-name">Unique</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
            <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">References</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>
                <span class="token keyword">let</span> table <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_object_name</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>
                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">LParen</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>
                <span class="token keyword">let</span> column <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_identifier</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>
                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">RParen</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>
                options<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token class-name">ColumnOption</span><span class="token punctuation">::</span><span class="token class-name">References</span> <span class="token punctuation">&#123;</span> table<span class="token punctuation">,</span> column <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
            <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>
                <span class="token keyword">break</span><span class="token punctuation">;</span>
            <span class="token punctuation">&#125;</span>
        <span class="token punctuation">&#125;</span>

        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">ColumnDef</span> <span class="token punctuation">&#123;</span>
            name<span class="token punctuation">,</span>
            data_type<span class="token punctuation">,</span>
            options<span class="token punctuation">,</span>
        <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>
    <span class="token punctuation">&#125;</span>

    <span class="token keyword">fn</span> <span class="token function-definition function">parse_data_type</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">DataType</span><span class="token punctuation">,</span> <span class="token class-name">ParserError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>
        <span class="token keyword">match</span> <span class="token 
keyword">self</span><span class="token punctuation">.</span><span class="token function">next_token</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span> <span class="token punctuation">&#123;</span>
            <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Identifier</span><span class="token punctuation">(</span>name<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>
                <span class="token keyword">match</span> name<span class="token punctuation">.</span><span class="token function">to_uppercase</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">as_str</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>
                    <span class="token string">"BOOLEAN"</span> <span class="token operator">|</span> <span class="token string">"BOOL"</span> <span class="token operator">=></span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">DataType</span><span class="token punctuation">::</span><span class="token class-name">Boolean</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
                    <span class="token string">"SMALLINT"</span> <span class="token operator">|</span> <span class="token string">"INT2"</span> <span class="token operator">=></span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">DataType</span><span class="token punctuation">::</span><span class="token class-name">SmallInt</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
                    <span class="token string">"INTEGER"</span> <span class="token operator">|</span> <span class="token string">"INT"</span> <span class="token operator">|</span> <span class="token string">"INT4"</span> <span class="token operator">=></span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">DataType</span><span class="token punctuation">::</span><span class="token class-name">Integer</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
                    <span class="token string">"BIGINT"</span> <span class="token operator">|</span> <span class="token string">"INT8"</span> <span class="token operator">=></span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">DataType</span><span class="token punctuation">::</span><span class="token class-name">BigInt</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
                    <span class="token string">"REAL"</span> <span class="token operator">|</span> <span class="token string">"FLOAT4"</span> <span class="token operator">=></span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">DataType</span><span class="token punctuation">::</span><span class="token class-name">Real</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
                    <span class="token string">"DOUBLE"</span> <span class="token operator">|</span> <span class="token string">"FLOAT8"</span> <span class="token operator">=></span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">DataType</span><span class="token punctuation">::</span><span class="token class-name">Double</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
                    <span class="token string">"TEXT"</span> <span class="token operator">=></span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">DataType</span><span class="token punctuation">::</span><span class="token class-name">Text</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
                    <span class="token string">"VARCHAR"</span> <span class="token operator">|</span> <span class="token string">"CHARACTER VARYING"</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>
                        <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">consume_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">LParen</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>
                            <span class="token keyword">let</span> size <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_integer_literal</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>
                            <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">expect_token</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">RParen</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>
                            <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">DataType</span><span class="token punctuation">::</span><span class="token class-name">Varchar</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span>size<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">)</span>
                        <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>
                            <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">DataType</span><span class="token punctuation">::</span><span class="token class-name">Varchar</span><span class="token punctuation">(</span><span class="token class-name">None</span><span class="token punctuation">)</span><span class="token punctuation">)</span>
                        <span class="token punctuation">&#125;</span>
                    <span class="token punctuation">&#125;</span>
                    <span class="token string">"TIMESTAMP"</span> <span class="token operator">=></span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">DataType</span><span class="token punctuation">::</span><span class="token class-name">Timestamp</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
                    <span class="token string">"DATE"</span> <span class="token operator">=></span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">DataType</span><span class="token punctuation">::</span><span class="token class-name">Date</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
                    <span class="token string">"BYTEA"</span> <span class="token operator">=></span> <span class="token class-name">Ok</span><span 
class="token punctuation">(</span><span class="token class-name">DataType</span><span class="token punctuation">::</span><span class="token class-name">Bytea</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
                    _ <span class="token operator">=></span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">ParserError</span><span class="token punctuation">::</span><span class="token class-name">UnknownDataType</span><span class="token punctuation">(</span>name<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
                <span class="token punctuation">&#125;</span>
            <span class="token punctuation">&#125;</span>
            _ <span class="token operator">=></span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">ParserError</span><span class="token punctuation">::</span><span class="token class-name">ExpectedDataType</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
        <span class="token punctuation">&#125;</span>
    <span class="token punctuation">&#125;</span>
<span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="5-Complete-Parsing-Example">5 Complete Parsing Example</h2><h3 id="Parsing-a-Complex-Query">Parsing a Complex Query</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// Example usage (run inside a function that returns a Result, e.g. Result&lt;_, ParserError>, so `?` can propagate errors)</span>
<span class="token keyword">let</span> sql <span class="token operator">=</span> <span class="token string">r#"
    SELECT
        u.id,
        u.name,
        COUNT(o.id) as order_count,
        SUM(o.amount) as total_amount
    FROM users u
    LEFT JOIN orders o ON u.id = o.user_id
    WHERE u.balance > 100 AND u.created_at > '2026-01-01'
    GROUP BY u.id, u.name
    HAVING COUNT(o.id) > 5
    ORDER BY total_amount DESC
    LIMIT 10
    OFFSET 5
"#</span><span class="token punctuation">;</span>

<span class="token keyword">let</span> <span class="token keyword">mut</span> lexer <span class="token operator">=</span> <span class="token class-name">Lexer</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>sql<span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token keyword">let</span> tokens <span class="token operator">=</span> lexer<span class="token punctuation">.</span><span class="token function">tokenize</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>

<span class="token keyword">let</span> <span class="token keyword">mut</span> parser <span class="token operator">=</span> <span class="token class-name">Parser</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>tokens<span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token keyword">let</span> ast <span class="token operator">=</span> parser<span class="token punctuation">.</span><span class="token function">parse_statement</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>

<span class="token comment">// ast is now a SelectStatement with:</span>
<span class="token comment">// - 4 projections (id, name, COUNT, SUM)</span>
<span class="token comment">// - FROM users with LEFT JOIN orders</span>
<span class="token comment">// - WHERE clause with AND</span>
<span class="token comment">// - GROUP BY 2 columns</span>
<span class="token comment">// - HAVING clause</span>
<span class="token comment">// - ORDER BY with DESC</span>
<span class="token comment">// - LIMIT and OFFSET</span></code></pre><p><strong>Resulting AST (simplified):</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token class-name">SelectStatement</span> <span class="token punctuation">&#123;</span>
    distinct<span class="token punctuation">:</span> <span class="token boolean">false</span><span class="token punctuation">,</span>
    projections<span class="token punctuation">:</span> <span class="token punctuation">[</span>
        <span class="token class-name">Identifier</span><span class="token punctuation">(</span><span class="token string">"u.id"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
        <span class="token class-name">Identifier</span><span class="token punctuation">(</span><span class="token string">"u.name"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
        <span class="token class-name">ExprWithAlias</span><span class="token punctuation">(</span><span class="token class-name">Function</span><span class="token punctuation">(</span><span class="token string">"COUNT"</span><span class="token punctuation">,</span> <span class="token punctuation">[</span><span class="token class-name">Identifier</span><span class="token punctuation">(</span><span class="token string">"o.id"</span><span class="token punctuation">)</span><span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token string">"order_count"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
        <span class="token class-name">ExprWithAlias</span><span class="token punctuation">(</span><span class="token class-name">Function</span><span class="token punctuation">(</span><span class="token string">"SUM"</span><span class="token punctuation">,</span> <span class="token punctuation">[</span><span class="token class-name">Identifier</span><span 
class="token punctuation">(</span><span class="token string">"o.amount"</span><span class="token punctuation">)</span><span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token string">"total_amount"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    <span class="token punctuation">]</span><span class="token punctuation">,</span>
    from<span class="token punctuation">:</span> <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">TableWithJoins</span> <span class="token punctuation">&#123;</span>
        table<span class="token punctuation">:</span> <span class="token class-name">Table</span> <span class="token punctuation">&#123;</span> name<span class="token punctuation">:</span> <span class="token string">"users"</span><span class="token punctuation">,</span> alias<span class="token punctuation">:</span> <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token string">"u"</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>
        joins<span class="token punctuation">:</span> <span class="token punctuation">[</span>
            <span class="token class-name">Join</span> <span class="token punctuation">&#123;</span>
                relation<span class="token punctuation">:</span> <span class="token class-name">Table</span> <span class="token punctuation">&#123;</span> name<span class="token punctuation">:</span> <span class="token string">"orders"</span><span class="token punctuation">,</span> alias<span class="token punctuation">:</span> <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token string">"o"</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>
                join_operator<span class="token punctuation">:</span> <span class="token class-name">LeftOuter</span><span class="token punctuation">,</span>
            <span class="token punctuation">&#125;</span>
        <span class="token punctuation">]</span><span class="token punctuation">,</span>
    <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    where_clause<span class="token punctuation">:</span> <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span>
        left<span class="token punctuation">:</span> <span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span> <span class="token class-name">Identifier</span><span class="token punctuation">(</span><span class="token string">"u.balance"</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Gt</span><span class="token punctuation">,</span> <span class="token class-name">LiteralNumber</span><span class="token punctuation">(</span><span class="token number">100</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>
        op<span class="token punctuation">:</span> <span class="token class-name">And</span><span class="token punctuation">,</span>
        right<span class="token punctuation">:</span> <span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span> <span class="token class-name">Identifier</span><span class="token punctuation">(</span><span class="token string">"u.created_at"</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Gt</span><span class="token punctuation">,</span> <span class="token class-name">LiteralString</span><span class="token punctuation">(</span><span class="token string">"2026-01-01"</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>
    <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    group_by<span class="token punctuation">:</span> <span class="token punctuation">[</span><span class="token class-name">Identifier</span><span class="token punctuation">(</span><span class="token string">"u.id"</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Identifier</span><span class="token punctuation">(</span><span class="token string">"u.name"</span><span class="token punctuation">)</span><span class="token punctuation">]</span><span class="token punctuation">,</span>
    having<span class="token punctuation">:</span> <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span>
        <span class="token class-name">Function</span><span class="token punctuation">(</span><span class="token string">"COUNT"</span><span class="token punctuation">,</span> <span class="token punctuation">[</span><span class="token class-name">Identifier</span><span class="token punctuation">(</span><span class="token string">"o.id"</span><span class="token punctuation">)</span><span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Gt</span><span class="token punctuation">,</span> <span class="token class-name">LiteralNumber</span><span class="token punctuation">(</span><span class="token number">5</span><span class="token punctuation">)</span>
    <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    order_by<span class="token punctuation">:</span> <span class="token punctuation">[</span><span class="token class-name">OrderByExpr</span> <span class="token punctuation">&#123;</span> expr<span class="token punctuation">:</span> <span class="token class-name">Identifier</span><span class="token punctuation">(</span><span class="token string">"total_amount"</span><span class="token punctuation">)</span><span class="token punctuation">,</span> asc<span class="token punctuation">:</span> <span class="token boolean">false</span> <span class="token punctuation">&#125;</span><span class="token punctuation">]</span><span class="token punctuation">,</span>
    limit<span class="token punctuation">:</span> <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">LiteralNumber</span><span class="token punctuation">(</span><span class="token number">10</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    offset<span class="token punctuation">:</span> <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">LiteralNumber</span><span class="token punctuation">(</span><span class="token number">5</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
<span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="6-Error-Handling-and-Recovery">6 Error Handling and Recovery</h2><h3 id="Parser-Errors">Parser Errors</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/sql_parser/error.rs</span>
<span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span>
<span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">ParserError</span> <span 
class="token punctuation">&#123;</span>    <span class="token comment">// Lexer errors</span>    <span class="token class-name">LexerError</span><span class="token punctuation">(</span><span class="token class-name">LexerError</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token comment">// Syntax errors</span>    <span class="token class-name">UnexpectedToken</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">ExpectedToken</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">ExpectedIdentifier</span><span class="token punctuation">,</span>    <span class="token class-name">ExpectedDataType</span><span class="token punctuation">,</span>    <span class="token class-name">ExpectedTableOrIndex</span><span class="token punctuation">,</span>        <span class="token comment">// Semantic errors (detected during parsing)</span>    <span class="token class-name">UnknownDataType</span><span class="token punctuation">(</span><span class="token class-name">String</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">UnknownFunction</span><span class="token punctuation">(</span><span class="token class-name">String</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">AmbiguousColumn</span><span class="token punctuation">(</span><span class="token class-name">String</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token comment">// Recovery</span>    <span class="token class-name">UnexpectedEof</span><span class="token 
punctuation">,</span>    <span class="token class-name">UnmatchedParenthesis</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, PartialEq)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">LexerError</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">UnterminatedString</span><span class="token punctuation">,</span>    <span class="token class-name">InvalidNumber</span><span class="token punctuation">(</span><span class="token class-name">String</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">UnexpectedChar</span><span class="token punctuation">(</span><span class="token keyword">char</span><span class="token punctuation">)</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">From</span><span class="token operator">&lt;</span><span class="token class-name">LexerError</span><span class="token operator">></span> <span class="token keyword">for</span> <span class="token class-name">ParserError</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">from</span><span class="token punctuation">(</span>err<span class="token punctuation">:</span> <span class="token class-name">LexerError</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>        <span class="token class-name">ParserError</span><span class="token punctuation">::</span><span class="token class-name">LexerError</span><span class="token punctuation">(</span>err<span class="token punctuation">)</span>    <span 
class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="Error-Recovery-Strategy">Error Recovery Strategy</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token keyword">impl</span> <span class="token class-name">Parser</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">parse_statement_with_recovery</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Statement</span><span class="token punctuation">,</span> <span class="token class-name">ParserError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> start_pos <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>pos<span class="token punctuation">;</span>                <span class="token keyword">match</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_statement</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Ok</span><span class="token punctuation">(</span>stmt<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token class-name">Ok</span><span class="token punctuation">(</span>stmt<span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token class-name">Err</span><span class="token 
punctuation">(</span>err<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token comment">// Try to recover: skip to next semicolon or EOF</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">recover_to_statement_boundary</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                                <span class="token comment">// Return error with context</span>                <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">ParserError</span><span class="token punctuation">::</span><span class="token class-name">SyntaxError</span> <span class="token punctuation">&#123;</span>                    message<span class="token punctuation">:</span> <span class="token macro property">format!</span><span class="token punctuation">(</span><span class="token string">"Parse error: &#123;&#125;"</span><span class="token punctuation">,</span> err<span class="token punctuation">)</span><span class="token punctuation">,</span>                    position<span class="token punctuation">:</span> start_pos<span class="token punctuation">,</span>                <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">fn</span> <span class="token function-definition function">recover_to_statement_boundary</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token 
punctuation">&#123;</span>        <span class="token keyword">while</span> <span class="token keyword">self</span><span class="token punctuation">.</span>pos <span class="token operator">&lt;</span> <span class="token keyword">self</span><span class="token punctuation">.</span>tokens<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">match</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">peek_token</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Semicolon</span> <span class="token operator">|</span> <span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Eof</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                    <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                    <span class="token keyword">return</span><span class="token punctuation">;</span>                <span class="token punctuation">&#125;</span>                _ <span class="token operator">=></span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">advance</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span>        <span class="token 
punctuation">&#125;</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="7-Challenges-Building-in-Rust">7 Challenges Building in Rust</h2><h3 id="Challenge-1-Recursive-Types">Challenge 1: Recursive Types</h3><p><strong>Problem:</strong> AST has recursive types (Expression contains Expression).</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ Doesn't compile - infinite size</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">Expression</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span>        left<span class="token punctuation">:</span> <span class="token class-name">Expression</span><span class="token punctuation">,</span>  <span class="token comment">// How big is this?</span>        op<span class="token punctuation">:</span> <span class="token class-name">BinaryOperator</span><span class="token punctuation">,</span>        right<span class="token punctuation">:</span> <span class="token class-name">Expression</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>    <span class="token comment">// ...</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Solution: Box for indirection</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ✅ Works - known size (pointer size)</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">Expression</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">BinaryOp</span> <span class="token punctuation">&#123;</span>        left<span class="token 
punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>        op<span class="token punctuation">:</span> <span class="token class-name">BinaryOperator</span><span class="token punctuation">,</span>        right<span class="token punctuation">:</span> <span class="token class-name">Box</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>    <span class="token comment">// ...</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="Challenge-2-Lifetime-Annotations">Challenge 2: Lifetime Annotations</h3><p><strong>Problem:</strong> Tokens borrowed from input, parser needs to reference them.</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ Doesn't compile - lifetime issues</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Parser</span> <span class="token punctuation">&#123;</span>    tokens<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token class-name">Token</span><span class="token punctuation">]</span><span class="token punctuation">,</span>  <span class="token comment">// Borrowed slice</span>    pos<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Solution: Owned tokens</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ✅ Works - owns 
its data</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Parser</span> <span class="token punctuation">&#123;</span>    tokens<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Token</span><span class="token operator">></span><span class="token punctuation">,</span>  <span class="token comment">// Owned vector</span>    pos<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Trade-off:</strong> Extra allocation, but simpler lifetimes.</p><hr /><h3 id="Challenge-3-Error-Type-Complexity">Challenge 3: Error Type Complexity</h3><p><strong>Problem:</strong> Many error variants, hard to pattern match.</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ Unwieldy</span><span class="token keyword">match</span> err <span class="token punctuation">&#123;</span>    <span class="token class-name">ParserError</span><span class="token punctuation">::</span><span class="token class-name">UnexpectedToken</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">Select</span><span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token punctuation">...</span> <span class="token punctuation">&#125;</span>    <span class="token class-name">ParserError</span><span class="token punctuation">::</span><span class="token class-name">UnexpectedToken</span><span class="token punctuation">(</span><span class="token class-name">Token</span><span class="token punctuation">::</span><span class="token class-name">From</span><span 
class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token punctuation">...</span> <span class="token punctuation">&#125;</span>    <span class="token class-name">ParserError</span><span class="token punctuation">::</span><span class="token class-name">ExpectedIdentifier</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span> <span class="token punctuation">...</span> <span class="token punctuation">&#125;</span>    <span class="token comment">// ... 50 more cases</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Solution: Display trait and context</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ✅ Clean</span><span class="token keyword">impl</span> <span class="token class-name">Display</span> <span class="token keyword">for</span> <span class="token class-name">ParserError</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">fmt</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> f<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token namespace">std<span class="token punctuation">::</span>fmt<span class="token punctuation">::</span></span><span class="token class-name">Formatter</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token namespace">std<span class="token punctuation">::</span>fmt<span class="token punctuation">::</span></span><span class="token class-name">Result</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">match</span> <span class="token keyword">self</span> <span 
class="token punctuation">&#123;</span>            <span class="token class-name">ParserError</span><span class="token punctuation">::</span><span class="token class-name">UnexpectedToken</span><span class="token punctuation">(</span>token<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token macro property">write!</span><span class="token punctuation">(</span>f<span class="token punctuation">,</span> <span class="token string">"Unexpected token: &#123;&#125;"</span><span class="token punctuation">,</span> token<span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">ParserError</span><span class="token punctuation">::</span><span class="token class-name">ExpectedIdentifier</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token macro property">write!</span><span class="token punctuation">(</span>f<span class="token punctuation">,</span> <span class="token string">"Expected identifier"</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token comment">// ...</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span><span class="token comment">// Usage</span><span class="token keyword">let</span> result <span class="token operator">=</span> parser<span class="token punctuation">.</span><span class="token function">parse_statement</span><span class="token punctuation">(</span><span class="token punctuation">)</span>    <span class="token punctuation">.</span><span class="token function">map_err</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>e<span 
class="token closure-punctuation punctuation">|</span></span> <span class="token macro property">eprintln!</span><span class="token punctuation">(</span><span class="token string">"Parse error at position &#123;&#125;: &#123;&#125;"</span><span class="token punctuation">,</span> parser<span class="token punctuation">.</span>pos<span class="token punctuation">,</span> e<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span></code></pre><hr /><h2 id="8-How-AI-Accelerated-This">8 How AI Accelerated This</h2><h3 id="What-AI-Got-Right">What AI Got Right</h3><table><thead><tr><th>Task</th><th>AI Contribution</th></tr></thead><tbody><tr><td><strong>Lexer structure</strong></td><td>Character-by-character tokenization pattern</td></tr><tr><td><strong>Precedence levels</strong></td><td>Correct operator precedence ordering</td></tr><tr><td><strong>AST design</strong></td><td>Comprehensive expression variants</td></tr><tr><td><strong>Error types</strong></td><td>Good categorization of error cases</td></tr></tbody></table><hr /><h3 id="What-AI-Got-Wrong">What AI Got Wrong</h3><table><thead><tr><th>Issue</th><th>What Happened</th></tr></thead><tbody><tr><td><strong>Compound identifiers</strong></td><td>First draft didn’t handle <code>table.column</code></td></tr><tr><td><strong>JOIN parsing</strong></td><td>Missed ON vs. USING clause distinction</td></tr><tr><td><strong>CASE expressions</strong></td><td>Generated incomplete WHEN/THEN handling</td></tr><tr><td><strong>Precedence climbing</strong></td><td>Suggested recursive descent without precedence (wrong for expressions)</td></tr></tbody></table><p><strong>Pattern:</strong> AI handles common cases well. 
Edge cases (compound identifiers, JOIN variants) require manual refinement.</p><hr /><h3 id="Example-Debugging-Expression-Parsing">Example: Debugging Expression Parsing</h3><p><strong>My question to AI:</strong></p><blockquote><p>“<code>1 + 2 * 3</code> parses as <code>(1 + 2) * 3</code>. Why?”</p></blockquote><p><strong>What I learned:</strong></p><ol><li>Simple recursive descent doesn’t handle precedence</li><li>Need Pratt parsing or precedence climbing</li><li>Each operator needs a precedence level</li></ol><p><strong>Result:</strong> Implemented precedence-based parsing:</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token keyword">fn</span> <span class="token function-definition function">parse_expression</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token punctuation">,</span> <span class="token class-name">ParserError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_precedence</span><span class="token punctuation">(</span><span class="token class-name">Precedence</span><span class="token punctuation">::</span><span class="token class-name">Lowest</span><span class="token punctuation">)</span><span class="token punctuation">&#125;</span><span class="token keyword">fn</span> <span class="token function-definition function">parse_precedence</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token 
punctuation">,</span> min_precedence<span class="token punctuation">:</span> <span class="token class-name">Precedence</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Expression</span><span class="token punctuation">,</span> <span class="token class-name">ParserError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token keyword">let</span> <span class="token keyword">mut</span> left <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_prefix</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token keyword">loop</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> op_precedence <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">get_operator_precedence</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">if</span> op_precedence <span class="token operator">&lt;</span> min_precedence <span class="token punctuation">&#123;</span>            <span class="token keyword">break</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        left <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">parse_infix</span><span class="token punctuation">(</span>left<span class="token punctuation">,</span> op_precedence<span class="token punctuation">.</span><span 
class="token function">next</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token punctuation">&#125;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span>left<span class="token punctuation">)</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="Summary-SQL-Parser-in-One-Diagram">Summary: SQL Parser in One Diagram</h2><pre class="language-MERMAID_BASE64_611" data-language="MERMAID_BASE64_611"><code class="language-MERMAID_BASE64_611">Zmxvd2NoYXJ0IFRECiAgICBzdWJncmFwaCAiTGV4ZXIiCiAgICAgICAgQVtTUUwgU3RyaW5nXSAtLT4gQltDaGFyYWN0ZXIgU3RyZWFtXQogICAgICAgIEIgLS0+IENbVG9rZW4gU3RyZWFtXQogICAgZW5kCiAgICAKICAgIHN1YmdyYXBoICJQYXJzZXIiCiAgICAgICAgQyAtLT4gRFtwYXJzZV9zdGF0ZW1lbnRdCiAgICAgICAgRCAtLT4gRXtTdGF0ZW1lbnQgVHlwZT99CiAgICAgICAgRSAtLT58U0VMRUNUfCBGW3BhcnNlX3NlbGVjdF0KICAgICAgICBFIC0tPnxJTlNFUlR8IEdbcGFyc2VfaW5zZXJ0XQogICAgICAgIEUgLS0+fFVQREFURXwgSFtwYXJzZV91cGRhdGVdCiAgICAgICAgRSAtLT58REVMRVRFfCBJW3BhcnNlX2RlbGV0ZV0KICAgICAgICBFIC0tPnxDUkVBVEV8IEpbcGFyc2VfY3JlYXRlXQogICAgICAgIAogICAgICAgIEYgLS0+IEtbcGFyc2VfZXhwcmVzc2lvbl0KICAgICAgICBLIC0tPiBMW1ByYXR0IFBhcnNpbmddCiAgICAgICAgTCAtLT4gTVtFeHByZXNzaW9uIEFTVF0KICAgIGVuZAogICAgCiAgICBzdWJncmFwaCAiQVNUIgogICAgICAgIE0gLS0+IE5bU2VsZWN0U3RhdGVtZW50XQogICAgICAgIE4gLS0+IE9bUHJvamVjdGlvbnNdCiAgICAgICAgTiAtLT4gUFtGcm9tL0pvaW5zXQogICAgICAgIE4gLS0+IFFbV2hlcmVdCiAgICAgICAgTiAtLT4gUltHcm91cCBCeS9IYXZpbmddCiAgICAgICAgTiAtLT4gU1tPcmRlciBCeS9MaW1pdF0KICAgIGVuZAogICAgCiAgICBzdHlsZSBBIGZpbGw6I2UzZjJmZCxzdHJva2U6IzE5NzZkMgogICAgc3R5bGUgQyBmaWxsOiNlOGY1ZTksc3Ryb2tlOiMzODhlM2MKICAgIHN0eWxlIE4gZmlsbDojZmZmM2UwLHN0cm9rZTojZjU3YzAw</code></pre><p><strong>Key Takeaways:</strong></p><table><thead><tr><th>Concept</th><th>Why It Matters</th></tr></thead><tbody><tr><td><strong>Lexer</strong></td><td>Tokenize SQL into 
meaningful units</td></tr><tr><td><strong>Recursive descent</strong></td><td>Top-down parsing, one function per grammar rule</td></tr><tr><td><strong>Pratt parsing</strong></td><td>Handle operator precedence correctly</td></tr><tr><td><strong>AST design</strong></td><td>Structured representation for query planning</td></tr><tr><td><strong>Error recovery</strong></td><td>Continue parsing after errors for better messages</td></tr><tr><td><strong>Box for recursion</strong></td><td>Rust needs known-size types</td></tr></tbody></table><hr /><p><strong>Further Reading:</strong></p><ul><li>“Crafting Interpreters” by Robert Nystrom (free online) - Excellent parser tutorial</li><li>“Programming Language Pragmatics” by Scott - Compiler design fundamentals</li><li>PostgreSQL Source: <a href="https://github.com/postgres/postgres/tree/master/src/backend/parser"><code>src/backend/parser/</code></a></li><li>sqlparser-rs: <a href="https://github.com/sqlparser-rs/sqlparser-rs">github.com/sqlparser-rs/sqlparser-rs</a></li><li>Vaultgres Repository: <a href="https://github.com/neoalienson/Vaultgres">github.com/neoalienson/Vaultgres</a></li></ul>]]></content>
    
    
    <summary type="html">Part 6 of the Vaultgres journey: building a SQL parser from scratch. Deep dive into lexing, recursive descent parsing, AST design for DDL/DML/queries, and expression precedence handling.</summary>
    
    
    
    <category term="Development" scheme="https://neo01.com/categories/Development/"/>
    
    
    <category term="PostgreSQL" scheme="https://neo01.com/tags/PostgreSQL/"/>
    
    <category term="Database Internals" scheme="https://neo01.com/tags/Database-Internals/"/>
    
    <category term="Rust" scheme="https://neo01.com/tags/Rust/"/>
    
  </entry>
  
  <entry>
    <title>Building a PostgreSQL-Compatible Database in Rust: Wire Protocol and Result Set Serialization</title>
    <link href="https://neo01.com/zh-CN/2026/03/Database-Rust-Wire-Protocol-Result-Set/"/>
    <id>https://neo01.com/zh-CN/2026/03/Database-Rust-Wire-Protocol-Result-Set/</id>
    <published>2026-03-04T16:00:00.000Z</published>
    <updated>2026-03-14T06:22:01.247Z</updated>
    
    <content type="html"><![CDATA[<p>In <a href="/zh-CN/2026/03/Database-Rust-WAL-Crash-Recovery-ARIES/">Part 4</a>, we built the WAL and crash recovery. Our database can now survive a power failure. But there's a problem.</p><p><strong>How do clients actually talk to our database?</strong></p><pre class="language-none"><code class="language-none">┌─────────────┐                          ┌─────────────┐│   psql      │                          │  Vaultgres  ││   client    │                          │   server    ││             │     ??? How to talk ???  │             │└─────────────┘                          └─────────────┘</code></pre><p>We could invent our own protocol. But then we would have to build every client from scratch.</p><p><strong>A better approach:</strong> speak PostgreSQL's wire protocol. Then <code>psql</code>, JDBC, libpq, and every other existing tool just work.</p><p>Today: implementing the PostgreSQL wire protocol in Rust, from the startup handshake to result set serialization.</p><hr /><h2 id="1-通信协议概述">1 Wire Protocol Overview</h2><h3 id="Frontend-Backend-模型">Frontend/Backend Model</h3><p>PostgreSQL uses a <strong>frontend/backend</strong> architecture:</p><pre class="language-none"><code class="language-none">┌─────────────────────────────────────────────────────────────┐│                    PostgreSQL Protocol                       │├─────────────────────────────────────────────────────────────┤│                                                              ││  Frontend (Client)          Backend (Server)                ││  - psql                     - Vaultgres                     ││  - libpq (C driver)         - Query processor               ││  - JDBC&#x2F;ODBC              - Storage engine                 ││  - psycopg (Python)         - Transaction manager           ││                                                              ││  Communication: TCP&#x2F;IP (usually port 5432)                  ││  Message format: Length-prefixed binary protocol            ││                                                              │└─────────────────────────────────────────────────────────────┘</code></pre><hr /><h3 id="消息结构">Message Structure</h3><p>Every message (except the initial StartupMessage) has the same format:</p><pre class="language-none"><code 
class="language-none">┌─────────────────────────────────────────────────────────────┐│ Message Format                                              │├─────────────────────────────────────────────────────────────┤│ ┌─────────────┬─────────────────────────────────────────┐   ││ │ Type (1B)   │ Length (4B, includes itself)            │   ││ ├─────────────┴─────────────────────────────────────────┤   ││ │ Payload (variable)                                     │   ││ └─────────────────────────────────────────────────────────┘   │└─────────────────────────────────────────────────────────────┘Example: SimpleQuery (&#39;Q&#39;)┌─────────────────────────────────────────────────────────────┐│ &#39;Q&#39; │ 0x00 0x00 0x00 0x18 │ &quot;SELECT * FROM users\0&quot;        ││  1B │      4B (24 bytes)   │ variable (null-terminated)     │└─────────────────────────────────────────────────────────────┘</code></pre><p><strong>Key insight:</strong> The length is <strong>big-endian</strong> (network byte order) and <strong>includes itself</strong> (but not the type byte). In the example, the 20-byte null-terminated query string plus the 4-byte length field gives 24 (0x18).</p><hr /><h3 id="消息类型">Message Types</h3><table><thead><tr><th>Type</th><th>Code</th><th>Direction</th><th>Purpose</th></tr></thead><tbody><tr><td><strong>StartupMessage</strong></td><td>(none)</td><td>F→B</td><td>Initial connection (no type byte)</td></tr><tr><td><strong>AuthenticationOk</strong></td><td>‘R’</td><td>B→F</td><td>Login succeeded</td></tr><tr><td><strong>Query</strong></td><td>‘Q’</td><td>F→B</td><td>Simple query (SQL 
string)</td></tr><tr><td><strong>RowDescription</strong></td><td>‘T’</td><td>B→F</td><td>Column metadata</td></tr><tr><td><strong>DataRow</strong></td><td>‘D’</td><td>B→F</td><td>Actual row data</td></tr><tr><td><strong>CommandComplete</strong></td><td>‘C’</td><td>B→F</td><td>Query finished</td></tr><tr><td><strong>ReadyForQuery</strong></td><td>‘Z’</td><td>B→F</td><td>Server is ready for the next query</td></tr><tr><td><strong>ErrorResponse</strong></td><td>‘E’</td><td>B→F</td><td>Something went wrong</td></tr><tr><td><strong>Parse</strong></td><td>‘P’</td><td>F→B</td><td>Extended query: prepare</td></tr><tr><td><strong>Bind</strong></td><td>‘B’</td><td>F→B</td><td>Extended query: bind parameters</td></tr><tr><td><strong>Execute</strong></td><td>‘E’</td><td>F→B</td><td>Extended query: execute</td></tr><tr><td><strong>Sync</strong></td><td>‘S’</td><td>F→B</td><td>Extended query: end of batch</td></tr></tbody></table><p>F→B = Frontend to Backend, B→F = Backend to Frontend</p><hr /><h2 id="2-连接启动">2 Connection Startup</h2><h3 id="握手流程">Handshake Flow</h3><pre class="language-MERMAID_BASE64_618" data-language="MERMAID_BASE64_618"><code class="language-MERMAID_BASE64_618">c2VxdWVuY2VEaWFncmFtCiAgICBwYXJ0aWNpcGFudCBDbGllbnQKICAgIHBhcnRpY2lwYW50IFNlcnZlcgoKICAgIENsaWVudC0+PlNlcnZlcjogU3RhcnR1cE1lc3NhZ2UgKHVzZXIsIGRhdGFiYXNlLCBvcHRpb25zKQogICAgU2VydmVyLT4+Q2xpZW50OiBBdXRoZW50aWNhdGlvbk9rCiAgICBTZXJ2ZXItPj5DbGllbnQ6IFBhcmFtZXRlclN0YXR1cyAoc2VydmVyX3ZlcnNpb24sIGVuY29kaW5nLCAuLi4pCiAgICBTZXJ2ZXItPj5DbGllbnQ6IFJlYWR5Rm9yUXVlcnkgKGlkbGUpCiAgICAKICAgIENsaWVudC0+PlNlcnZlcjogUXVlcnkgLyBFeHRlbmRlZCBRdWVyeQogICAgU2VydmVyLT4+Q2xpZW50OiBSb3dEZXNjcmlwdGlvbiAoZm9yIFNFTEVDVCkKICAgIFNlcnZlci0+PkNsaWVudDogRGF0YVJvdyDDlyBOCiAgICBTZXJ2ZXItPj5DbGllbnQ6IENvbW1hbmRDb21wbGV0ZQogICAgU2VydmVyLT4+Q2xpZW50OiBSZWFkeUZvclF1ZXJ5IChpZGxlKQ&#x3D;&#x3D;</code></pre><hr /><h3 id="StartupMessage">StartupMessage</h3><p>The first message is special: it has <strong>no type byte</strong>, only a length:</p><pre class="language-none"><code class="language-none">┌─────────────────────────────────────────────────────────────┐│ StartupMessage                                              
│
├─────────────────────────────────────────────────────────────┤
│ Length (4B): 8 + parameters                                 │
│ Protocol Version (4B): 196608 (3.0)                         │
│ Parameters (null-terminated key&#x2F;value pairs):               │
│   &quot;user\0neo\0database\0vaultgres\0\0&quot;                      │
└─────────────────────────────────────────────────────────────┘</code></pre><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wire_protocol/startup.rs</span>
<span class="token keyword">use</span> <span class="token namespace">std<span class="token punctuation">::</span>collections<span class="token punctuation">::</span></span><span class="token class-name">HashMap</span><span class="token punctuation">;</span>
<span class="token keyword">use</span> <span class="token namespace">tokio<span class="token punctuation">::</span>io<span class="token punctuation">::</span></span><span class="token punctuation">&#123;</span><span class="token class-name">AsyncReadExt</span><span class="token punctuation">,</span> <span class="token class-name">AsyncWriteExt</span><span class="token punctuation">&#125;</span><span class="token punctuation">;</span>
<span class="token keyword">use</span> <span class="token namespace">tokio<span class="token punctuation">::</span>net<span class="token punctuation">::</span></span><span class="token class-name">TcpStream</span><span class="token punctuation">;</span>

<span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">StartupMessage</span> <span class="token punctuation">&#123;</span>
    <span class="token keyword">pub</span> user<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> database<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> options<span class="token punctuation">:</span> <span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token
class-name">String</span><span class="token punctuation">,</span> <span class="token class-name">String</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">StartupMessage</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">async</span> <span class="token keyword">fn</span> <span class="token function-definition function">read_from</span><span class="token punctuation">(</span>stream<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">TcpStream</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token keyword">Self</span><span class="token punctuation">,</span> <span class="token class-name">ProtocolError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Read length (4 bytes, big-endian)</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> len_buf <span class="token operator">=</span> <span class="token punctuation">[</span><span class="token number">0u8</span><span class="token punctuation">;</span> <span class="token number">4</span><span class="token punctuation">]</span><span class="token punctuation">;</span>        stream<span class="token punctuation">.</span><span class="token function">read_exact</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> len_buf<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token 
punctuation">;</span>        <span class="token keyword">let</span> len <span class="token operator">=</span> <span class="token keyword">u32</span><span class="token punctuation">::</span><span class="token function">from_be_bytes</span><span class="token punctuation">(</span>len_buf<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token comment">// Read protocol version</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> version_buf <span class="token operator">=</span> <span class="token punctuation">[</span><span class="token number">0u8</span><span class="token punctuation">;</span> <span class="token number">4</span><span class="token punctuation">]</span><span class="token punctuation">;</span>        stream<span class="token punctuation">.</span><span class="token function">read_exact</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> version_buf<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> version <span class="token operator">=</span> <span class="token keyword">u32</span><span class="token punctuation">::</span><span class="token function">from_be_bytes</span><span class="token punctuation">(</span>version_buf<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">if</span> version <span class="token operator">!=</span> <span class="token number">196608</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">return</span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">ProtocolError</span><span class="token punctuation">::</span><span 
class="token class-name">UnsupportedVersion</span><span class="token punctuation">(</span>version<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token punctuation">&#125;</span>

        <span class="token comment">// Read parameters (null-terminated key/value string pairs)</span>
        <span class="token keyword">let</span> <span class="token keyword">mut</span> params <span class="token operator">=</span> <span class="token class-name">HashMap</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token comment">// Bytes left after the length and version fields; saturating_sub avoids underflow on a bogus length</span>
        <span class="token keyword">let</span> <span class="token keyword">mut</span> remaining <span class="token operator">=</span> len<span class="token punctuation">.</span><span class="token function">saturating_sub</span><span class="token punctuation">(</span><span class="token number">8</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token comment">// Loop down to 0 so the final terminator byte is consumed and not left on the socket</span>
        <span class="token keyword">while</span> remaining <span class="token operator">></span> <span class="token number">0</span> <span class="token punctuation">&#123;</span>
            <span class="token keyword">let</span> <span class="token keyword">mut</span> key <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
            <span class="token keyword">let</span> <span class="token keyword">mut</span> byte <span class="token operator">=</span> <span class="token punctuation">[</span><span class="token number">0u8</span><span class="token punctuation">;</span> <span class="token number">1</span><span class="token punctuation">]</span><span class="token punctuation">;</span>
      <span class="token keyword">loop</span> <span class="token punctuation">&#123;</span>                stream<span class="token punctuation">.</span><span class="token function">read_exact</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> byte<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                remaining <span class="token operator">-=</span> <span class="token number">1</span><span class="token punctuation">;</span>                <span class="token keyword">if</span> byte<span class="token punctuation">[</span><span class="token number">0</span><span class="token punctuation">]</span> <span class="token operator">==</span> <span class="token number">0</span> <span class="token punctuation">&#123;</span> <span class="token keyword">break</span><span class="token punctuation">;</span> <span class="token punctuation">&#125;</span>                key<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span>byte<span class="token punctuation">[</span><span class="token number">0</span><span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>            <span class="token keyword">if</span> key<span class="token punctuation">.</span><span class="token function">is_empty</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span> <span class="token keyword">break</span><span class="token punctuation">;</span> <span class="token punctuation">&#125;</span>  <span class="token comment">// Empty key = end of parameters</span>            <span class="token keyword">let</span> <span 
class="token keyword">mut</span> value <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token keyword">loop</span> <span class="token punctuation">&#123;</span>                stream<span class="token punctuation">.</span><span class="token function">read_exact</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> byte<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                remaining <span class="token operator">-=</span> <span class="token number">1</span><span class="token punctuation">;</span>                <span class="token keyword">if</span> byte<span class="token punctuation">[</span><span class="token number">0</span><span class="token punctuation">]</span> <span class="token operator">==</span> <span class="token number">0</span> <span class="token punctuation">&#123;</span> <span class="token keyword">break</span><span class="token punctuation">;</span> <span class="token punctuation">&#125;</span>                value<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span>byte<span class="token punctuation">[</span><span class="token number">0</span><span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>            <span class="token keyword">let</span> key <span class="token operator">=</span> <span class="token class-name">String</span><span class="token punctuation">::</span><span class="token 
function">from_utf8</span><span class="token punctuation">(</span>key<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>            <span class="token keyword">let</span> value <span class="token operator">=</span> <span class="token class-name">String</span><span class="token punctuation">::</span><span class="token function">from_utf8</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>            params<span class="token punctuation">.</span><span class="token function">insert</span><span class="token punctuation">(</span>key<span class="token punctuation">,</span> value<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>            user<span class="token punctuation">:</span> params<span class="token punctuation">.</span><span class="token function">remove</span><span class="token punctuation">(</span><span class="token string">"user"</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">unwrap_or_default</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            database<span class="token punctuation">:</span> params<span class="token punctuation">.</span><span class="token function">remove</span><span class="token punctuation">(</span><span class="token string">"database"</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">unwrap_or_default</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span 
class="token punctuation">,</span>
            options<span class="token punctuation">:</span> params<span class="token punctuation">,</span>
        <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>
    <span class="token punctuation">&#125;</span>
<span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="Authentication-和-ParameterStatus">Authentication and ParameterStatus</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wire_protocol/messages.rs</span>
<span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">MessageBuilder</span> <span class="token punctuation">&#123;</span>
    <span class="token comment">// pub(crate) so sibling modules such as row_description.rs can write into the buffer</span>
    <span class="token keyword">pub</span><span class="token punctuation">(</span><span class="token keyword">crate</span><span class="token punctuation">)</span> buffer<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">></span><span class="token punctuation">,</span>
<span class="token punctuation">&#125;</span>

<span class="token keyword">impl</span> <span class="token class-name">MessageBuilder</span> <span class="token punctuation">&#123;</span>
    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>
        <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span> buffer<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span>
    <span class="token punctuation">&#125;</span>

    <span class="token
keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">authentication_ok</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token keyword">u8</span><span class="token punctuation">]</span> <span class="token punctuation">&#123;</span>
        <span class="token comment">// 'R' (1B) + Length (4B) + Auth Type (4B = 0 for Ok)</span>
        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">clear</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token char">b'R'</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token number">8u32</span><span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>  <span class="token comment">// Length: 4 (itself) + 4 (auth code); the type byte is not counted</span>
        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span
class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token number">0u32</span><span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>   <span class="token comment">// AuthOk</span>        <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">.</span>buffer    <span class="token punctuation">&#125;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">parameter_status</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">,</span> name<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">,</span> value<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token keyword">u8</span><span class="token punctuation">]</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// 'S' (1B) + Length (4B) + name\0 + value\0</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">clear</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>      
  <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token char">b'S'</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">let</span> payload_len <span class="token operator">=</span> <span class="token number">4</span> <span class="token operator">+</span> name<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">+</span> <span class="token number">1</span> <span class="token operator">+</span> value<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token punctuation">(</span>payload_len <span class="token keyword">as</span> <span class="token keyword">u32</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span>name<span class="token 
punctuation">.</span><span class="token function">as_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span>value<span class="token punctuation">.</span><span class="token function">as_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">.</span>buffer    <span class="token punctuation">&#125;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">ready_for_query</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">,</span> status<span class="token punctuation">:</span> <span class="token 
class-name">TransactionStatus</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token keyword">u8</span><span class="token punctuation">]</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// 'Z' (1B) + Length (4B) + Status (1B)</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">clear</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token char">b'Z'</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token number">5u32</span><span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span>status <span class="token keyword">as</span> <span class="token keyword">u8</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token 
operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">.</span>buffer
    <span class="token punctuation">&#125;</span>
<span class="token punctuation">&#125;</span>

<span class="token attribute attr-name">#[derive(Debug, Clone, Copy)]</span>
<span class="token attribute attr-name">#[repr(u8)]</span>
<span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">TransactionStatus</span> <span class="token punctuation">&#123;</span>
    <span class="token class-name">Idle</span> <span class="token operator">=</span> <span class="token char">b'I'</span><span class="token punctuation">,</span>
    <span class="token class-name">InTransaction</span> <span class="token operator">=</span> <span class="token char">b'T'</span><span class="token punctuation">,</span>
    <span class="token class-name">InFailedTransaction</span> <span class="token operator">=</span> <span class="token char">b'E'</span><span class="token punctuation">,</span>
<span class="token punctuation">&#125;</span></code></pre><p><strong>The server sends these parameters:</strong></p><table><thead><tr><th>Parameter</th><th>Value</th><th>Purpose</th></tr></thead><tbody><tr><td><code>server_version</code></td><td><code>16.0</code></td><td>The PostgreSQL version we emulate</td></tr><tr><td><code>server_encoding</code></td><td><code>UTF8</code></td><td>Server-side character encoding</td></tr><tr><td><code>client_encoding</code></td><td><code>UTF8</code></td><td>The client's character encoding</td></tr><tr><td><code>integer_datetimes</code></td><td><code>on</code></td><td>Timestamps are 64-bit integers</td></tr></tbody></table><hr /><h2 id="3-简单查询协议">3 Simple Query Protocol</h2><h3 id="查询流程">Query Flow</h3><pre class="language-none"><code class="language-none">Client: Query(&quot;SELECT id, name FROM users WHERE id &#x3D; 1&quot;)
Server: RowDescription (column metadata)
Server: DataRow (row 1)
Server: DataRow (row 2)
...
Server: CommandComplete (&quot;SELECT 2&quot;)
Server: ReadyForQuery (&#39;I&#39;)</code></pre><hr /><h3
id="RowDescription：告诉客户端关于字段">RowDescription: Describing Columns to the Client</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wire_protocol/row_description.rs</span>
<span class="token comment">// Assumes messages.rs is a sibling module under wire_protocol</span>
<span class="token keyword">use</span> <span class="token namespace"><span class="token keyword">super</span><span class="token punctuation">::</span>messages<span class="token punctuation">::</span></span><span class="token class-name">MessageBuilder</span><span class="token punctuation">;</span>

<span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">FieldDescription</span> <span class="token punctuation">&#123;</span>
    <span class="token keyword">pub</span> name<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> table_oid<span class="token punctuation">:</span> <span class="token keyword">u32</span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> column_attr_num<span class="token punctuation">:</span> <span class="token keyword">i16</span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> type_oid<span class="token punctuation">:</span> <span class="token keyword">u32</span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> type_size<span class="token punctuation">:</span> <span class="token keyword">i16</span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> type_modifier<span class="token punctuation">:</span> <span class="token keyword">i32</span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> format_code<span class="token punctuation">:</span> <span class="token keyword">i16</span><span class="token punctuation">,</span>  <span class="token comment">// 0 = text, 1 = binary</span>
<span class="token punctuation">&#125;</span>

<span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">RowDescription</span> <span class="token punctuation">&#123;</span>
    <span class="token keyword">pub</span> fields<span class="token
punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">FieldDescription</span><span class="token operator">></span><span class="token punctuation">,</span>
<span class="token punctuation">&#125;</span>

<span class="token keyword">impl</span> <span class="token class-name">RowDescription</span> <span class="token punctuation">&#123;</span>
    <span class="token comment">// The returned slice borrows from `builder`, so its lifetime is tied to that argument rather than to `self`</span>
    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">serialize</span><span class="token operator">&lt;</span><span class="token lifetime-annotation symbol">'a</span><span class="token operator">></span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> builder<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token lifetime-annotation symbol">'a</span> <span class="token keyword">mut</span> <span class="token class-name">MessageBuilder</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token operator">&amp;</span><span class="token lifetime-annotation symbol">'a</span> <span class="token punctuation">[</span><span class="token keyword">u8</span><span class="token punctuation">]</span> <span class="token punctuation">&#123;</span>
        <span class="token comment">// 'T' (1B) + Length (4B) + Num Fields (2B) + Fields...</span>
        builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">clear</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
        builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token char">b'T'</span><span class="token punctuation">)</span><span class="token punctuation">;</span>

        <span class="token comment">// Calculate payload length</span>
        <span class="token keyword">let</span> payload_len <span
class="token operator">=</span> <span class="token number">4</span> <span class="token operator">+</span> <span class="token number">2</span> <span class="token operator">+</span> <span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>fields<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">*</span> <span class="token number">18</span><span class="token punctuation">)</span> <span class="token operator">+</span>             <span class="token keyword">self</span><span class="token punctuation">.</span>fields<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">map</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>f<span class="token closure-punctuation punctuation">|</span></span> f<span class="token punctuation">.</span>name<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">sum</span><span class="token punctuation">::</span><span class="token operator">&lt;</span><span class="token keyword">usize</span><span class="token operator">></span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>  <span class="token comment">// Length counts its own 4 bytes; 18 fixed bytes per field</span>                builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token 
operator">&amp;</span><span class="token punctuation">(</span>payload_len <span class="token keyword">as</span> <span class="token keyword">u32</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>fields<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">i16</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">for</span> field <span class="token keyword">in</span> <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">.</span>fields <span class="token punctuation">&#123;</span>            builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span>field<span class="token punctuation">.</span>name<span class="token punctuation">.</span><span class="token function">as_bytes</span><span class="token punctuation">(</span><span class="token 
punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span><span class="token punctuation">;</span>  <span class="token comment">// Null terminator</span>            builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>field<span class="token punctuation">.</span>table_oid<span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>field<span class="token punctuation">.</span>column_attr_num<span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>field<span class="token punctuation">.</span>type_oid<span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token 
punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>field<span class="token punctuation">.</span>type_size<span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>field<span class="token punctuation">.</span>type_modifier<span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>field<span class="token punctuation">.</span>format_code<span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token operator">&amp;</span>builder<span class="token punctuation">.</span>buffer    <span class="token punctuation">&#125;</span><span class="token 
punctuation">&#125;</span></code></pre><p><strong>Example output:</strong></p><pre class="language-none"><code class="language-none">SELECT id, name FROM usersRowDescription:┌─────────────────────────────────────────────────────────────┐│ &#39;T&#39; │ Length │ 2 fields                                     │├─────────────────────────────────────────────────────────────┤│ Field 1: &quot;id&quot;                                               ││   table_oid: 16384                                          ││   column_attr_num: 1                                        ││   type_oid: 23 (INT4)                                       ││   type_size: 4                                              ││   type_modifier: -1                                         ││   format_code: 0 (text)                                     │├─────────────────────────────────────────────────────────────┤│ Field 2: &quot;name&quot;                                             ││   table_oid: 16384                                          ││   column_attr_num: 2                                        ││   type_oid: 25 (TEXT)                                       ││   type_size: -1 (variable)                                  ││   type_modifier: -1                                         ││   format_code: 0 (text)                                     │└─────────────────────────────────────────────────────────────┘</code></pre><hr /><h3 id="DataRow：序列化实际行">DataRow: Serializing the Actual Rows</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wire_protocol/data_row.rs</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">DataRow</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> values<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token 
class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">>></span><span class="token operator">></span><span class="token punctuation">,</span>  <span class="token comment">// None = NULL</span>    <span class="token keyword">pub</span> format_codes<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">i16</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">DataRow</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">serialize</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> builder<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">MessageBuilder</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token keyword">u8</span><span class="token punctuation">]</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// 'D' (1B) + Length (4B) + Num Values (2B) + Values...</span>        builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">clear</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        builder<span class="token 
punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token char">b'D'</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// Calculate message length (the Length field counts itself)</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> payload_len <span class="token operator">=</span> <span class="token number">6u32</span><span class="token punctuation">;</span>  <span class="token comment">// Length field (4B) + num values (2B)</span>        <span class="token keyword">for</span> value <span class="token keyword">in</span> <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">.</span>values <span class="token punctuation">&#123;</span>            payload_len <span class="token operator">+=</span> <span class="token number">4</span><span class="token punctuation">;</span>  <span class="token comment">// Length prefix</span>            <span class="token keyword">if</span> <span class="token keyword">let</span> <span class="token class-name">Some</span><span class="token punctuation">(</span>data<span class="token punctuation">)</span> <span class="token operator">=</span> value <span class="token punctuation">&#123;</span>                payload_len <span class="token operator">+=</span> data<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u32</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>                builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span 
class="token punctuation">(</span><span class="token operator">&amp;</span>payload_len<span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">i16</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">for</span> value <span class="token keyword">in</span> <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">.</span>values <span class="token punctuation">&#123;</span>            <span class="token keyword">match</span> value <span class="token punctuation">&#123;</span>                <span class="token class-name">None</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                    <span class="token comment">// NULL: length = -1</span>                    builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token 
punctuation">(</span><span class="token operator">&amp;</span><span class="token punctuation">(</span><span class="token operator">-</span><span class="token number">1i32</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token punctuation">&#125;</span>                <span class="token class-name">Some</span><span class="token punctuation">(</span>data<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                    <span class="token comment">// Non-NULL: length + data</span>                    builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token punctuation">(</span>data<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">i32</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                    builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span>data<span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token punctuation">&#125;</span>  
          <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>                <span class="token operator">&amp;</span>builder<span class="token punctuation">.</span>buffer    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Example:</strong></p><pre class="language-none"><code class="language-none">Row: id&#x3D;1, name&#x3D;&quot;Alice&quot;, email&#x3D;NULLDataRow:┌─────────────────────────────────────────────────────────────┐│ &#39;D&#39; │ Length │ 3 values                                     │├─────────────────────────────────────────────────────────────┤│ Value 1: 1 byte  │ &quot;1&quot;                                      ││ Value 2: 5 bytes │ &quot;Alice&quot;                                  ││ Value 3: -1 (NULL)                                          │└─────────────────────────────────────────────────────────────┘</code></pre><hr /><h3 id="文本-vs-二进制格式">Text vs. Binary Format</h3><p><strong>Text format (format_code = 0):</strong> human-readable strings</p><pre class="language-none"><code class="language-none">INT4: &quot;42&quot;TEXT: &quot;Alice&quot;TIMESTAMP: &quot;2026-03-29 14:30:00.123456+00&quot;</code></pre><p><strong>Binary format (format_code = 1):</strong> native representation</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wire_protocol/type_encoding.rs</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">encode_int4</span><span class="token punctuation">(</span>value<span class="token punctuation">:</span> <span class="token keyword">i32</span><span class="token punctuation">,</span> format<span class="token punctuation">:</span> <span class="token keyword">i16</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token 
keyword">u8</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token keyword">match</span> format <span class="token punctuation">&#123;</span>        <span class="token number">0</span> <span class="token operator">=></span> value<span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">into_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>  <span class="token comment">// Text</span>        <span class="token number">1</span> <span class="token operator">=></span> value<span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_vec</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token comment">// Binary</span>        _ <span class="token operator">=></span> <span class="token macro property">panic!</span><span class="token punctuation">(</span><span class="token string">"Invalid format code"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">encode_text</span><span class="token punctuation">(</span>value<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">,</span> format<span class="token punctuation">:</span> <span class="token 
keyword">i16</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token keyword">match</span> format <span class="token punctuation">&#123;</span>        <span class="token number">0</span> <span class="token operator">=></span> value<span class="token punctuation">.</span><span class="token function">as_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_vec</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>       <span class="token comment">// Text (UTF-8)</span>        <span class="token number">1</span> <span class="token operator">=></span> value<span class="token punctuation">.</span><span class="token function">as_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_vec</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>       <span class="token comment">// Binary: same raw bytes; DataRow writes the 4-byte length</span>        _ <span class="token operator">=></span> <span class="token macro property">panic!</span><span class="token punctuation">(</span><span class="token string">"Invalid format code"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">encode_timestamp</span><span class="token punctuation">(</span>value<span class="token punctuation">:</span> <span class="token namespace">chrono<span class="token punctuation">::</span></span><span class="token class-name">DateTime</span><span class="token operator">&lt;</span><span class="token namespace">chrono<span class="token punctuation">::</span></span><span class="token class-name">Utc</span><span class="token operator">></span><span class="token punctuation">,</span> format<span class="token punctuation">:</span> <span class="token keyword">i16</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span 
class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token keyword">match</span> format <span class="token punctuation">&#123;</span>        <span class="token number">0</span> <span class="token operator">=></span> value<span class="token punctuation">.</span><span class="token function">format</span><span class="token punctuation">(</span><span class="token string">"%Y-%m-%d %H:%M:%S%.6f%z"</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">into_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token number">1</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>            <span class="token comment">// PostgreSQL epoch: 2000-01-01 00:00:00 UTC</span>            <span class="token keyword">let</span> epoch <span class="token operator">=</span> <span class="token namespace">chrono<span class="token punctuation">::</span></span><span class="token class-name">DateTime</span><span class="token punctuation">::</span><span class="token function">from_timestamp</span><span class="token punctuation">(</span><span class="token number">946684800</span><span class="token punctuation">,</span> <span class="token number">0</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">unwrap</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token keyword">let</span> micros <span class="token operator">=</span> value<span class="token punctuation">.</span><span class="token function">signed_duration_since</span><span 
class="token punctuation">(</span>epoch<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">num_microseconds</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">unwrap</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            micros<span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_vec</span><span class="token punctuation">(</span><span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span>        _ <span class="token operator">=></span> <span class="token macro property">panic!</span><span class="token punctuation">(</span><span class="token string">"Invalid format code"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="4-扩展查询协议">4 Extended Query Protocol</h2><h3 id="为什么需要扩展查询？">Why Extended Query?</h3><p><strong>Simple query:</strong> SQL injection risk when values are concatenated into the query string; no prepared statements</p><pre class="language-none"><code class="language-none">Client: Query(&quot;SELECT * FROM users WHERE id &#x3D; &quot; + user_input)→ SQL injection vulnerability!</code></pre><p><strong>Extended query:</strong> prepared statements with parameter binding</p><pre class="language-none"><code class="language-none">Client: Parse(&quot;SELECT * FROM users WHERE id &#x3D; $1&quot;)Client: Bind([42])Client: Execute()→ Safe from SQL injection!</code></pre><hr /><h3 id="扩展查询流程">Extended Query Flow</h3><pre class="language-MERMAID_BASE64_619" data-language="MERMAID_BASE64_619"><code 
class="language-MERMAID_BASE64_619">c2VxdWVuY2VEaWFncmFtCiAgICBwYXJ0aWNpcGFudCBDbGllbnQKICAgIHBhcnRpY2lwYW50IFNlcnZlcgoKICAgIENsaWVudC0+PlNlcnZlcjogUGFyc2UgKFNRTCwgcGFyYW1ldGVyIHR5cGVzKQogICAgU2VydmVyLT4+Q2xpZW50OiBQYXJzZUNvbXBsZXRlCgogICAgQ2xpZW50LT4+U2VydmVyOiBCaW5kIChwYXJhbWV0ZXIgdmFsdWVzKQogICAgU2VydmVyLT4+Q2xpZW50OiBCaW5kQ29tcGxldGUKCiAgICBsb29wIE11bHRpcGxlIGV4ZWN1dGlvbnMKICAgICAgICBDbGllbnQtPj5TZXJ2ZXI6IEV4ZWN1dGUgKG1heF9yb3dzKQogICAgICAgIFNlcnZlci0+PkNsaWVudDogRGF0YVJvdyDDlyBOCiAgICBlbmQKCiAgICBDbGllbnQtPj5TZXJ2ZXI6IFN5bmMKICAgIFNlcnZlci0+PkNsaWVudDogQ29tbWFuZENvbXBsZXRlCiAgICBTZXJ2ZXItPj5DbGllbnQ6IFJlYWR5Rm9yUXVlcnk&#x3D;</code></pre><hr /><h3 id="Parse：准备语句">Parse: Preparing a Statement</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wire_protocol/parse.rs</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">ParseMessage</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> statement_name<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> query<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> parameter_types<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u32</span><span class="token operator">></span><span class="token punctuation">,</span>  <span class="token comment">// OID for each parameter</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">ParseMessage</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token 
keyword">async</span> <span class="token keyword">fn</span> <span class="token function-definition function">read_from</span><span class="token punctuation">(</span>stream<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">TcpStream</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token keyword">Self</span><span class="token punctuation">,</span> <span class="token class-name">ProtocolError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// statement_name (null-terminated)</span>        <span class="token keyword">let</span> statement_name <span class="token operator">=</span> <span class="token function">read_null_terminated</span><span class="token punctuation">(</span>stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// query (null-terminated)</span>        <span class="token keyword">let</span> query <span class="token operator">=</span> <span class="token function">read_null_terminated</span><span class="token punctuation">(</span>stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// num_parameter_types (2B)</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> num_types_buf <span class="token operator">=</span> <span class="token punctuation">[</span><span class="token number">0u8</span><span class="token punctuation">;</span> 
<span class="token number">2</span><span class="token punctuation">]</span><span class="token punctuation">;</span>        stream<span class="token punctuation">.</span><span class="token function">read_exact</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> num_types_buf<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> num_types <span class="token operator">=</span> <span class="token keyword">i16</span><span class="token punctuation">::</span><span class="token function">from_be_bytes</span><span class="token punctuation">(</span>num_types_buf<span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// parameter_types (4B each)</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> parameter_types <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> _ <span class="token keyword">in</span> <span class="token number">0</span><span class="token punctuation">..</span>num_types <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> <span class="token keyword">mut</span> type_buf <span class="token operator">=</span> <span class="token punctuation">[</span><span class="token number">0u8</span><span class="token punctuation">;</span> <span class="token number">4</span><span class="token punctuation">]</span><span class="token punctuation">;</span>            stream<span class="token 
punctuation">.</span><span class="token function">read_exact</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> type_buf<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>            parameter_types<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token keyword">u32</span><span class="token punctuation">::</span><span class="token function">from_be_bytes</span><span class="token punctuation">(</span>type_buf<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>            statement_name<span class="token punctuation">,</span>            query<span class="token punctuation">,</span>            parameter_types<span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span><span class="token comment">// Server response</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">parse_complete</span><span class="token punctuation">(</span>builder<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">MessageBuilder</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token operator">&amp;</span><span 
class="token punctuation">[</span><span class="token keyword">u8</span><span class="token punctuation">]</span> <span class="token punctuation">&#123;</span>    <span class="token comment">// '1' (1B) + Length (4B = 4)</span>    builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">clear</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token char">b'1'</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token number">4u32</span><span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token operator">&amp;</span>builder<span class="token punctuation">.</span>buffer<span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="Bind：创建-Portal">Bind: Creating a Portal</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wire_protocol/bind.rs</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">BindMessage</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> portal_name<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token 
punctuation">,</span>    <span class="token keyword">pub</span> statement_name<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> parameter_format_codes<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">i16</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> parameter_values<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">>></span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> result_format_codes<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">i16</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">BindMessage</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">async</span> <span class="token keyword">fn</span> <span class="token function-definition function">read_from</span><span class="token punctuation">(</span>stream<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">TcpStream</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token 
class-name">Result</span><span class="token operator">&lt;</span><span class="token keyword">Self</span><span class="token punctuation">,</span> <span class="token class-name">ProtocolError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// portal_name (null-terminated)</span>        <span class="token keyword">let</span> portal_name <span class="token operator">=</span> <span class="token function">read_null_terminated</span><span class="token punctuation">(</span>stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// statement_name (null-terminated)</span>        <span class="token keyword">let</span> statement_name <span class="token operator">=</span> <span class="token function">read_null_terminated</span><span class="token punctuation">(</span>stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// num_parameter_format_codes (2B)</span>        <span class="token keyword">let</span> num_formats <span class="token operator">=</span> <span class="token function">read_i16</span><span class="token punctuation">(</span>stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// parameter_format_codes</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> parameter_format_codes <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token 
punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> _ <span class="token keyword">in</span> <span class="token number">0</span><span class="token punctuation">..</span>num_formats <span class="token punctuation">&#123;</span>            parameter_format_codes<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token function">read_i16</span><span class="token punctuation">(</span>stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token comment">// num_parameter_values (2B)</span>        <span class="token keyword">let</span> num_values <span class="token operator">=</span> <span class="token function">read_i16</span><span class="token punctuation">(</span>stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// parameter_values</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> parameter_values <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> _ <span class="token keyword">in</span> <span class="token number">0</span><span class="token 
punctuation">..</span>num_values <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> len <span class="token operator">=</span> <span class="token function">read_i32</span><span class="token punctuation">(</span>stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>            <span class="token keyword">if</span> len <span class="token operator">==</span> <span class="token operator">-</span><span class="token number">1</span> <span class="token punctuation">&#123;</span>                parameter_values<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token class-name">None</span><span class="token punctuation">)</span><span class="token punctuation">;</span>  <span class="token comment">// NULL</span>            <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">let</span> <span class="token keyword">mut</span> data <span class="token operator">=</span> <span class="token macro property">vec!</span><span class="token punctuation">[</span><span class="token number">0u8</span><span class="token punctuation">;</span> len <span class="token keyword">as</span> <span class="token keyword">usize</span><span class="token punctuation">]</span><span class="token punctuation">;</span>                stream<span class="token punctuation">.</span><span class="token function">read_exact</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> data<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span 
class="token punctuation">;</span>                parameter_values<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span>data<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>                <span class="token comment">// num_result_format_codes (2B)</span>        <span class="token keyword">let</span> num_result_formats <span class="token operator">=</span> <span class="token function">read_i16</span><span class="token punctuation">(</span>stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// result_format_codes</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> result_format_codes <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> _ <span class="token keyword">in</span> <span class="token number">0</span><span class="token punctuation">..</span>num_result_formats <span class="token punctuation">&#123;</span>            result_format_codes<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token function">read_i16</span><span class="token punctuation">(</span>stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token 
keyword">await</span><span class="token operator">?</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>            portal_name<span class="token punctuation">,</span>            statement_name<span class="token punctuation">,</span>            parameter_format_codes<span class="token punctuation">,</span>            parameter_values<span class="token punctuation">,</span>            result_format_codes<span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span><span class="token comment">// Server response</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">bind_complete</span><span class="token punctuation">(</span>builder<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">MessageBuilder</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token keyword">u8</span><span class="token punctuation">]</span> <span class="token punctuation">&#123;</span>    <span class="token comment">// '2' (1B) + Length (4B = 4)</span>    builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">clear</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    builder<span class="token punctuation">.</span>buffer<span 
class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token char">b'2'</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token number">4u32</span><span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token operator">&amp;</span>builder<span class="token punctuation">.</span>buffer<span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="Execute：执行预备语句">Execute: Running the Prepared Statement</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wire_protocol/execute.rs</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">ExecuteMessage</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> portal_name<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> max_rows<span class="token punctuation">:</span> <span class="token keyword">i32</span><span class="token punctuation">,</span>  <span class="token comment">// 0 = all rows</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">ExecuteMessage</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">async</span> <span class="token keyword">fn</span> <span 
class="token function-definition function">read_from</span><span class="token punctuation">(</span>stream<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">TcpStream</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token keyword">Self</span><span class="token punctuation">,</span> <span class="token class-name">ProtocolError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> portal_name <span class="token operator">=</span> <span class="token function">read_null_terminated</span><span class="token punctuation">(</span>stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> max_rows <span class="token operator">=</span> <span class="token function">read_i32</span><span class="token punctuation">(</span>stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token keyword">Self</span> <span class="token punctuation">&#123;</span> portal_name<span class="token punctuation">,</span> max_rows <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Server response:</strong> zero or more DataRow messages. There is no dedicated “ExecuteComplete” message; the result set is terminated by CommandComplete, which this implementation emits when it handles Sync.</p><hr /><h3 id="Sync：完成批次">Sync: Completing the Batch</h3><pre 
class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wire_protocol/sync.rs</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">SyncMessage</span><span class="token punctuation">;</span><span class="token keyword">impl</span> <span class="token class-name">SyncMessage</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">async</span> <span class="token keyword">fn</span> <span class="token function-definition function">read_from</span><span class="token punctuation">(</span>_stream<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">TcpStream</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token keyword">Self</span><span class="token punctuation">,</span> <span class="token class-name">ProtocolError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Sync has no body, just the message header</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">SyncMessage</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span><span class="token comment">// Server response</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">sync_complete</span><span class="token punctuation">(</span>builder<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token 
class-name">MessageBuilder</span><span class="token punctuation">,</span> status<span class="token punctuation">:</span> <span class="token class-name">TransactionStatus</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token keyword">u8</span><span class="token punctuation">]</span> <span class="token punctuation">&#123;</span>    <span class="token comment">// CommandComplete only; the caller must still send ReadyForQuery (status is unused here)</span>    builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">clear</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token comment">// CommandComplete: 'C' + Length (4B, counts itself) + "SELECT 2\0"</span>    builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token char">b'C'</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token keyword">let</span> cmd <span class="token operator">=</span> <span class="token string">b"SELECT 2"</span><span class="token punctuation">;</span>    builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token punctuation">(</span><span class="token punctuation">(</span>cmd<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">+</span> <span class="token number">4</span> <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token 
keyword">u32</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span>cmd<span class="token punctuation">)</span><span class="token punctuation">;</span>    builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token operator">&amp;</span>builder<span class="token punctuation">.</span>buffer<span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="5-完整的查询执行流程">5 The Complete Query Execution Flow</h2><h3 id="整合在一起">Putting It All Together</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wire_protocol/handler.rs</span><span class="token keyword">use</span> <span class="token namespace">std<span class="token punctuation">::</span>collections<span class="token punctuation">::</span></span><span class="token class-name">HashMap</span><span class="token punctuation">;</span><span class="token keyword">use</span> <span class="token namespace">tokio<span class="token punctuation">::</span>io<span class="token punctuation">::</span></span><span class="token punctuation">&#123;</span><span class="token class-name">AsyncReadExt</span><span class="token punctuation">,</span> <span class="token class-name">AsyncWriteExt</span><span class="token punctuation">&#125;</span><span class="token punctuation">;</span><span class="token keyword">use</span> <span class="token namespace">tokio<span class="token punctuation">::</span>net<span class="token punctuation">::</span></span><span class="token class-name">TcpStream</span><span class="token punctuation">;</span><span class="token keyword">use</span> <span class="token keyword">crate</span><span class="token module-declaration namespace"><span class="token punctuation">::</span>query_executor<span class="token punctuation">::</span></span><span class="token class-name">QueryExecutor</span><span class="token punctuation">;</span><span class="token keyword">use</span> <span class="token keyword">crate</span><span class="token module-declaration namespace"><span class="token punctuation">::</span>storage<span 
class="token punctuation">::</span>buffer_pool<span class="token punctuation">::</span></span><span class="token class-name">BufferPool</span><span class="token punctuation">;</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">ProtocolHandler</span> <span class="token punctuation">&#123;</span>    stream<span class="token punctuation">:</span> <span class="token class-name">TcpStream</span><span class="token punctuation">,</span>    executor<span class="token punctuation">:</span> <span class="token class-name">QueryExecutor</span><span class="token punctuation">,</span>    builder<span class="token punctuation">:</span> <span class="token class-name">MessageBuilder</span><span class="token punctuation">,</span>    prepared_statements<span class="token punctuation">:</span> <span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token punctuation">,</span> <span class="token class-name">PreparedStatement</span><span class="token operator">></span><span class="token punctuation">,</span>    portals<span class="token punctuation">:</span> <span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token punctuation">,</span> <span class="token class-name">Portal</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">ProtocolHandler</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">async</span> <span class="token keyword">fn</span> <span class="token function-definition function">handle_connection</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">,</span> <span class="token keyword">mut</span> stream<span class="token 
punctuation">:</span> <span class="token class-name">TcpStream</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">ProtocolError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// 1. Read startup message</span>        <span class="token keyword">let</span> startup <span class="token operator">=</span> <span class="token class-name">StartupMessage</span><span class="token punctuation">::</span><span class="token function">read_from</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// 2. Send authentication</span>        stream<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>builder<span class="token punctuation">.</span><span class="token function">authentication_ok</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// 3. 
Send parameter status</span>        stream<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>builder<span class="token punctuation">.</span><span class="token function">parameter_status</span><span class="token punctuation">(</span><span class="token string">"server_version"</span><span class="token punctuation">,</span> <span class="token string">"16.0"</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>        stream<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>builder<span class="token punctuation">.</span><span class="token function">parameter_status</span><span class="token punctuation">(</span><span class="token string">"server_encoding"</span><span class="token punctuation">,</span> <span class="token string">"UTF8"</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>        stream<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>builder<span class="token punctuation">.</span><span class="token function">parameter_status</span><span class="token punctuation">(</span><span class="token string">"client_encoding"</span><span class="token punctuation">,</span> <span class="token string">"UTF8"</span><span class="token 
punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// 4. Send ready for query</span>        stream<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>builder<span class="token punctuation">.</span><span class="token function">ready_for_query</span><span class="token punctuation">(</span><span class="token class-name">TransactionStatus</span><span class="token punctuation">::</span><span class="token class-name">Idle</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// 5. 
Main message loop</span>        <span class="token keyword">loop</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> <span class="token keyword">mut</span> type_buf <span class="token operator">=</span> <span class="token punctuation">[</span><span class="token number">0u8</span><span class="token punctuation">;</span> <span class="token number">1</span><span class="token punctuation">]</span><span class="token punctuation">;</span>            stream<span class="token punctuation">.</span><span class="token function">read_exact</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> type_buf<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                        <span class="token keyword">match</span> type_buf<span class="token punctuation">[</span><span class="token number">0</span><span class="token punctuation">]</span> <span class="token keyword">as</span> <span class="token keyword">char</span> <span class="token punctuation">&#123;</span>                <span class="token char">'Q'</span> <span class="token operator">=></span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">handle_simple_query</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">,</span>                <span class="token char">'P'</span> <span class="token operator">=></span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">handle_parse</span><span 
class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">,</span>                <span class="token char">'B'</span> <span class="token operator">=></span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">handle_bind</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">,</span>                <span class="token char">'E'</span> <span class="token operator">=></span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">handle_execute</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">,</span>                <span class="token char">'S'</span> <span class="token operator">=></span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">handle_sync</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">,</span>                <span class="token char">'X'</span> <span 
class="token operator">=></span> <span class="token punctuation">&#123;</span>                    <span class="token comment">// Terminate</span>                    <span class="token keyword">return</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token punctuation">&#125;</span>                _ <span class="token operator">=></span> <span class="token keyword">return</span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">ProtocolError</span><span class="token punctuation">::</span><span class="token class-name">UnknownMessage</span><span class="token punctuation">(</span>type_buf<span class="token punctuation">[</span><span class="token number">0</span><span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">async</span> <span class="token keyword">fn</span> <span class="token function-definition function">handle_simple_query</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">,</span> stream<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">TcpStream</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token 
punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">ProtocolError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Read query string</span>        <span class="token keyword">let</span> query <span class="token operator">=</span> <span class="token function">read_null_terminated</span><span class="token punctuation">(</span>stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// Execute query</span>        <span class="token keyword">let</span> result <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>executor<span class="token punctuation">.</span><span class="token function">execute</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>query<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// Send RowDescription (if SELECT)</span>        <span class="token keyword">if</span> <span class="token keyword">let</span> <span class="token class-name">Some</span><span class="token punctuation">(</span>columns<span class="token punctuation">)</span> <span class="token operator">=</span> result<span class="token punctuation">.</span>columns <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> row_desc <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">create_row_description</span><span class="token 
punctuation">(</span><span class="token operator">&amp;</span>columns<span class="token punctuation">)</span><span class="token punctuation">;</span>            stream<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span>row_desc<span class="token punctuation">.</span><span class="token function">serialize</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">.</span>builder<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                        <span class="token comment">// Send DataRows</span>            <span class="token keyword">for</span> row <span class="token keyword">in</span> result<span class="token punctuation">.</span>rows <span class="token punctuation">&#123;</span>                <span class="token keyword">let</span> data_row <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">create_data_row</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>row<span class="token punctuation">)</span><span class="token punctuation">;</span>                stream<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span>data_row<span class="token punctuation">.</span><span class="token function">serialize</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">.</span>builder<span class="token punctuation">)</span><span class="token 
punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>                <span class="token comment">// Send CommandComplete</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>builder<span class="token punctuation">.</span><span class="token function">command_complete</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>result<span class="token punctuation">.</span>command_tag<span class="token punctuation">)</span><span class="token punctuation">;</span>        stream<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">.</span>builder<span class="token punctuation">.</span>buffer<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// Send ReadyForQuery</span>        stream<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>builder<span class="token punctuation">.</span><span class="token function">ready_for_query</span><span class="token punctuation">(</span><span class="token class-name">TransactionStatus</span><span class="token punctuation">::</span><span class="token class-name">Idle</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token 
keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="结果集序列化范例">Result Set Serialization Example</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wire_protocol/result_set.rs</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">ResultSet</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> columns<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Column</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> rows<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Row</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> command_tag<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Column</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> name<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> 
type_oid<span class="token punctuation">:</span> <span class="token keyword">u32</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> type_size<span class="token punctuation">:</span> <span class="token keyword">i16</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Row</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> values<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token operator">>></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">ResultSet</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">send_to</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> stream<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">TcpStream</span><span class="token punctuation">,</span> builder<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">MessageBuilder</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span 
class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token namespace">io<span class="token punctuation">::</span></span><span class="token class-name">Error</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// RowDescription</span>        <span class="token keyword">let</span> fields<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">FieldDescription</span><span class="token operator">></span> <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>columns<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">map</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>col<span class="token closure-punctuation punctuation">|</span></span> <span class="token punctuation">&#123;</span>            <span class="token class-name">FieldDescription</span> <span class="token punctuation">&#123;</span>                name<span class="token punctuation">:</span> col<span class="token punctuation">.</span>name<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                table_oid<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>                column_attr_num<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>                type_oid<span class="token punctuation">:</span> 
col<span class="token punctuation">.</span>type_oid<span class="token punctuation">,</span>                type_size<span class="token punctuation">:</span> col<span class="token punctuation">.</span>type_size<span class="token punctuation">,</span>                type_modifier<span class="token punctuation">:</span> <span class="token operator">-</span><span class="token number">1</span><span class="token punctuation">,</span>                format_code<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>  <span class="token comment">// Text format</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">let</span> row_desc <span class="token operator">=</span> <span class="token class-name">RowDescription</span> <span class="token punctuation">&#123;</span> fields <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        stream<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span>row_desc<span class="token punctuation">.</span><span class="token function">serialize</span><span class="token punctuation">(</span>builder<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// DataRows</span>        <span class="token keyword">for</span> row <span class="token keyword">in</span> <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">.</span>rows <span 
class="token punctuation">&#123;</span>            <span class="token keyword">let</span> values<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">>></span><span class="token operator">></span> <span class="token operator">=</span> row<span class="token punctuation">.</span>values<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span>                <span class="token punctuation">.</span><span class="token function">map</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>v<span class="token closure-punctuation punctuation">|</span></span> v<span class="token punctuation">.</span><span class="token function">as_ref</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">map</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>s<span class="token closure-punctuation punctuation">|</span></span> s<span class="token punctuation">.</span><span class="token function">as_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_vec</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">)</span>                <span class="token punctuation">.</span><span class="token function">collect</span><span 
class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                        <span class="token keyword">let</span> data_row <span class="token operator">=</span> <span class="token class-name">DataRow</span> <span class="token punctuation">&#123;</span>                values<span class="token punctuation">,</span>                format_codes<span class="token punctuation">:</span> <span class="token macro property">vec!</span><span class="token punctuation">[</span><span class="token number">0</span><span class="token punctuation">;</span> <span class="token keyword">self</span><span class="token punctuation">.</span>columns<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">]</span><span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>            stream<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span>data_row<span class="token punctuation">.</span><span class="token function">serialize</span><span class="token punctuation">(</span>builder<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token comment">// CommandComplete</span>        builder<span class="token punctuation">.</span><span class="token function">command_complete</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">.</span>command_tag<span class="token punctuation">)</span><span class="token punctuation">;</span>        stream<span class="token 
punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>builder<span class="token punctuation">.</span>buffer<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span><span class="token comment">// Usage example</span><span class="token keyword">let</span> result <span class="token operator">=</span> <span class="token class-name">ResultSet</span> <span class="token punctuation">&#123;</span>    columns<span class="token punctuation">:</span> <span class="token macro property">vec!</span><span class="token punctuation">[</span>        <span class="token class-name">Column</span> <span class="token punctuation">&#123;</span> name<span class="token punctuation">:</span> <span class="token string">"id"</span><span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> type_oid<span class="token punctuation">:</span> <span class="token number">23</span><span class="token punctuation">,</span> type_size<span class="token punctuation">:</span> <span class="token number">4</span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token class-name">Column</span> <span class="token punctuation">&#123;</span> name<span class="token punctuation">:</span> <span class="token string">"name"</span><span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span 
class="token punctuation">)</span><span class="token punctuation">,</span> type_oid<span class="token punctuation">:</span> <span class="token number">25</span><span class="token punctuation">,</span> type_size<span class="token punctuation">:</span> <span class="token operator">-</span><span class="token number">1</span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>    <span class="token punctuation">]</span><span class="token punctuation">,</span>    rows<span class="token punctuation">:</span> <span class="token macro property">vec!</span><span class="token punctuation">[</span>        <span class="token class-name">Row</span> <span class="token punctuation">&#123;</span> values<span class="token punctuation">:</span> <span class="token macro property">vec!</span><span class="token punctuation">[</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token string">"1"</span><span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token string">"Alice"</span><span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">]</span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token class-name">Row</span> <span class="token punctuation">&#123;</span> values<span class="token punctuation">:</span> <span class="token macro property">vec!</span><span class="token punctuation">[</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span 
class="token string">"2"</span><span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token string">"Bob"</span><span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">]</span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>    <span class="token punctuation">]</span><span class="token punctuation">,</span>    command_tag<span class="token punctuation">:</span> <span class="token string">"SELECT 2"</span><span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token punctuation">;</span>result<span class="token punctuation">.</span><span class="token function">send_to</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> stream<span class="token punctuation">,</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> builder<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span></code></pre><p><strong>What psql receives:</strong></p><pre class="language-none"><code class="language-none">┌─────────────────────────────────────────────────────────────┐
│ T (RowDescription)                                          │
│   2 columns: id (INT4), name (TEXT)                         │
├─────────────────────────────────────────────────────────────┤
│ D (DataRow)                                                 │
│   id&#x3D;1, name&#x3D;&quot;Alice&quot;                                        │
├─────────────────────────────────────────────────────────────┤
│ D (DataRow)                                                 │
│   id&#x3D;2, name&#x3D;&quot;Bob&quot;                                          │
├─────────────────────────────────────────────────────────────┤
│ C (CommandComplete)                                         │
│   &quot;SELECT 2&quot;                                                │
├─────────────────────────────────────────────────────────────┤
│ Z (ReadyForQuery)                                           │
│   Status: Idle                                              │
└─────────────────────────────────────────────────────────────┘</code></pre><hr /><h2 id="6-PostgreSQL-类型-OID">6 PostgreSQL Type OIDs</h2><h3 id="常见类型">Common Types</h3><table><thead><tr><th>Type Name</th><th>OID</th><th>Size (bytes)</th><th>Description</th></tr></thead><tbody><tr><td><code>BOOL</code></td><td>16</td><td>1</td><td>Boolean</td></tr><tr><td><code>INT2</code> (SMALLINT)</td><td>21</td><td>2</td><td>2-byte integer</td></tr><tr><td><code>INT4</code> (INTEGER)</td><td>23</td><td>4</td><td>4-byte integer</td></tr><tr><td><code>INT8</code> (BIGINT)</td><td>20</td><td>8</td><td>8-byte integer</td></tr><tr><td><code>TEXT</code></td><td>25</td><td>-1</td><td>Variable-length text</td></tr><tr><td><code>VARCHAR</code></td><td>1043</td><td>-1</td><td>Variable-length character string</td></tr><tr><td><code>TIMESTAMP</code></td><td>1114</td><td>8</td><td>Timestamp without time zone</td></tr><tr><td><code>TIMESTAMPTZ</code></td><td>1184</td><td>8</td><td>Timestamp with time zone</td></tr><tr><td><code>FLOAT4</code> (REAL)</td><td>700</td><td>4</td><td>4-byte floating point</td></tr><tr><td><code>FLOAT8</code> (DOUBLE PRECISION)</td><td>701</td><td>8</td><td>8-byte floating point</td></tr><tr><td><code>NUMERIC</code></td><td>1700</td><td>-1</td><td>Arbitrary-precision number</td></tr><tr><td><code>BYTEA</code></td><td>17</td><td>-1</td><td>Binary data</td></tr><tr><td><code>OID</code></td><td>26</td><td>4</td><td>Object identifier</td></tr></tbody></table><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wire_protocol/oids.rs</span><span class="token keyword">pub</span> <span class="token keyword">mod</span> <span class="token module-declaration namespace">oid</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">BOOL</span><span class="token punctuation">:</span> <span class="token keyword">u32</span> <span class="token operator">=</span> <span class="token number">16</span><span class="token punctuation">;</span>    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">INT2</span><span class="token punctuation">:</span> <span class="token keyword">u32</span> <span class="token operator">=</span> <span class="token number">21</span><span class="token punctuation">;</span>    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">INT4</span><span class="token punctuation">:</span> <span class="token keyword">u32</span> <span class="token operator">=</span> <span class="token number">23</span><span class="token punctuation">;</span>    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">INT8</span><span class="token punctuation">:</span> <span class="token keyword">u32</span> <span class="token operator">=</span> <span class="token number">20</span><span class="token punctuation">;</span>    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">TEXT</span><span class="token 
punctuation">:</span> <span class="token keyword">u32</span> <span class="token operator">=</span> <span class="token number">25</span><span class="token punctuation">;</span>    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">VARCHAR</span><span class="token punctuation">:</span> <span class="token keyword">u32</span> <span class="token operator">=</span> <span class="token number">1043</span><span class="token punctuation">;</span>    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">TIMESTAMP</span><span class="token punctuation">:</span> <span class="token keyword">u32</span> <span class="token operator">=</span> <span class="token number">1114</span><span class="token punctuation">;</span>    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">TIMESTAMPTZ</span><span class="token punctuation">:</span> <span class="token keyword">u32</span> <span class="token operator">=</span> <span class="token number">1184</span><span class="token punctuation">;</span>    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">FLOAT4</span><span class="token punctuation">:</span> <span class="token keyword">u32</span> <span class="token operator">=</span> <span class="token number">700</span><span class="token punctuation">;</span>    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">FLOAT8</span><span class="token punctuation">:</span> <span class="token keyword">u32</span> <span class="token operator">=</span> <span class="token number">701</span><span class="token punctuation">;</span>    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">NUMERIC</span><span class="token punctuation">:</span> <span class="token keyword">u32</span> 
<span class="token operator">=</span> <span class="token number">1700</span><span class="token punctuation">;</span>    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">BYTEA</span><span class="token punctuation">:</span> <span class="token keyword">u32</span> <span class="token operator">=</span> <span class="token number">17</span><span class="token punctuation">;</span>    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">OID</span><span class="token punctuation">:</span> <span class="token keyword">u32</span> <span class="token operator">=</span> <span class="token number">26</span><span class="token punctuation">;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="7-用-Rust-构建的挑战">7 用 Rust 构建的挑战</h2><h3 id="挑战-1：异步-I-O-和借用">挑战 1：异步 I/O 和借用</h3><p><strong>问题：</strong> tokio 需要 <code>&amp;mut self</code> 进行异步 I/O，但我们需要从 self 借用。</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ Doesn't compile</span><span class="token keyword">impl</span> <span class="token class-name">ProtocolHandler</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">async</span> <span class="token keyword">fn</span> <span class="token function-definition function">handle_query</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Error</span><span class="token operator">></span> <span class="token 
punctuation">&#123;</span>        <span class="token keyword">let</span> query <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">read_query</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>  <span class="token comment">// Borrows self</span>        <span class="token keyword">let</span> result <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>executor<span class="token punctuation">.</span><span class="token function">execute</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>query<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>  <span class="token comment">// Also borrows self!</span>        <span class="token comment">// Error: cannot borrow as mutable more than once</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><p><strong>解决方案：重构以避免同时借用</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ✅ Works</span><span class="token keyword">impl</span> <span class="token class-name">ProtocolHandler</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">async</span> <span class="token keyword">fn</span> <span class="token function-definition function">handle_query</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token 
punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Error</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> query <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">read_query</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// Release borrow before next operation</span>        <span class="token keyword">let</span> result <span class="token operator">=</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">self</span><span class="token punctuation">.</span>executor<span class="token punctuation">.</span><span class="token function">execute</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>query<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">send_result</span><span class="token punctuation">(</span>result<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span 
class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="挑战-2：零拷贝-vs-分配">挑战 2：零拷贝 vs. 分配</h3><p><strong>问题：</strong> 通信协议消息需要序列化。拷贝很昂贵。</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ Allocates on every message</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">serialize_row</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token keyword">let</span> <span class="token keyword">mut</span> buffer <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    buffer<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token char">b'D'</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// ... 
lots of allocations ...</span>    buffer<span class="token punctuation">&#125;</span></code></pre><p><strong>解决方案：重用缓冲区</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ✅ Reuses allocated buffer</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">MessageBuilder</span> <span class="token punctuation">&#123;</span>    buffer<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">></span><span class="token punctuation">,</span>  <span class="token comment">// Pre-allocated, reused</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">MessageBuilder</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">with_capacity</span><span class="token punctuation">(</span>capacity<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>            buffer<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">with_capacity</span><span class="token punctuation">(</span>capacity<span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">pub</span> <span class="token 
keyword">fn</span> <span class="token function-definition function">data_row</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">,</span> row<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Row</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token keyword">u8</span><span class="token punctuation">]</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">clear</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>  <span class="token comment">// Reuse capacity</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token char">b'D'</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token comment">// ... 
write to buffer ...</span>        <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">.</span>buffer  <span class="token comment">// Return reference, not owned</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="挑战-3：跨层错误处理">挑战 3：跨层错误处理</h3><p><strong>问题：</strong> 通信协议错误、查询错误、存储错误——所有不同的类型。</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ Error type explosion</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">Error</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">Io</span><span class="token punctuation">(</span><span class="token namespace">io<span class="token punctuation">::</span></span><span class="token class-name">Error</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">Protocol</span><span class="token punctuation">(</span><span class="token class-name">ProtocolError</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">Query</span><span class="token punctuation">(</span><span class="token class-name">QueryError</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">Storage</span><span class="token punctuation">(</span><span class="token class-name">StorageError</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">Transaction</span><span class="token punctuation">(</span><span class="token class-name">TransactionError</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token comment">// ... 
20 more variants ...</span><span class="token punctuation">&#125;</span></code></pre><p><strong>解决方案：使用 <code>thiserror</code> 和转换特性</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ✅ Clean error handling</span><span class="token attribute attr-name">#[derive(Debug, thiserror::Error)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">ProtocolError</span> <span class="token punctuation">&#123;</span>    <span class="token attribute attr-name">#[error(<span class="token string">"IO error: &#123;0&#125;"</span>)]</span>    <span class="token class-name">Io</span><span class="token punctuation">(</span><span class="token attribute attr-name">#[from]</span> <span class="token namespace">io<span class="token punctuation">::</span></span><span class="token class-name">Error</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token attribute attr-name">#[error(<span class="token string">"Invalid message type: &#123;0&#125;"</span>)]</span>    <span class="token class-name">UnknownMessage</span><span class="token punctuation">(</span><span class="token keyword">u8</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token attribute attr-name">#[error(<span class="token string">"Query error: &#123;0&#125;"</span>)]</span>    <span class="token class-name">Query</span><span class="token punctuation">(</span><span class="token attribute attr-name">#[from]</span> <span class="token class-name">QueryError</span><span class="token punctuation">)</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token comment">// Use ? 
operator for automatic conversion</span><span class="token keyword">pub</span> <span class="token keyword">async</span> <span class="token keyword">fn</span> <span class="token function-definition function">handle_query</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">ProtocolError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token keyword">let</span> query <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">read_query</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>  <span class="token comment">// io::Error → ProtocolError</span>    <span class="token keyword">let</span> result <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>executor<span class="token punctuation">.</span><span class="token function">execute</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>query<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>  <span class="token comment">// QueryError → ProtocolError</span>    <span class="token class-name">Ok</span><span class="token 
punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="8-AI-如何加速这项工作">8 AI 如何加速这项工作</h2><h3 id="AI-做对了什么">AI 做对了什么</h3><table><thead><tr><th>任务</th><th>AI 贡献</th></tr></thead><tbody><tr><td><strong>消息格式</strong></td><td>正确的大端序编码</td></tr><tr><td><strong>扩展查询流程</strong></td><td>Parse → Bind → Execute 顺序</td></tr><tr><td><strong>类型 OID</strong></td><td>准确的 PostgreSQL 类型 OID</td></tr><tr><td><strong>NULL 处理</strong></td><td>NULL 的 -1 长度前缀</td></tr></tbody></table><hr /><h3 id="AI-做错了什么">AI 做错了什么</h3><table><thead><tr><th>问题</th><th>发生什么事</th></tr></thead><tbody><tr><td><strong>长度计算</strong></td><td>初稿没有在长度中包含长度字节</td></tr><tr><td><strong>启动消息</strong></td><td>尝试添加类型字节（启动没有！）</td></tr><tr><td><strong>二进制格式</strong></td><td>建议小端序（PostgreSQL 使用大端序）</td></tr><tr><td><strong>Portal 生命周期</strong></td><td>忽略了 portal 在 Execute 后被销毁</td></tr></tbody></table><p><strong>模式：</strong> 通信协议很精确。差一错误会破坏一切。</p><hr /><h3 id="范例：调试-psql-连接">范例：调试 psql 连接</h3><p><strong>我问 AI 的问题：</strong></p><blockquote><p>“psql 连接但立即断开。什么错了？”</p></blockquote><p><strong>我学到的：</strong></p><ol><li>psql 期望特定的 ParameterStatus 消息</li><li>缺少 <code>server_version</code> 会导致无声断开</li><li>ReadyForQuery 必须在验证后发送</li></ol><p><strong>结果：</strong> 添加了必需的参数：</p><pre class="language-rust" data-language="rust"><code class="language-rust">stream<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>builder<span class="token punctuation">.</span><span class="token function">parameter_status</span><span class="token punctuation">(</span><span class="token string">"server_version"</span><span class="token punctuation">,</span> <span class="token string">"16.0"</span><span class="token punctuation">)</span><span 
class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>stream<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>builder<span class="token punctuation">.</span><span class="token function">parameter_status</span><span class="token punctuation">(</span><span class="token string">"server_encoding"</span><span class="token punctuation">,</span> <span class="token string">"UTF8"</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>stream<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>builder<span class="token punctuation">.</span><span class="token function">parameter_status</span><span class="token punctuation">(</span><span class="token string">"client_encoding"</span><span class="token punctuation">,</span> <span class="token string">"UTF8"</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>stream<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>builder<span class="token punctuation">.</span><span class="token function">ready_for_query</span><span class="token punctuation">(</span><span class="token 
class-name">TransactionStatus</span><span class="token punctuation">::</span><span class="token class-name">Idle</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span></code></pre><p><strong>现在 psql 连接成功！</strong></p><pre class="language-none"><code class="language-none">$ psql -h localhost -p 5432 -U neo vaultgrespsql (16.0, server 16.0 (Vaultgres))Type &quot;help&quot; for help.vaultgres&#x3D;&gt; SELECT 1; ?column? ----------        1(1 row)</code></pre><hr /><h2 id="总结：通信协议一张图">总结：通信协议一张图</h2><pre class="language-MERMAID_BASE64_620" data-language="MERMAID_BASE64_620"><code class="language-MERMAID_BASE64_620">Zmxvd2NoYXJ0IFRECiAgICBzdWJncmFwaCAiQ29ubmVjdGlvbiBTdGFydHVwIgogICAgICAgIEFbQ2xpZW50IGNvbm5lY3RzXSAtLT4gQltTdGFydHVwTWVzc2FnZV0KICAgICAgICBCIC0tPiBDW0F1dGhlbnRpY2F0aW9uT2tdCiAgICAgICAgQyAtLT4gRFtQYXJhbWV0ZXJTdGF0dXNdCiAgICAgICAgRCAtLT4gRVtSZWFkeUZvclF1ZXJ5XQogICAgZW5kCgogICAgc3ViZ3JhcGggIlNpbXBsZSBRdWVyeSIKICAgICAgICBGW1F1ZXJ5ICdRJ10gLS0+IEdbUm93RGVzY3JpcHRpb24gJ1QnXQogICAgICAgIEcgLS0+IEhbRGF0YVJvdyAnRCcgw5cgTl0KICAgICAgICBIIC0tPiBJW0NvbW1hbmRDb21wbGV0ZSAnQyddCiAgICAgICAgSSAtLT4gRQogICAgZW5kCgogICAgc3ViZ3JhcGggIkV4dGVuZGVkIFF1ZXJ5IgogICAgICAgIEpbUGFyc2UgJ1AnXSAtLT4gS1tQYXJzZUNvbXBsZXRlICcxJ10KICAgICAgICBLIC0tPiBMW0JpbmQgJ0InXQogICAgICAgIEwgLS0+IE1bQmluZENvbXBsZXRlICcyJ10KICAgICAgICBNIC0tPiBOW0V4ZWN1dGUgJ0UnXQogICAgICAgIE4gLS0+IEgKICAgICAgICBPW1N5bmMgJ1MnXSAtLT4gSQogICAgZW5kCgogICAgc3ViZ3JhcGggIk1lc3NhZ2UgRm9ybWF0IgogICAgICAgIFBbVHlwZSAxQl0gLS0+IFFbTGVuZ3RoIDRCIEJFXQogICAgICAgIFEgLS0+IFJbUGF5bG9hZF0KICAgIGVuZAoKICAgIHN1YmdyYXBoICJSZXN1bHQgU2VyaWFsaXphdGlvbiIKICAgICAgICBTW1JvdzogaWQ9MSwgbmFtZT0nQWxpY2UnXSAtLT4gVFtEYXRhUm93OiAnRCcgKyBsZW4gKyB2YWx1ZXNdCiAgICBlbmQKCiAgICBzdHlsZSBCIGZpbGw6I2UzZjJmZCxzdHJva2U6IzE5NzZkMgogICAgc3R5bGUgRiBmaWxsOiNlOGY1ZTksc3Ryb2tlOiMzODhlM2MKICAgIHN
0eWxlIEogZmlsbDojZThmNWU5LHN0cm9rZTojMzg4ZTNjCiAgICBzdHlsZSBQIGZpbGw6I2ZmZjNlMCxzdHJva2U6I2Y1N2MwMA&#x3D;&#x3D;</code></pre><p><strong>关键要点：</strong></p><table><thead><tr><th>概念</th><th>为什么重要</th></tr></thead><tbody><tr><td><strong>通信协议</strong></td><td>与现有 PostgreSQL 工具兼容</td></tr><tr><td><strong>消息框架</strong></td><td>长度前缀二进制协议</td></tr><tr><td><strong>简单 vs. 扩展</strong></td><td>快速查询 vs. 预备语句</td></tr><tr><td><strong>RowDescription</strong></td><td>客户端的字段元数据</td></tr><tr><td><strong>DataRow</strong></td><td>实际行数据（文本或二进制）</td></tr><tr><td><strong>类型 OID</strong></td><td>PostgreSQL 类型识别</td></tr><tr><td><strong>NULL 编码</strong></td><td>-1 长度前缀</td></tr></tbody></table><hr /><p><strong>进一步阅读：</strong></p><ul><li>PostgreSQL Source: <a href="https://github.com/postgres/postgres/blob/master/src/backend/tcop/postgres.c"><code>src/backend/tcop/postgres.c</code></a></li><li>PostgreSQL Source: <a href="https://github.com/postgres/postgres/blob/master/src/include/libpq/pqformat.h"><code>src/include/libpq/pqformat.h</code></a></li><li>“PostgreSQL Wire Protocol” documentation: <a href="https://www.postgresql.org/docs/current/protocol.html">https://www.postgresql.org/docs/current/protocol.html</a></li><li>libpq source: <a href="https://github.com/postgres/postgres/tree/master/src/interfaces/libpq"><code>src/interfaces/libpq/</code></a></li><li>Vaultgres Repository: <a href="https://github.com/neoalienson/Vaultgres">github.com/neoalienson/Vaultgres</a></li></ul>    ]]></content>
    
    
    <summary type="html">Vaultgres 旅程第五部分：实现 PostgreSQL 通信协议。深入探讨消息框架、启动握手、扩展查询协议，以及序列化 psql 和驱动程序能理解的结果集。</summary>
    
    
    
    <category term="Development" scheme="https://neo01.com/categories/Development/"/>
    
    
    <category term="PostgreSQL" scheme="https://neo01.com/tags/PostgreSQL/"/>
    
    <category term="Database Internals" scheme="https://neo01.com/tags/Database-Internals/"/>
    
    <category term="Rust" scheme="https://neo01.com/tags/Rust/"/>
    
  </entry>
  
  <entry>
    <title>使用 Rust 建構 PostgreSQL 相容資料庫：通訊協定與結果集序列化</title>
    <link href="https://neo01.com/zh-TW/2026/03/Database-Rust-Wire-Protocol-Result-Set/"/>
    <id>https://neo01.com/zh-TW/2026/03/Database-Rust-Wire-Protocol-Result-Set/</id>
    <published>2026-03-04T16:00:00.000Z</published>
    <updated>2026-03-14T06:22:04.727Z</updated>
    
    <content type="html"><![CDATA[<p>在 <a href="/zh-TW/2026/03/Database-Rust-WAL-Crash-Recovery-ARIES/">第四部分</a> 中，我們建構了 WAL 和崩潰恢復。我們的資料庫現在可以在停電中存活。但有個問題。</p><p><strong>客戶端實際上如何與我們的資料庫對話？</strong></p><pre class="language-none"><code class="language-none">┌─────────────┐                          ┌─────────────┐│   psql      │                          │  Vaultgres  ││   client    │                          │   server    ││             │     ??? How to talk ???  │             │└─────────────┘                          └─────────────┘</code></pre><p>我們可以發明自己的協定。但那樣我們就必須從頭建構客戶端。</p><p><strong>更好的方法：</strong> 說 PostgreSQL 的通訊協定。然後 <code>psql</code>、JDBC、libpq——所有現有工具——都能直接用。</p><p>今天：在 Rust 中實作 PostgreSQL 通訊協定，從啟動握手到結果集序列化。</p><hr /><h2 id="1-通訊協定概述">1 通訊協定概述</h2><h3 id="Frontend-Backend-模型">Frontend/Backend 模型</h3><p>PostgreSQL 使用 <strong>frontend/backend</strong> 架構：</p><pre class="language-none"><code class="language-none">┌─────────────────────────────────────────────────────────────┐│                    PostgreSQL Protocol                       │├─────────────────────────────────────────────────────────────┤│                                                              ││  Frontend (Client)          Backend (Server)                ││  - psql                     - Vaultgres                     ││  - libpq (C driver)         - Query processor               ││  - JDBC&#x2F;ODBC              - Storage engine                 ││  - psycopg (Python)         - Transaction manager           ││                                                              ││  Communication: TCP&#x2F;IP (usually port 5432)                  ││  Message format: Length-prefixed binary protocol            ││                                                              │└─────────────────────────────────────────────────────────────┘</code></pre><hr /><h3 id="訊息結構">訊息結構</h3><p>每個訊息都有相同的格式：</p><pre class="language-none"><code 
class="language-none">┌─────────────────────────────────────────────────────────────┐│ Message Format                                              │├─────────────────────────────────────────────────────────────┤│ ┌─────────────┬─────────────────────────────────────────┐   ││ │ Type (1B)   │ Length (4B, includes itself)            │   ││ ├─────────────┴─────────────────────────────────────────┤   ││ │ Payload (variable)                                     │   ││ └─────────────────────────────────────────────────────────┘   │└─────────────────────────────────────────────────────────────┘Example: SimpleQuery (&#39;Q&#39;)┌─────────────────────────────────────────────────────────────┐│ &#39;Q&#39; │ 0x00 0x00 0x00 0x18 │ &quot;SELECT * FROM users\0&quot;        ││  1B │      4B (24 bytes)   │ variable (null-terminated)     │└─────────────────────────────────────────────────────────────┘</code></pre><p><strong>關鍵洞察：</strong> 長度是<strong>大端序</strong>（網路位元組順序）且<strong>包含自身</strong>（不包含類型位元組）。</p><hr /><h3 id="訊息類型">訊息類型</h3><table><thead><tr><th>類型</th><th>代碼</th><th>方向</th><th>目的</th></tr></thead><tbody><tr><td><strong>StartupMessage</strong></td><td>(none)</td><td>F→B</td><td>初始連接（無類型位元組）</td></tr><tr><td><strong>AuthenticationOk</strong></td><td>‘R’</td><td>B→F</td><td>登入成功</td></tr><tr><td><strong>Query</strong></td><td>‘Q’</td><td>F→B</td><td>簡單查詢（SQL 
字串）</td></tr><tr><td><strong>RowDescription</strong></td><td>‘T’</td><td>B→F</td><td>欄位元資料</td></tr><tr><td><strong>DataRow</strong></td><td>‘D’</td><td>B→F</td><td>實際列資料</td></tr><tr><td><strong>CommandComplete</strong></td><td>‘C’</td><td>B→F</td><td>查詢完成</td></tr><tr><td><strong>ReadyForQuery</strong></td><td>‘Z’</td><td>B→F</td><td>伺服器準備好下一個查詢</td></tr><tr><td><strong>ErrorResponse</strong></td><td>‘E’</td><td>B→F</td><td>出錯了</td></tr><tr><td><strong>Parse</strong></td><td>‘P’</td><td>F→B</td><td>擴充查詢：準備</td></tr><tr><td><strong>Bind</strong></td><td>‘B’</td><td>F→B</td><td>擴充查詢：綁定參數</td></tr><tr><td><strong>Execute</strong></td><td>‘E’</td><td>F→B</td><td>擴充查詢：執行</td></tr><tr><td><strong>Sync</strong></td><td>‘S’</td><td>F→B</td><td>擴充查詢：完成批次</td></tr></tbody></table><p>F→B = Frontend to Backend, B→F = Backend to Frontend</p><hr /><h2 id="2-連接啟動">2 連接啟動</h2><h3 id="握手流程">握手流程</h3><pre class="language-MERMAID_BASE64_621" data-language="MERMAID_BASE64_621"><code class="language-MERMAID_BASE64_621">c2VxdWVuY2VEaWFncmFtCiAgICBwYXJ0aWNpcGFudCBDbGllbnQKICAgIHBhcnRpY2lwYW50IFNlcnZlcgoKICAgIENsaWVudC0+PlNlcnZlcjogU3RhcnR1cE1lc3NhZ2UgKHVzZXIsIGRhdGFiYXNlLCBvcHRpb25zKQogICAgU2VydmVyLT4+Q2xpZW50OiBBdXRoZW50aWNhdGlvbk9rCiAgICBTZXJ2ZXItPj5DbGllbnQ6IFBhcmFtZXRlclN0YXR1cyAoc2VydmVyX3ZlcnNpb24sIGVuY29kaW5nLCAuLi4pCiAgICBTZXJ2ZXItPj5DbGllbnQ6IFJlYWR5Rm9yUXVlcnkgKGlkbGUpCiAgICAKICAgIENsaWVudC0+PlNlcnZlcjogUXVlcnkgLyBFeHRlbmRlZCBRdWVyeQogICAgU2VydmVyLT4+Q2xpZW50OiBSb3dEZXNjcmlwdGlvbiAoZm9yIFNFTEVDVCkKICAgIFNlcnZlci0+PkNsaWVudDogRGF0YVJvdyDDlyBOCiAgICBTZXJ2ZXItPj5DbGllbnQ6IENvbW1hbmRDb21wbGV0ZQogICAgU2VydmVyLT4+Q2xpZW50OiBSZWFkeUZvclF1ZXJ5IChpZGxlKQ&#x3D;&#x3D;</code></pre><hr /><h3 id="StartupMessage">StartupMessage</h3><p>第一個訊息很特殊——<strong>沒有類型位元組</strong>，只有長度：</p><pre class="language-none"><code class="language-none">┌─────────────────────────────────────────────────────────────┐│ StartupMessage                                              
│├─────────────────────────────────────────────────────────────┤│ Length (4B): 8 + parameters                                 ││ Protocol Version (4B): 196608 (3.0)                         ││ Parameters (null-terminated key&#x3D;value pairs):               ││   &quot;user\0neo\0database\0vaultgres\0\0&quot;                      │└─────────────────────────────────────────────────────────────┘</code></pre><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wire_protocol/startup.rs</span><span class="token keyword">use</span> <span class="token namespace">std<span class="token punctuation">::</span>collections<span class="token punctuation">::</span></span><span class="token class-name">HashMap</span><span class="token punctuation">;</span><span class="token keyword">use</span> <span class="token namespace">tokio<span class="token punctuation">::</span>io<span class="token punctuation">::</span></span><span class="token punctuation">&#123;</span><span class="token class-name">AsyncReadExt</span><span class="token punctuation">,</span> <span class="token class-name">AsyncWriteExt</span><span class="token punctuation">&#125;</span><span class="token punctuation">;</span><span class="token keyword">use</span> <span class="token namespace">tokio<span class="token punctuation">::</span>net<span class="token punctuation">::</span></span><span class="token class-name">TcpStream</span><span class="token punctuation">;</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">StartupMessage</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> user<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> database<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> options<span class="token punctuation">:</span> <span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token 
class-name">String</span><span class="token punctuation">,</span> <span class="token class-name">String</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">StartupMessage</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">async</span> <span class="token keyword">fn</span> <span class="token function-definition function">read_from</span><span class="token punctuation">(</span>stream<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">TcpStream</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token keyword">Self</span><span class="token punctuation">,</span> <span class="token class-name">ProtocolError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Read length (4 bytes, big-endian)</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> len_buf <span class="token operator">=</span> <span class="token punctuation">[</span><span class="token number">0u8</span><span class="token punctuation">;</span> <span class="token number">4</span><span class="token punctuation">]</span><span class="token punctuation">;</span>        stream<span class="token punctuation">.</span><span class="token function">read_exact</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> len_buf<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token 
punctuation">;</span>        <span class="token keyword">let</span> len <span class="token operator">=</span> <span class="token keyword">u32</span><span class="token punctuation">::</span><span class="token function">from_be_bytes</span><span class="token punctuation">(</span>len_buf<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token comment">// Read protocol version</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> version_buf <span class="token operator">=</span> <span class="token punctuation">[</span><span class="token number">0u8</span><span class="token punctuation">;</span> <span class="token number">4</span><span class="token punctuation">]</span><span class="token punctuation">;</span>        stream<span class="token punctuation">.</span><span class="token function">read_exact</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> version_buf<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> version <span class="token operator">=</span> <span class="token keyword">u32</span><span class="token punctuation">::</span><span class="token function">from_be_bytes</span><span class="token punctuation">(</span>version_buf<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">if</span> version <span class="token operator">!=</span> <span class="token number">196608</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">return</span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">ProtocolError</span><span class="token punctuation">::</span><span 
class="token class-name">UnsupportedVersion</span><span class="token punctuation">(</span>version<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token comment">// Read parameters (null-terminated key=value pairs)</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> params <span class="token operator">=</span> <span class="token class-name">HashMap</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> remaining <span class="token operator">=</span> len <span class="token operator">-</span> <span class="token number">8</span><span class="token punctuation">;</span>  <span class="token comment">// Subtract length and version bytes</span>        <span class="token keyword">while</span> remaining <span class="token operator">></span> <span class="token number">1</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> <span class="token keyword">mut</span> key <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token keyword">let</span> <span class="token keyword">mut</span> byte <span class="token operator">=</span> <span class="token punctuation">[</span><span class="token number">0u8</span><span class="token punctuation">;</span> <span class="token number">1</span><span class="token punctuation">]</span><span class="token punctuation">;</span>                  
      <span class="token keyword">loop</span> <span class="token punctuation">&#123;</span>                stream<span class="token punctuation">.</span><span class="token function">read_exact</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> byte<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                remaining <span class="token operator">-=</span> <span class="token number">1</span><span class="token punctuation">;</span>                <span class="token keyword">if</span> byte<span class="token punctuation">[</span><span class="token number">0</span><span class="token punctuation">]</span> <span class="token operator">==</span> <span class="token number">0</span> <span class="token punctuation">&#123;</span> <span class="token keyword">break</span><span class="token punctuation">;</span> <span class="token punctuation">&#125;</span>                key<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span>byte<span class="token punctuation">[</span><span class="token number">0</span><span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>            <span class="token keyword">if</span> key<span class="token punctuation">.</span><span class="token function">is_empty</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span> <span class="token keyword">break</span><span class="token punctuation">;</span> <span class="token punctuation">&#125;</span>  <span class="token comment">// Empty key = end of parameters</span>            <span class="token keyword">let</span> <span 
class="token keyword">mut</span> value <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token keyword">loop</span> <span class="token punctuation">&#123;</span>                stream<span class="token punctuation">.</span><span class="token function">read_exact</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> byte<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                remaining <span class="token operator">-=</span> <span class="token number">1</span><span class="token punctuation">;</span>                <span class="token keyword">if</span> byte<span class="token punctuation">[</span><span class="token number">0</span><span class="token punctuation">]</span> <span class="token operator">==</span> <span class="token number">0</span> <span class="token punctuation">&#123;</span> <span class="token keyword">break</span><span class="token punctuation">;</span> <span class="token punctuation">&#125;</span>                value<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span>byte<span class="token punctuation">[</span><span class="token number">0</span><span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>            <span class="token keyword">let</span> key <span class="token operator">=</span> <span class="token class-name">String</span><span class="token punctuation">::</span><span class="token 
function">from_utf8</span><span class="token punctuation">(</span>key<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>            <span class="token keyword">let</span> value <span class="token operator">=</span> <span class="token class-name">String</span><span class="token punctuation">::</span><span class="token function">from_utf8</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>            params<span class="token punctuation">.</span><span class="token function">insert</span><span class="token punctuation">(</span>key<span class="token punctuation">,</span> value<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>            user<span class="token punctuation">:</span> params<span class="token punctuation">.</span><span class="token function">remove</span><span class="token punctuation">(</span><span class="token string">"user"</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">unwrap_or_default</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            database<span class="token punctuation">:</span> params<span class="token punctuation">.</span><span class="token function">remove</span><span class="token punctuation">(</span><span class="token string">"database"</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">unwrap_or_default</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span 
class="token punctuation">,</span>            options<span class="token punctuation">:</span> params<span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="Authentication-和-ParameterStatus">Authentication and ParameterStatus</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wire_protocol/messages.rs</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">MessageBuilder</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span><span class="token punctuation">(</span><span class="token keyword">crate</span><span class="token punctuation">)</span> buffer<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">></span><span class="token punctuation">,</span>  <span class="token comment">// crate-visible so sibling modules (e.g. row_description.rs) can write into it</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">MessageBuilder</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span> buffer<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token 
keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">authentication_ok</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token keyword">u8</span><span class="token punctuation">]</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// 'R' (1B) + Length (4B) + Auth Type (4B = 0 for Ok)</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">clear</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token char">b'R'</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token number">8u32</span><span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>  <span class="token comment">// Length: 4 (itself) + 4 (auth type); the type byte is not counted</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span 
class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token number">0u32</span><span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>   <span class="token comment">// AuthOk</span>        <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">.</span>buffer    <span class="token punctuation">&#125;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">parameter_status</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">,</span> name<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">,</span> value<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token keyword">u8</span><span class="token punctuation">]</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// 'S' (1B) + Length (4B) + name\0 + value\0</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">clear</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>      
  <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token char">b'S'</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">let</span> payload_len <span class="token operator">=</span> <span class="token number">4</span> <span class="token operator">+</span> name<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">+</span> <span class="token number">1</span> <span class="token operator">+</span> value<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token punctuation">(</span>payload_len <span class="token keyword">as</span> <span class="token keyword">u32</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span>name<span class="token 
punctuation">.</span><span class="token function">as_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span>value<span class="token punctuation">.</span><span class="token function">as_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">.</span>buffer    <span class="token punctuation">&#125;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">ready_for_query</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">,</span> status<span class="token punctuation">:</span> <span class="token 
class-name">TransactionStatus</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token keyword">u8</span><span class="token punctuation">]</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// 'Z' (1B) + Length (4B) + Status (1B)</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">clear</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token char">b'Z'</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token number">5u32</span><span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span>status <span class="token keyword">as</span> <span class="token keyword">u8</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token 
operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">.</span>buffer    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, Copy)]</span><span class="token attribute attr-name">#[repr(u8)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">TransactionStatus</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">Idle</span> <span class="token operator">=</span> <span class="token char">b'I'</span><span class="token punctuation">,</span>    <span class="token class-name">InTransaction</span> <span class="token operator">=</span> <span class="token char">b'T'</span><span class="token punctuation">,</span>    <span class="token class-name">InFailedTransaction</span> <span class="token operator">=</span> <span class="token char">b'E'</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><p><strong>The server sends these parameters:</strong></p><table><thead><tr><th>Parameter</th><th>Value</th><th>Purpose</th></tr></thead><tbody><tr><td><code>server_version</code></td><td><code>16.0</code></td><td>The PostgreSQL version we emulate</td></tr><tr><td><code>server_encoding</code></td><td><code>UTF8</code></td><td>Server character encoding</td></tr><tr><td><code>client_encoding</code></td><td><code>UTF8</code></td><td>Client character encoding</td></tr><tr><td><code>integer_datetimes</code></td><td><code>on</code></td><td>64-bit integer timestamps</td></tr></tbody></table><hr /><h2 id="3-簡單查詢協定">3 Simple Query Protocol</h2><h3 id="查詢流程">Query Flow</h3><pre class="language-none"><code class="language-none">Client: Query(&quot;SELECT id, name FROM users&quot;)Server: RowDescription (column metadata)Server: DataRow (row 1)Server: DataRow (row 2)...Server: CommandComplete (&quot;SELECT 2&quot;)Server: ReadyForQuery (&#39;I&#39;)</code></pre><hr /><h3 
id="RowDescription：告訴客戶端關於欄位">RowDescription: Telling the Client About the Columns</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wire_protocol/row_description.rs</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">FieldDescription</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> name<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> table_oid<span class="token punctuation">:</span> <span class="token keyword">u32</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> column_attr_num<span class="token punctuation">:</span> <span class="token keyword">i16</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> type_oid<span class="token punctuation">:</span> <span class="token keyword">u32</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> type_size<span class="token punctuation">:</span> <span class="token keyword">i16</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> type_modifier<span class="token punctuation">:</span> <span class="token keyword">i32</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> format_code<span class="token punctuation">:</span> <span class="token keyword">i16</span><span class="token punctuation">,</span>  <span class="token comment">// 0 = text, 1 = binary</span><span class="token punctuation">&#125;</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">RowDescription</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> fields<span class="token 
punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">FieldDescription</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">RowDescription</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">serialize</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> builder<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">MessageBuilder</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token keyword">u8</span><span class="token punctuation">]</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// 'T' (1B) + Length (4B) + Num Fields (2B) + Fields...</span>        builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">clear</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token char">b'T'</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// Calculate payload length</span>        <span class="token keyword">let</span> payload_len <span 
class="token operator">=</span> <span class="token number">4</span> <span class="token operator">+</span> <span class="token number">2</span> <span class="token operator">+</span> <span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>fields<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">*</span> <span class="token number">18</span><span class="token punctuation">)</span> <span class="token operator">+</span>             <span class="token keyword">self</span><span class="token punctuation">.</span>fields<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">map</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>f<span class="token closure-punctuation punctuation">|</span></span> f<span class="token punctuation">.</span>name<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">sum</span><span class="token punctuation">::</span><span class="token operator">&lt;</span><span class="token keyword">usize</span><span class="token operator">></span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>  <span class="token comment">// 4 = Length field (counts itself), 2 = field count, 18 = fixed bytes per field</span>                builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token
operator">&amp;</span><span class="token punctuation">(</span>payload_len <span class="token keyword">as</span> <span class="token keyword">u32</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>fields<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">i16</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">for</span> field <span class="token keyword">in</span> <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">.</span>fields <span class="token punctuation">&#123;</span>            builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span>field<span class="token punctuation">.</span>name<span class="token punctuation">.</span><span class="token function">as_bytes</span><span class="token punctuation">(</span><span class="token 
punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span><span class="token punctuation">;</span>  <span class="token comment">// Null terminator</span>            builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>field<span class="token punctuation">.</span>table_oid<span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>field<span class="token punctuation">.</span>column_attr_num<span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>field<span class="token punctuation">.</span>type_oid<span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token 
punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>field<span class="token punctuation">.</span>type_size<span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>field<span class="token punctuation">.</span>type_modifier<span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>field<span class="token punctuation">.</span>format_code<span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token operator">&amp;</span>builder<span class="token punctuation">.</span>buffer    <span class="token punctuation">&#125;</span><span class="token 
punctuation">&#125;</span></code></pre><p><strong>Example output:</strong></p><pre class="language-none"><code class="language-none">SELECT id, name FROM users

RowDescription:
┌─────────────────────────────────────────────────────────────┐
│ &#39;T&#39; │ Length │ 2 fields                                     │
├─────────────────────────────────────────────────────────────┤
│ Field 1: &quot;id&quot;                                               │
│   table_oid: 16384                                          │
│   column_attr_num: 1                                        │
│   type_oid: 23 (INT4)                                       │
│   type_size: 4                                              │
│   type_modifier: -1                                         │
│   format_code: 0 (text)                                     │
├─────────────────────────────────────────────────────────────┤
│ Field 2: &quot;name&quot;                                             │
│   table_oid: 16384                                          │
│   column_attr_num: 2                                        │
│   type_oid: 25 (TEXT)                                       │
│   type_size: -1 (variable)                                  │
│   type_modifier: -1                                         │
│   format_code: 0 (text)                                     │
└─────────────────────────────────────────────────────────────┘</code></pre><hr /><h3 id="DataRow：序列化實際列">DataRow: Serializing the Actual Rows</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wire_protocol/data_row.rs</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">DataRow</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> values<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token
class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">>></span><span class="token operator">></span><span class="token punctuation">,</span>  <span class="token comment">// None = NULL</span>    <span class="token keyword">pub</span> format_codes<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">i16</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">DataRow</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">serialize</span><span class="token operator">&lt;</span><span class="token lifetime-annotation symbol">'a</span><span class="token operator">></span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> builder<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token lifetime-annotation symbol">'a</span> <span class="token keyword">mut</span> <span class="token class-name">MessageBuilder</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token operator">&amp;</span><span class="token lifetime-annotation symbol">'a</span> <span class="token punctuation">[</span><span class="token keyword">u8</span><span class="token punctuation">]</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// 'D' (1B) + Length (4B) + Num Values (2B) + Values...</span>        builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">clear</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        builder<span class="token
punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token char">b'D'</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// Calculate message length (the Length field counts itself)</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> payload_len <span class="token operator">=</span> <span class="token number">6u32</span><span class="token punctuation">;</span>  <span class="token comment">// Length field (4) + num values (2)</span>        <span class="token keyword">for</span> value <span class="token keyword">in</span> <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">.</span>values <span class="token punctuation">&#123;</span>            payload_len <span class="token operator">+=</span> <span class="token number">4</span><span class="token punctuation">;</span>  <span class="token comment">// Length prefix</span>            <span class="token keyword">if</span> <span class="token keyword">let</span> <span class="token class-name">Some</span><span class="token punctuation">(</span>data<span class="token punctuation">)</span> <span class="token operator">=</span> value <span class="token punctuation">&#123;</span>                payload_len <span class="token operator">+=</span> data<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u32</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>                builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span
class="token punctuation">(</span><span class="token operator">&amp;</span>payload_len<span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">i16</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">for</span> value <span class="token keyword">in</span> <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">.</span>values <span class="token punctuation">&#123;</span>            <span class="token keyword">match</span> value <span class="token punctuation">&#123;</span>                <span class="token class-name">None</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                    <span class="token comment">// NULL: length = -1</span>                    builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token 
punctuation">(</span><span class="token operator">&amp;</span><span class="token punctuation">(</span><span class="token operator">-</span><span class="token number">1i32</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token punctuation">&#125;</span>                <span class="token class-name">Some</span><span class="token punctuation">(</span>data<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                    <span class="token comment">// Non-NULL: length + data</span>                    builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token punctuation">(</span>data<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">i32</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                    builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span>data<span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token punctuation">&#125;</span>  
          <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>                <span class="token operator">&amp;</span>builder<span class="token punctuation">.</span>buffer    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Example:</strong></p><pre class="language-none"><code class="language-none">Row: id&#x3D;1, name&#x3D;&quot;Alice&quot;, email&#x3D;NULL

DataRow:
┌─────────────────────────────────────────────────────────────┐
│ &#39;D&#39; │ Length │ 3 values                                     │
├─────────────────────────────────────────────────────────────┤
│ Value 1: 1 byte  │ &quot;1&quot;                                      │
│ Value 2: 5 bytes │ &quot;Alice&quot;                                  │
│ Value 3: -1 (NULL)                                          │
└─────────────────────────────────────────────────────────────┘</code></pre><hr /><h3 id="文字-vs-二進位格式">Text vs. Binary Format</h3><p><strong>Text format (format_code = 0):</strong> human-readable strings</p><pre class="language-none"><code class="language-none">INT4: &quot;42&quot;
TEXT: &quot;Alice&quot;
TIMESTAMP: &quot;2026-03-29 14:30:00.123456+00&quot;</code></pre><p><strong>Binary format (format_code = 1):</strong> the type's native binary representation</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wire_protocol/type_encoding.rs</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">encode_int4</span><span class="token punctuation">(</span>value<span class="token punctuation">:</span> <span class="token keyword">i32</span><span class="token punctuation">,</span> format<span class="token punctuation">:</span> <span class="token keyword">i16</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token
keyword">u8</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token keyword">match</span> format <span class="token punctuation">&#123;</span>        <span class="token number">0</span> <span class="token operator">=></span> value<span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">into_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>  <span class="token comment">// Text</span>        <span class="token number">1</span> <span class="token operator">=></span> value<span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_vec</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token comment">// Binary</span>        _ <span class="token operator">=></span> <span class="token macro property">panic!</span><span class="token punctuation">(</span><span class="token string">"Invalid format code"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">encode_text</span><span class="token punctuation">(</span>value<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">,</span> format<span class="token punctuation">:</span> <span class="token 
keyword">i16</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token keyword">match</span> format <span class="token punctuation">&#123;</span>        <span class="token number">0</span> <span class="token operator">=></span> value<span class="token punctuation">.</span><span class="token function">as_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_vec</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>       <span class="token comment">// Text (UTF-8)</span>        <span class="token number">1</span> <span class="token operator">=></span> value<span class="token punctuation">.</span><span class="token function">as_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_vec</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>       <span class="token comment">// Binary: the same raw bytes; DataRow already writes the 4-byte length prefix</span>        _ <span class="token operator">=></span> <span class="token macro property">panic!</span><span class="token punctuation">(</span><span class="token string">"Invalid format code"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">encode_timestamp</span><span class="token punctuation">(</span>value<span class="token punctuation">:</span> <span class="token namespace">chrono<span class="token punctuation">::</span></span><span class="token class-name">DateTime</span><span class="token operator">&lt;</span><span class="token namespace">chrono<span class="token punctuation">::</span></span><span class="token class-name">Utc</span><span class="token operator">></span><span class="token punctuation">,</span> format<span class="token punctuation">:</span> <span class="token keyword">i16</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span
class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token keyword">match</span> format <span class="token punctuation">&#123;</span>        <span class="token number">0</span> <span class="token operator">=></span> value<span class="token punctuation">.</span><span class="token function">format</span><span class="token punctuation">(</span><span class="token string">"%Y-%m-%d %H:%M:%S%.6f%:::z"</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">into_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token number">1</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>            <span class="token comment">// PostgreSQL epoch: 2000-01-01 00:00:00 UTC</span>            <span class="token keyword">let</span> epoch <span class="token operator">=</span> <span class="token namespace">chrono<span class="token punctuation">::</span></span><span class="token class-name">DateTime</span><span class="token punctuation">::</span><span class="token function">from_timestamp</span><span class="token punctuation">(</span><span class="token number">946684800</span><span class="token punctuation">,</span> <span class="token number">0</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">unwrap</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token keyword">let</span> micros <span class="token operator">=</span> value<span class="token punctuation">.</span><span class="token function">signed_duration_since</span><span
class="token punctuation">(</span>epoch<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">num_microseconds</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">unwrap</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            micros<span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_vec</span><span class="token punctuation">(</span><span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span>        _ <span class="token operator">=></span> <span class="token macro property">panic!</span><span class="token punctuation">(</span><span class="token string">"Invalid format code"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="4-擴充查詢協定">4 Extended Query Protocol</h2><h3 id="為什麼需要擴充查詢？">Why Do We Need Extended Query?</h3><p><strong>Simple query:</strong> values must be spliced into the SQL string, which invites SQL injection; no prepared statements</p><pre class="language-none"><code class="language-none">Client: Query(&quot;SELECT * FROM users WHERE id &#x3D; &quot; + user_input)
→ SQL injection vulnerability!</code></pre><p><strong>Extended query:</strong> prepared statements with parameter binding</p><pre class="language-none"><code class="language-none">Client: Parse(&quot;SELECT * FROM users WHERE id &#x3D; $1&quot;)
Client: Bind([42])
Client: Execute()
→ Safe from SQL injection!</code></pre><hr /><h3 id="擴充查詢流程">Extended Query Flow</h3><pre class="language-MERMAID_BASE64_622" data-language="MERMAID_BASE64_622"><code
class="language-MERMAID_BASE64_622">c2VxdWVuY2VEaWFncmFtCiAgICBwYXJ0aWNpcGFudCBDbGllbnQKICAgIHBhcnRpY2lwYW50IFNlcnZlcgoKICAgIENsaWVudC0+PlNlcnZlcjogUGFyc2UgKFNRTCwgcGFyYW1ldGVyIHR5cGVzKQogICAgU2VydmVyLT4+Q2xpZW50OiBQYXJzZUNvbXBsZXRlCgogICAgQ2xpZW50LT4+U2VydmVyOiBCaW5kIChwYXJhbWV0ZXIgdmFsdWVzKQogICAgU2VydmVyLT4+Q2xpZW50OiBCaW5kQ29tcGxldGUKCiAgICBsb29wIE11bHRpcGxlIGV4ZWN1dGlvbnMKICAgICAgICBDbGllbnQtPj5TZXJ2ZXI6IEV4ZWN1dGUgKG1heF9yb3dzKQogICAgICAgIFNlcnZlci0+PkNsaWVudDogRGF0YVJvdyDDlyBOCiAgICBlbmQKCiAgICBDbGllbnQtPj5TZXJ2ZXI6IFN5bmMKICAgIFNlcnZlci0+PkNsaWVudDogQ29tbWFuZENvbXBsZXRlCiAgICBTZXJ2ZXItPj5DbGllbnQ6IFJlYWR5Rm9yUXVlcnk&#x3D;</code></pre><hr /><h3 id="Parse：準備語句">Parse: Preparing a Statement</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wire_protocol/parse.rs</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">ParseMessage</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> statement_name<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> query<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> parameter_types<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u32</span><span class="token operator">></span><span class="token punctuation">,</span>  <span class="token comment">// OID for each parameter</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">ParseMessage</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token
keyword">async</span> <span class="token keyword">fn</span> <span class="token function-definition function">read_from</span><span class="token punctuation">(</span>stream<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">TcpStream</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token keyword">Self</span><span class="token punctuation">,</span> <span class="token class-name">ProtocolError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// statement_name (null-terminated)</span>        <span class="token keyword">let</span> statement_name <span class="token operator">=</span> <span class="token function">read_null_terminated</span><span class="token punctuation">(</span>stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// query (null-terminated)</span>        <span class="token keyword">let</span> query <span class="token operator">=</span> <span class="token function">read_null_terminated</span><span class="token punctuation">(</span>stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// num_parameter_types (2B)</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> num_types_buf <span class="token operator">=</span> <span class="token punctuation">[</span><span class="token number">0u8</span><span class="token punctuation">;</span> 
<span class="token number">2</span><span class="token punctuation">]</span><span class="token punctuation">;</span>        stream<span class="token punctuation">.</span><span class="token function">read_exact</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> num_types_buf<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> num_types <span class="token operator">=</span> <span class="token keyword">i16</span><span class="token punctuation">::</span><span class="token function">from_be_bytes</span><span class="token punctuation">(</span>num_types_buf<span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// parameter_types (4B each)</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> parameter_types <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> _ <span class="token keyword">in</span> <span class="token number">0</span><span class="token punctuation">..</span>num_types <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> <span class="token keyword">mut</span> type_buf <span class="token operator">=</span> <span class="token punctuation">[</span><span class="token number">0u8</span><span class="token punctuation">;</span> <span class="token number">4</span><span class="token punctuation">]</span><span class="token punctuation">;</span>            stream<span class="token 
punctuation">.</span><span class="token function">read_exact</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> type_buf<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>            parameter_types<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token keyword">u32</span><span class="token punctuation">::</span><span class="token function">from_be_bytes</span><span class="token punctuation">(</span>type_buf<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>            statement_name<span class="token punctuation">,</span>            query<span class="token punctuation">,</span>            parameter_types<span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span><span class="token comment">// Server response</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">parse_complete</span><span class="token punctuation">(</span>builder<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">MessageBuilder</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token operator">&amp;</span><span 
class="token punctuation">[</span><span class="token keyword">u8</span><span class="token punctuation">]</span> <span class="token punctuation">&#123;</span>    <span class="token comment">// '1' (1B) + Length (4B = 4)</span>    builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">clear</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token char">b'1'</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token number">4u32</span><span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token operator">&amp;</span>builder<span class="token punctuation">.</span>buffer<span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="Bind：建立-Portal">Bind: Creating a Portal</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wire_protocol/bind.rs</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">BindMessage</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> portal_name<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token 
punctuation">,</span>    <span class="token keyword">pub</span> statement_name<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> parameter_format_codes<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">i16</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> parameter_values<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">>></span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> result_format_codes<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">i16</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">BindMessage</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">async</span> <span class="token keyword">fn</span> <span class="token function-definition function">read_from</span><span class="token punctuation">(</span>stream<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">TcpStream</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token 
class-name">Result</span><span class="token operator">&lt;</span><span class="token keyword">Self</span><span class="token punctuation">,</span> <span class="token class-name">ProtocolError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// portal_name (null-terminated)</span>        <span class="token keyword">let</span> portal_name <span class="token operator">=</span> <span class="token function">read_null_terminated</span><span class="token punctuation">(</span>stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// statement_name (null-terminated)</span>        <span class="token keyword">let</span> statement_name <span class="token operator">=</span> <span class="token function">read_null_terminated</span><span class="token punctuation">(</span>stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// num_parameter_format_codes (2B)</span>        <span class="token keyword">let</span> num_formats <span class="token operator">=</span> <span class="token function">read_i16</span><span class="token punctuation">(</span>stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// parameter_format_codes</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> parameter_format_codes <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token 
punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> _ <span class="token keyword">in</span> <span class="token number">0</span><span class="token punctuation">..</span>num_formats <span class="token punctuation">&#123;</span>            parameter_format_codes<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token function">read_i16</span><span class="token punctuation">(</span>stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token comment">// num_parameter_values (2B)</span>        <span class="token keyword">let</span> num_values <span class="token operator">=</span> <span class="token function">read_i16</span><span class="token punctuation">(</span>stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// parameter_values</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> parameter_values <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> _ <span class="token keyword">in</span> <span class="token number">0</span><span class="token 
punctuation">..</span>num_values <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> len <span class="token operator">=</span> <span class="token function">read_i32</span><span class="token punctuation">(</span>stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>            <span class="token keyword">if</span> len <span class="token operator">==</span> <span class="token operator">-</span><span class="token number">1</span> <span class="token punctuation">&#123;</span>                parameter_values<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token class-name">None</span><span class="token punctuation">)</span><span class="token punctuation">;</span>  <span class="token comment">// NULL</span>            <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">let</span> <span class="token keyword">mut</span> data <span class="token operator">=</span> <span class="token macro property">vec!</span><span class="token punctuation">[</span><span class="token number">0u8</span><span class="token punctuation">;</span> len <span class="token keyword">as</span> <span class="token keyword">usize</span><span class="token punctuation">]</span><span class="token punctuation">;</span>                stream<span class="token punctuation">.</span><span class="token function">read_exact</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> data<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span 
class="token punctuation">;</span>                parameter_values<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span>data<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>                <span class="token comment">// num_result_format_codes (2B)</span>        <span class="token keyword">let</span> num_result_formats <span class="token operator">=</span> <span class="token function">read_i16</span><span class="token punctuation">(</span>stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// result_format_codes</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> result_format_codes <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> _ <span class="token keyword">in</span> <span class="token number">0</span><span class="token punctuation">..</span>num_result_formats <span class="token punctuation">&#123;</span>            result_format_codes<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token function">read_i16</span><span class="token punctuation">(</span>stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token 
keyword">await</span><span class="token operator">?</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>            portal_name<span class="token punctuation">,</span>            statement_name<span class="token punctuation">,</span>            parameter_format_codes<span class="token punctuation">,</span>            parameter_values<span class="token punctuation">,</span>            result_format_codes<span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span><span class="token comment">// Server response</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">bind_complete</span><span class="token punctuation">(</span>builder<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">MessageBuilder</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token keyword">u8</span><span class="token punctuation">]</span> <span class="token punctuation">&#123;</span>    <span class="token comment">// '2' (1B) + Length (4B = 4)</span>    builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">clear</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    builder<span class="token punctuation">.</span>buffer<span 
class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token char">b'2'</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token number">4u32</span><span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token operator">&amp;</span>builder<span class="token punctuation">.</span>buffer<span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="Execute：執行預備語句">Execute: Running the Prepared Statement</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wire_protocol/execute.rs</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">ExecuteMessage</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> portal_name<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> max_rows<span class="token punctuation">:</span> <span class="token keyword">i32</span><span class="token punctuation">,</span>  <span class="token comment">// 0 = all rows</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">ExecuteMessage</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">async</span> <span class="token keyword">fn</span> <span 
class="token function-definition function">read_from</span><span class="token punctuation">(</span>stream<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">TcpStream</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token keyword">Self</span><span class="token punctuation">,</span> <span class="token class-name">ProtocolError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> portal_name <span class="token operator">=</span> <span class="token function">read_null_terminated</span><span class="token punctuation">(</span>stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> max_rows <span class="token operator">=</span> <span class="token function">read_i32</span><span class="token punctuation">(</span>stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token keyword">Self</span> <span class="token punctuation">&#123;</span> portal_name<span class="token punctuation">,</span> max_rows <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Server response:</strong> DataRow messages. The protocol has no dedicated “ExecuteComplete” message: a finished Execute is normally terminated by CommandComplete (or by PortalSuspended when the <code>max_rows</code> limit is reached). In this simplified implementation, CommandComplete is emitted by <code>sync_complete</code> in the next section.</p><hr /><h3 id="Sync：完成批次">Sync: Completing the Batch</h3><pre 
class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wire_protocol/sync.rs</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">SyncMessage</span><span class="token punctuation">;</span><span class="token keyword">impl</span> <span class="token class-name">SyncMessage</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">async</span> <span class="token keyword">fn</span> <span class="token function-definition function">read_from</span><span class="token punctuation">(</span>_stream<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">TcpStream</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token keyword">Self</span><span class="token punctuation">,</span> <span class="token class-name">ProtocolError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Sync has no body, just the message header</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">SyncMessage</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span><span class="token comment">// Server response</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">sync_complete</span><span class="token punctuation">(</span>builder<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token 
class-name">MessageBuilder</span><span class="token punctuation">,</span> status<span class="token punctuation">:</span> <span class="token class-name">TransactionStatus</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token keyword">u8</span><span class="token punctuation">]</span> <span class="token punctuation">&#123;</span>    <span class="token comment">// Builds CommandComplete only; ReadyForQuery(status) must follow</span>    builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">clear</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token comment">// CommandComplete: 'C' + Length (4B, counts itself plus the payload) + "SELECT 2\0"</span>    builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token char">b'C'</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token keyword">let</span> cmd <span class="token operator">=</span> <span class="token string">b"SELECT 2"</span><span class="token punctuation">;</span>    builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token number">4</span> <span class="token operator">+</span> cmd<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token 
keyword">u32</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span>cmd<span class="token punctuation">)</span><span class="token punctuation">;</span>    builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token operator">&amp;</span>builder<span class="token punctuation">.</span>buffer<span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="5-完整的查詢執行流程">5 Complete Query Execution Flow</h2><h3 id="整合在一起">Putting It All Together</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wire_protocol/handler.rs</span><span class="token keyword">use</span> <span class="token namespace">std<span class="token punctuation">::</span>collections<span class="token punctuation">::</span></span><span class="token class-name">HashMap</span><span class="token punctuation">;</span><span class="token keyword">use</span> <span class="token namespace">tokio<span class="token punctuation">::</span>io<span class="token punctuation">::</span></span><span class="token punctuation">&#123;</span><span class="token class-name">AsyncReadExt</span><span class="token punctuation">,</span> <span class="token class-name">AsyncWriteExt</span><span class="token punctuation">&#125;</span><span class="token punctuation">;</span><span class="token keyword">use</span> <span class="token namespace">tokio<span class="token punctuation">::</span>net<span class="token punctuation">::</span></span><span class="token class-name">TcpStream</span><span class="token punctuation">;</span><span class="token keyword">use</span> <span class="token keyword">crate</span><span class="token module-declaration namespace"><span class="token punctuation">::</span>query_executor<span class="token punctuation">::</span></span><span class="token class-name">QueryExecutor</span><span class="token punctuation">;</span><span class="token keyword">use</span> <span class="token keyword">crate</span><span class="token module-declaration namespace"><span class="token punctuation">::</span>storage<span 
class="token punctuation">::</span>buffer_pool<span class="token punctuation">::</span></span><span class="token class-name">BufferPool</span><span class="token punctuation">;</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">ProtocolHandler</span> <span class="token punctuation">&#123;</span>    stream<span class="token punctuation">:</span> <span class="token class-name">TcpStream</span><span class="token punctuation">,</span>    executor<span class="token punctuation">:</span> <span class="token class-name">QueryExecutor</span><span class="token punctuation">,</span>    builder<span class="token punctuation">:</span> <span class="token class-name">MessageBuilder</span><span class="token punctuation">,</span>    prepared_statements<span class="token punctuation">:</span> <span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token punctuation">,</span> <span class="token class-name">PreparedStatement</span><span class="token operator">></span><span class="token punctuation">,</span>    portals<span class="token punctuation">:</span> <span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token punctuation">,</span> <span class="token class-name">Portal</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">ProtocolHandler</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">async</span> <span class="token keyword">fn</span> <span class="token function-definition function">handle_connection</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">,</span> <span class="token keyword">mut</span> stream<span class="token 
punctuation">:</span> <span class="token class-name">TcpStream</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">ProtocolError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// 1. Read startup message</span>        <span class="token keyword">let</span> startup <span class="token operator">=</span> <span class="token class-name">StartupMessage</span><span class="token punctuation">::</span><span class="token function">read_from</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// 2. Send authentication</span>        stream<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>builder<span class="token punctuation">.</span><span class="token function">authentication_ok</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// 3. 
Send parameter status</span>        stream<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>builder<span class="token punctuation">.</span><span class="token function">parameter_status</span><span class="token punctuation">(</span><span class="token string">"server_version"</span><span class="token punctuation">,</span> <span class="token string">"16.0"</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>        stream<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>builder<span class="token punctuation">.</span><span class="token function">parameter_status</span><span class="token punctuation">(</span><span class="token string">"server_encoding"</span><span class="token punctuation">,</span> <span class="token string">"UTF8"</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>        stream<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>builder<span class="token punctuation">.</span><span class="token function">parameter_status</span><span class="token punctuation">(</span><span class="token string">"client_encoding"</span><span class="token punctuation">,</span> <span class="token string">"UTF8"</span><span class="token 
punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// 4. Send ready for query</span>        stream<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>builder<span class="token punctuation">.</span><span class="token function">ready_for_query</span><span class="token punctuation">(</span><span class="token class-name">TransactionStatus</span><span class="token punctuation">::</span><span class="token class-name">Idle</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// 5. 
Main message loop</span>        <span class="token keyword">loop</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> <span class="token keyword">mut</span> type_buf <span class="token operator">=</span> <span class="token punctuation">[</span><span class="token number">0u8</span><span class="token punctuation">;</span> <span class="token number">1</span><span class="token punctuation">]</span><span class="token punctuation">;</span>            stream<span class="token punctuation">.</span><span class="token function">read_exact</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> type_buf<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                        <span class="token keyword">match</span> type_buf<span class="token punctuation">[</span><span class="token number">0</span><span class="token punctuation">]</span> <span class="token keyword">as</span> <span class="token keyword">char</span> <span class="token punctuation">&#123;</span>                <span class="token char">'Q'</span> <span class="token operator">=></span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">handle_simple_query</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">,</span>                <span class="token char">'P'</span> <span class="token operator">=></span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">handle_parse</span><span 
class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">,</span>                <span class="token char">'B'</span> <span class="token operator">=></span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">handle_bind</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">,</span>                <span class="token char">'E'</span> <span class="token operator">=></span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">handle_execute</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">,</span>                <span class="token char">'S'</span> <span class="token operator">=></span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">handle_sync</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">,</span>                <span class="token char">'X'</span> <span 
class="token operator">=></span> <span class="token punctuation">&#123;</span>                    <span class="token comment">// Terminate</span>                    <span class="token keyword">return</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token punctuation">&#125;</span>                _ <span class="token operator">=></span> <span class="token keyword">return</span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">ProtocolError</span><span class="token punctuation">::</span><span class="token class-name">UnknownMessage</span><span class="token punctuation">(</span>type_buf<span class="token punctuation">[</span><span class="token number">0</span><span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">async</span> <span class="token keyword">fn</span> <span class="token function-definition function">handle_simple_query</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">,</span> stream<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">TcpStream</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token 
punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">ProtocolError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Read query string</span>        <span class="token keyword">let</span> query <span class="token operator">=</span> <span class="token function">read_null_terminated</span><span class="token punctuation">(</span>stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// Execute query</span>        <span class="token keyword">let</span> result <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>executor<span class="token punctuation">.</span><span class="token function">execute</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>query<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// Send RowDescription (if SELECT)</span>        <span class="token keyword">if</span> <span class="token keyword">let</span> <span class="token class-name">Some</span><span class="token punctuation">(</span>columns<span class="token punctuation">)</span> <span class="token operator">=</span> result<span class="token punctuation">.</span>columns <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> row_desc <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">create_row_description</span><span class="token 
punctuation">(</span><span class="token operator">&amp;</span>columns<span class="token punctuation">)</span><span class="token punctuation">;</span>            stream<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span>row_desc<span class="token punctuation">.</span><span class="token function">serialize</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">.</span>builder<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                        <span class="token comment">// Send DataRows</span>            <span class="token keyword">for</span> row <span class="token keyword">in</span> result<span class="token punctuation">.</span>rows <span class="token punctuation">&#123;</span>                <span class="token keyword">let</span> data_row <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">create_data_row</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>row<span class="token punctuation">)</span><span class="token punctuation">;</span>                stream<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span>data_row<span class="token punctuation">.</span><span class="token function">serialize</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">.</span>builder<span class="token punctuation">)</span><span class="token 
punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>                <span class="token comment">// Send CommandComplete</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>builder<span class="token punctuation">.</span><span class="token function">command_complete</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>result<span class="token punctuation">.</span>command_tag<span class="token punctuation">)</span><span class="token punctuation">;</span>        stream<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">.</span>builder<span class="token punctuation">.</span>buffer<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// Send ReadyForQuery</span>        stream<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>builder<span class="token punctuation">.</span><span class="token function">ready_for_query</span><span class="token punctuation">(</span><span class="token class-name">TransactionStatus</span><span class="token punctuation">::</span><span class="token class-name">Idle</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token 
keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="結果集序列化範例">Result Set Serialization Example</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wire_protocol/result_set.rs</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">ResultSet</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> columns<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Column</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> rows<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Row</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> command_tag<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Column</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> name<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> 
type_oid<span class="token punctuation">:</span> <span class="token keyword">u32</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> type_size<span class="token punctuation">:</span> <span class="token keyword">i16</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Row</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> values<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token operator">>></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">ResultSet</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">send_to</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> stream<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">TcpStream</span><span class="token punctuation">,</span> builder<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">MessageBuilder</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span 
class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token namespace">io<span class="token punctuation">::</span></span><span class="token class-name">Error</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// RowDescription</span>        <span class="token keyword">let</span> fields<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">FieldDescription</span><span class="token operator">></span> <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>columns<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">map</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>col<span class="token closure-punctuation punctuation">|</span></span> <span class="token punctuation">&#123;</span>            <span class="token class-name">FieldDescription</span> <span class="token punctuation">&#123;</span>                name<span class="token punctuation">:</span> col<span class="token punctuation">.</span>name<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                table_oid<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>                column_attr_num<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>                type_oid<span class="token punctuation">:</span> 
col<span class="token punctuation">.</span>type_oid<span class="token punctuation">,</span>                type_size<span class="token punctuation">:</span> col<span class="token punctuation">.</span>type_size<span class="token punctuation">,</span>                type_modifier<span class="token punctuation">:</span> <span class="token operator">-</span><span class="token number">1</span><span class="token punctuation">,</span>                format_code<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>  <span class="token comment">// Text format</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">let</span> row_desc <span class="token operator">=</span> <span class="token class-name">RowDescription</span> <span class="token punctuation">&#123;</span> fields <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        stream<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span>row_desc<span class="token punctuation">.</span><span class="token function">serialize</span><span class="token punctuation">(</span>builder<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// DataRows</span>        <span class="token keyword">for</span> row <span class="token keyword">in</span> <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">.</span>rows <span 
class="token punctuation">&#123;</span>            <span class="token keyword">let</span> values<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">>></span><span class="token operator">></span> <span class="token operator">=</span> row<span class="token punctuation">.</span>values<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span>                <span class="token punctuation">.</span><span class="token function">map</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>v<span class="token closure-punctuation punctuation">|</span></span> v<span class="token punctuation">.</span><span class="token function">as_ref</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">map</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>s<span class="token closure-punctuation punctuation">|</span></span> s<span class="token punctuation">.</span><span class="token function">as_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_vec</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">)</span>                <span class="token punctuation">.</span><span class="token function">collect</span><span 
class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                        <span class="token keyword">let</span> data_row <span class="token operator">=</span> <span class="token class-name">DataRow</span> <span class="token punctuation">&#123;</span>                values<span class="token punctuation">,</span>                format_codes<span class="token punctuation">:</span> <span class="token macro property">vec!</span><span class="token punctuation">[</span><span class="token number">0</span><span class="token punctuation">;</span> <span class="token keyword">self</span><span class="token punctuation">.</span>columns<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">]</span><span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>            stream<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span>data_row<span class="token punctuation">.</span><span class="token function">serialize</span><span class="token punctuation">(</span>builder<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token comment">// CommandComplete</span>        builder<span class="token punctuation">.</span><span class="token function">command_complete</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">.</span>command_tag<span class="token punctuation">)</span><span class="token punctuation">;</span>        stream<span class="token 
punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>builder<span class="token punctuation">.</span>buffer<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span><span class="token comment">// Usage example</span><span class="token keyword">let</span> result <span class="token operator">=</span> <span class="token class-name">ResultSet</span> <span class="token punctuation">&#123;</span>    columns<span class="token punctuation">:</span> <span class="token macro property">vec!</span><span class="token punctuation">[</span>        <span class="token class-name">Column</span> <span class="token punctuation">&#123;</span> name<span class="token punctuation">:</span> <span class="token string">"id"</span><span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> type_oid<span class="token punctuation">:</span> <span class="token number">23</span><span class="token punctuation">,</span> type_size<span class="token punctuation">:</span> <span class="token number">4</span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token class-name">Column</span> <span class="token punctuation">&#123;</span> name<span class="token punctuation">:</span> <span class="token string">"name"</span><span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span 
class="token punctuation">)</span><span class="token punctuation">,</span> type_oid<span class="token punctuation">:</span> <span class="token number">25</span><span class="token punctuation">,</span> type_size<span class="token punctuation">:</span> <span class="token operator">-</span><span class="token number">1</span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>    <span class="token punctuation">]</span><span class="token punctuation">,</span>    rows<span class="token punctuation">:</span> <span class="token macro property">vec!</span><span class="token punctuation">[</span>        <span class="token class-name">Row</span> <span class="token punctuation">&#123;</span> values<span class="token punctuation">:</span> <span class="token macro property">vec!</span><span class="token punctuation">[</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token string">"1"</span><span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token string">"Alice"</span><span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">]</span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token class-name">Row</span> <span class="token punctuation">&#123;</span> values<span class="token punctuation">:</span> <span class="token macro property">vec!</span><span class="token punctuation">[</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span 
class="token string">"2"</span><span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token string">"Bob"</span><span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">]</span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>    <span class="token punctuation">]</span><span class="token punctuation">,</span>    command_tag<span class="token punctuation">:</span> <span class="token string">"SELECT 2"</span><span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token punctuation">;</span>result<span class="token punctuation">.</span><span class="token function">send_to</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> stream<span class="token punctuation">,</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> builder<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span></code></pre><p><strong>What psql receives:</strong></p><pre class="language-none"><code class="language-none">┌─────────────────────────────────────────────────────────────┐│ T (RowDescription)                                          ││   2 columns: id (INT4), name (TEXT)                         
│├─────────────────────────────────────────────────────────────┤│ D (DataRow)                                                 ││   id&#x3D;1, name&#x3D;&quot;Alice&quot;                                        │├─────────────────────────────────────────────────────────────┤│ D (DataRow)                                                 ││   id&#x3D;2, name&#x3D;&quot;Bob&quot;                                          │├─────────────────────────────────────────────────────────────┤│ C (CommandComplete)                                         ││   &quot;SELECT 2&quot;                                                │├─────────────────────────────────────────────────────────────┤│ Z (ReadyForQuery)                                           ││   Status: Idle                                              │└─────────────────────────────────────────────────────────────┘</code></pre><hr /><h2 id="6-PostgreSQL-類型-OID">6 PostgreSQL Type OIDs</h2><h3 id="常見類型">Common Types</h3><table><thead><tr><th>Type name</th><th>OID</th><th>Size (bytes, -1 = variable)</th><th>Description</th></tr></thead><tbody><tr><td><code>BOOL</code></td><td>16</td><td>1</td><td>Boolean</td></tr><tr><td><code>INT2</code> (SMALLINT)</td><td>21</td><td>2</td><td>2-byte integer</td></tr><tr><td><code>INT4</code> (INTEGER)</td><td>23</td><td>4</td><td>4-byte integer</td></tr><tr><td><code>INT8</code> (BIGINT)</td><td>20</td><td>8</td><td>8-byte integer</td></tr><tr><td><code>TEXT</code></td><td>25</td><td>-1</td><td>Variable-length text</td></tr><tr><td><code>VARCHAR</code></td><td>1043</td><td>-1</td><td>Variable-length character string</td></tr><tr><td><code>TIMESTAMP</code></td><td>1114</td><td>8</td><td>Timestamp without time zone</td></tr><tr><td><code>TIMESTAMPTZ</code></td><td>1184</td><td>8</td><td>Timestamp with time zone</td></tr><tr><td><code>FLOAT4</code> (REAL)</td><td>700</td><td>4</td><td>4-byte floating point</td></tr>
<tr><td><code>FLOAT8</code> (DOUBLE PRECISION)</td><td>701</td><td>8</td><td>8-byte floating point</td></tr><tr><td><code>NUMERIC</code></td><td>1700</td><td>-1</td><td>Arbitrary precision</td></tr><tr><td><code>BYTEA</code></td><td>17</td><td>-1</td><td>Binary data</td></tr><tr><td><code>OID</code></td><td>26</td><td>4</td><td>Object identifier</td></tr></tbody></table><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wire_protocol/oids.rs</span><span class="token keyword">pub</span> <span class="token keyword">mod</span> <span class="token module-declaration namespace">oid</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">BOOL</span><span class="token punctuation">:</span> <span class="token keyword">u32</span> <span class="token operator">=</span> <span class="token number">16</span><span class="token punctuation">;</span>    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">INT2</span><span class="token punctuation">:</span> <span class="token keyword">u32</span> <span class="token operator">=</span> <span class="token number">21</span><span class="token punctuation">;</span>    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">INT4</span><span class="token punctuation">:</span> <span class="token keyword">u32</span> <span class="token operator">=</span> <span class="token number">23</span><span class="token punctuation">;</span>    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">INT8</span><span class="token punctuation">:</span> <span class="token keyword">u32</span> <span class="token operator">=</span> <span class="token number">20</span><span class="token punctuation">;</span>    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">TEXT</span><span class="token 
punctuation">:</span> <span class="token keyword">u32</span> <span class="token operator">=</span> <span class="token number">25</span><span class="token punctuation">;</span>    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">VARCHAR</span><span class="token punctuation">:</span> <span class="token keyword">u32</span> <span class="token operator">=</span> <span class="token number">1043</span><span class="token punctuation">;</span>    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">TIMESTAMP</span><span class="token punctuation">:</span> <span class="token keyword">u32</span> <span class="token operator">=</span> <span class="token number">1114</span><span class="token punctuation">;</span>    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">TIMESTAMPTZ</span><span class="token punctuation">:</span> <span class="token keyword">u32</span> <span class="token operator">=</span> <span class="token number">1184</span><span class="token punctuation">;</span>    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">FLOAT4</span><span class="token punctuation">:</span> <span class="token keyword">u32</span> <span class="token operator">=</span> <span class="token number">700</span><span class="token punctuation">;</span>    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">FLOAT8</span><span class="token punctuation">:</span> <span class="token keyword">u32</span> <span class="token operator">=</span> <span class="token number">701</span><span class="token punctuation">;</span>    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">NUMERIC</span><span class="token punctuation">:</span> <span class="token keyword">u32</span> 
<span class="token operator">=</span> <span class="token number">1700</span><span class="token punctuation">;</span>    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">BYTEA</span><span class="token punctuation">:</span> <span class="token keyword">u32</span> <span class="token operator">=</span> <span class="token number">17</span><span class="token punctuation">;</span>    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">OID</span><span class="token punctuation">:</span> <span class="token keyword">u32</span> <span class="token operator">=</span> <span class="token number">26</span><span class="token punctuation">;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="7-用-Rust-建構的挑戰">7 Challenges Building in Rust</h2><h3 id="挑戰-1：非同步-I-O-和借用">Challenge 1: Async I/O and Borrowing</h3><p><strong>Problem:</strong> tokio needs <code>&amp;mut self</code> for async I/O, but we also need to borrow from self.</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ Doesn't compile</span><span class="token keyword">impl</span> <span class="token class-name">ProtocolHandler</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">async</span> <span class="token keyword">fn</span> <span class="token function-definition function">handle_query</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Error</span><span class="token operator">></span> <span 
class="token punctuation">&#123;</span>        <span class="token keyword">let</span> query <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">read_query</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>  <span class="token comment">// Borrows self</span>        <span class="token keyword">let</span> result <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>executor<span class="token punctuation">.</span><span class="token function">execute</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>query<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>  <span class="token comment">// Also borrows self!</span>        <span class="token comment">// Error: cannot borrow as mutable more than once</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Solution: Restructure to avoid simultaneous borrows</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ✅ Works</span><span class="token keyword">impl</span> <span class="token class-name">ProtocolHandler</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">async</span> <span class="token keyword">fn</span> <span class="token function-definition function">handle_query</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span 
class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Error</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> query <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">read_query</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// Release borrow before next operation</span>        <span class="token keyword">let</span> result <span class="token operator">=</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">self</span><span class="token punctuation">.</span>executor<span class="token punctuation">.</span><span class="token function">execute</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>query<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">send_result</span><span class="token punctuation">(</span>result<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>        
<span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="挑戰-2：零拷貝-vs-配置">Challenge 2: Zero-Copy vs. Allocation</h3><p><strong>Problem:</strong> Protocol messages need serialization. Copying is expensive.</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ Allocates on every message</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">serialize_row</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token keyword">let</span> <span class="token keyword">mut</span> buffer <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    buffer<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token char">b'D'</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// ... 
lots of allocations ...</span>    buffer<span class="token punctuation">&#125;</span></code></pre><p><strong>Solution: Reuse buffers</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ✅ Reuses allocated buffer</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">MessageBuilder</span> <span class="token punctuation">&#123;</span>    buffer<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">></span><span class="token punctuation">,</span>  <span class="token comment">// Pre-allocated, reused</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">MessageBuilder</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">with_capacity</span><span class="token punctuation">(</span>capacity<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>            buffer<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">with_capacity</span><span class="token punctuation">(</span>capacity<span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">pub</span> <span class="token 
keyword">fn</span> <span class="token function-definition function">data_row</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">,</span> row<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Row</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token keyword">u8</span><span class="token punctuation">]</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">clear</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>  <span class="token comment">// Reuse capacity</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token char">b'D'</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token comment">// ... 
write to buffer ...</span>        <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">.</span>buffer  <span class="token comment">// Return reference, not owned</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="挑戰-3：跨層錯誤處理">Challenge 3: Error Handling Across Layers</h3><p><strong>Problem:</strong> Protocol errors, query errors, storage errors: each layer has its own error type.</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ Error type explosion</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">Error</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">Io</span><span class="token punctuation">(</span><span class="token namespace">io<span class="token punctuation">::</span></span><span class="token class-name">Error</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">Protocol</span><span class="token punctuation">(</span><span class="token class-name">ProtocolError</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">Query</span><span class="token punctuation">(</span><span class="token class-name">QueryError</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">Storage</span><span class="token punctuation">(</span><span class="token class-name">StorageError</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token class-name">Transaction</span><span class="token punctuation">(</span><span class="token class-name">TransactionError</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token comment">// ... 
20 more variants ...</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Solution: Use <code>thiserror</code> and conversion traits</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ✅ Clean error handling</span><span class="token attribute attr-name">#[derive(Debug, thiserror::Error)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">ProtocolError</span> <span class="token punctuation">&#123;</span>    <span class="token attribute attr-name">#[error(<span class="token string">"IO error: &#123;0&#125;"</span>)]</span>    <span class="token class-name">Io</span><span class="token punctuation">(</span><span class="token attribute attr-name">#[from]</span> <span class="token namespace">io<span class="token punctuation">::</span></span><span class="token class-name">Error</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token attribute attr-name">#[error(<span class="token string">"Invalid message type: &#123;0&#125;"</span>)]</span>    <span class="token class-name">UnknownMessage</span><span class="token punctuation">(</span><span class="token keyword">u8</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token attribute attr-name">#[error(<span class="token string">"Query error: &#123;0&#125;"</span>)]</span>    <span class="token class-name">Query</span><span class="token punctuation">(</span><span class="token attribute attr-name">#[from]</span> <span class="token class-name">QueryError</span><span class="token punctuation">)</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token comment">// Use ? 
operator for automatic conversion</span><span class="token keyword">pub</span> <span class="token keyword">async</span> <span class="token keyword">fn</span> <span class="token function-definition function">handle_query</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">ProtocolError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token keyword">let</span> query <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">read_query</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>  <span class="token comment">// io::Error → ProtocolError</span>    <span class="token keyword">let</span> result <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>executor<span class="token punctuation">.</span><span class="token function">execute</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>query<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>  <span class="token comment">// QueryError → ProtocolError</span>    <span class="token class-name">Ok</span><span class="token 
punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="8-AI-如何加速這項工作">8 How AI Accelerated This Work</h2><h3 id="AI-做對了什麼">What AI Got Right</h3><table><thead><tr><th>Task</th><th>AI Contribution</th></tr></thead><tbody><tr><td><strong>Message format</strong></td><td>Correct big-endian encoding</td></tr><tr><td><strong>Extended query flow</strong></td><td>Parse → Bind → Execute sequence</td></tr><tr><td><strong>Type OIDs</strong></td><td>Accurate PostgreSQL type OIDs</td></tr><tr><td><strong>NULL handling</strong></td><td>-1 length prefix for NULL</td></tr></tbody></table><hr /><h3 id="AI-做錯了什麼">What AI Got Wrong</h3><table><thead><tr><th>Issue</th><th>What Happened</th></tr></thead><tbody><tr><td><strong>Length calculation</strong></td><td>First draft didn’t count the 4 length bytes in the length field</td></tr><tr><td><strong>Startup message</strong></td><td>Tried to add a type byte (startup has none!)</td></tr><tr><td><strong>Binary format</strong></td><td>Suggested little-endian (PostgreSQL uses big-endian)</td></tr><tr><td><strong>Portal lifecycle</strong></td><td>Overlooked that the portal is destroyed after Execute</td></tr></tbody></table><p><strong>Pattern:</strong> Wire protocols are exacting. Off-by-one errors break everything.</p><hr /><h3 id="範例：除錯-psql-連接">Example: Debugging a psql Connection</h3><p><strong>My question to AI:</strong></p><blockquote><p>“psql connects but immediately disconnects. What’s wrong?”</p></blockquote><p><strong>What I learned:</strong></p><ol><li>psql expects specific ParameterStatus messages</li><li>A missing <code>server_version</code> causes a silent disconnect</li><li>ReadyForQuery must be sent after authentication</li></ol><p><strong>Result:</strong> Added the required parameters:</p><pre class="language-rust" data-language="rust"><code class="language-rust">stream<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>builder<span class="token punctuation">.</span><span class="token function">parameter_status</span><span class="token punctuation">(</span><span class="token string">"server_version"</span><span class="token punctuation">,</span> <span class="token string">"16.0"</span><span class="token punctuation">)</span><span 
class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>stream<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>builder<span class="token punctuation">.</span><span class="token function">parameter_status</span><span class="token punctuation">(</span><span class="token string">"server_encoding"</span><span class="token punctuation">,</span> <span class="token string">"UTF8"</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>stream<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>builder<span class="token punctuation">.</span><span class="token function">parameter_status</span><span class="token punctuation">(</span><span class="token string">"client_encoding"</span><span class="token punctuation">,</span> <span class="token string">"UTF8"</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>stream<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>builder<span class="token punctuation">.</span><span class="token function">ready_for_query</span><span class="token punctuation">(</span><span class="token 
class-name">TransactionStatus</span><span class="token punctuation">::</span><span class="token class-name">Idle</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span></code></pre><p><strong>現在 psql 連接成功！</strong></p><pre class="language-none"><code class="language-none">$ psql -h localhost -p 5432 -U neo vaultgrespsql (16.0, server 16.0 (Vaultgres))Type &quot;help&quot; for help.vaultgres&#x3D;&gt; SELECT 1; ?column? ----------        1(1 row)</code></pre><hr /><h2 id="總結：通訊協定一張圖">總結：通訊協定一張圖</h2><pre class="language-MERMAID_BASE64_623" data-language="MERMAID_BASE64_623"><code class="language-MERMAID_BASE64_623">Zmxvd2NoYXJ0IFRECiAgICBzdWJncmFwaCAiQ29ubmVjdGlvbiBTdGFydHVwIgogICAgICAgIEFbQ2xpZW50IGNvbm5lY3RzXSAtLT4gQltTdGFydHVwTWVzc2FnZV0KICAgICAgICBCIC0tPiBDW0F1dGhlbnRpY2F0aW9uT2tdCiAgICAgICAgQyAtLT4gRFtQYXJhbWV0ZXJTdGF0dXNdCiAgICAgICAgRCAtLT4gRVtSZWFkeUZvclF1ZXJ5XQogICAgZW5kCgogICAgc3ViZ3JhcGggIlNpbXBsZSBRdWVyeSIKICAgICAgICBGW1F1ZXJ5ICdRJ10gLS0+IEdbUm93RGVzY3JpcHRpb24gJ1QnXQogICAgICAgIEcgLS0+IEhbRGF0YVJvdyAnRCcgw5cgTl0KICAgICAgICBIIC0tPiBJW0NvbW1hbmRDb21wbGV0ZSAnQyddCiAgICAgICAgSSAtLT4gRQogICAgZW5kCgogICAgc3ViZ3JhcGggIkV4dGVuZGVkIFF1ZXJ5IgogICAgICAgIEpbUGFyc2UgJ1AnXSAtLT4gS1tQYXJzZUNvbXBsZXRlICcxJ10KICAgICAgICBLIC0tPiBMW0JpbmQgJ0InXQogICAgICAgIEwgLS0+IE1bQmluZENvbXBsZXRlICcyJ10KICAgICAgICBNIC0tPiBOW0V4ZWN1dGUgJ0UnXQogICAgICAgIE4gLS0+IEgKICAgICAgICBPW1N5bmMgJ1MnXSAtLT4gSQogICAgZW5kCgogICAgc3ViZ3JhcGggIk1lc3NhZ2UgRm9ybWF0IgogICAgICAgIFBbVHlwZSAxQl0gLS0+IFFbTGVuZ3RoIDRCIEJFXQogICAgICAgIFEgLS0+IFJbUGF5bG9hZF0KICAgIGVuZAoKICAgIHN1YmdyYXBoICJSZXN1bHQgU2VyaWFsaXphdGlvbiIKICAgICAgICBTW1JvdzogaWQ9MSwgbmFtZT0nQWxpY2UnXSAtLT4gVFtEYXRhUm93OiAnRCcgKyBsZW4gKyB2YWx1ZXNdCiAgICBlbmQKCiAgICBzdHlsZSBCIGZpbGw6I2UzZjJmZCxzdHJva2U6IzE5NzZkMgogICAgc3R5bGUgRiBmaWxsOiNlOGY1ZTksc3Ryb2tlOiMzODhlM2MKICAgIHN
0eWxlIEogZmlsbDojZThmNWU5LHN0cm9rZTojMzg4ZTNjCiAgICBzdHlsZSBQIGZpbGw6I2ZmZjNlMCxzdHJva2U6I2Y1N2MwMA&#x3D;&#x3D;</code></pre><p><strong>Key takeaways:</strong></p><table><thead><tr><th>Concept</th><th>Why It Matters</th></tr></thead><tbody><tr><td><strong>Wire protocol</strong></td><td>Compatibility with existing PostgreSQL tools</td></tr><tr><td><strong>Message framing</strong></td><td>Length-prefixed binary protocol</td></tr><tr><td><strong>Simple vs. extended</strong></td><td>Quick queries vs. prepared statements</td></tr><tr><td><strong>RowDescription</strong></td><td>Column metadata for clients</td></tr><tr><td><strong>DataRow</strong></td><td>Actual row data (text or binary)</td></tr><tr><td><strong>Type OIDs</strong></td><td>PostgreSQL type identification</td></tr><tr><td><strong>NULL encoding</strong></td><td>-1 length prefix</td></tr></tbody></table><hr /><p><strong>Further reading:</strong></p><ul><li>PostgreSQL Source: <a href="https://github.com/postgres/postgres/blob/master/src/backend/tcop/postgres.c"><code>src/backend/tcop/postgres.c</code></a></li><li>PostgreSQL Source: <a href="https://github.com/postgres/postgres/blob/master/src/include/libpq/pqformat.h"><code>src/include/libpq/pqformat.h</code></a></li><li>“PostgreSQL Wire Protocol” documentation: <a href="https://www.postgresql.org/docs/current/protocol.html">https://www.postgresql.org/docs/current/protocol.html</a></li><li>libpq source: <a href="https://github.com/postgres/postgres/tree/master/src/interfaces/libpq"><code>src/interfaces/libpq/</code></a></li><li>Vaultgres Repository: <a href="https://github.com/neoalienson/Vaultgres">github.com/neoalienson/Vaultgres</a></li></ul>]]></content>
    
    
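    <summary type="html">Vaultgres journey, part 5: implementing the PostgreSQL wire protocol. A deep dive into message framing, the startup handshake, the extended query protocol, and serializing result sets that psql and drivers can understand.</summary>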
    
    
    
    <category term="Development" scheme="https://neo01.com/categories/Development/"/>
    
    
    <category term="PostgreSQL" scheme="https://neo01.com/tags/PostgreSQL/"/>
    
    <category term="Database Internals" scheme="https://neo01.com/tags/Database-Internals/"/>
    
    <category term="Rust" scheme="https://neo01.com/tags/Rust/"/>
    
  </entry>
  
  <entry>
    <title>Database in Rust: Wire Protocol and Result Set Serialization</title>
    <link href="https://neo01.com/2026/03/Database-Rust-Wire-Protocol-Result-Set/"/>
    <id>https://neo01.com/2026/03/Database-Rust-Wire-Protocol-Result-Set/</id>
    <published>2026-03-04T16:00:00.000Z</published>
    <updated>2026-03-14T06:22:07.840Z</updated>
    
    <content type="html"><![CDATA[<p>In <a href="/2026/03/Database-Rust-WAL-Crash-Recovery-ARIES/">Part 4</a>, we built WAL and crash recovery. Our database can now survive power failures. But there’s a problem.</p><p><strong>How do clients actually talk to our database?</strong></p><pre class="language-none"><code class="language-none">┌─────────────┐                          ┌─────────────┐│   psql      │                          │  Vaultgres  ││   client    │                          │   server    ││             │     ??? How to talk ???  │             │└─────────────┘                          └─────────────┘</code></pre><p>We could invent our own protocol. But then we’d need to build a client from scratch.</p><p><strong>Better approach:</strong> Speak PostgreSQL’s wire protocol. Then <code>psql</code>, JDBC, libpq—all existing tools—just work.</p><p>Today: implementing the PostgreSQL wire protocol in Rust, from startup handshake to result set serialization.</p><hr /><h2 id="1-The-Wire-Protocol-Overview">1 The Wire Protocol Overview</h2><h3 id="Frontend-Backend-Model">Frontend/Backend Model</h3><p>PostgreSQL uses a <strong>frontend/backend</strong> architecture:</p><pre class="language-none"><code class="language-none">┌─────────────────────────────────────────────────────────────┐│                    PostgreSQL Protocol                       │├─────────────────────────────────────────────────────────────┤│                                                              ││  Frontend (Client)          Backend (Server)                ││  - psql                     - Vaultgres                     ││  - libpq (C driver)         - Query processor               ││  - JDBC&#x2F;ODBC              - Storage engine                 ││  - psycopg (Python)         - Transaction manager           ││                                                              ││  Communication: TCP&#x2F;IP (usually port 5432)                  ││  Message format: Length-prefixed binary protocol   
         ││                                                              │└─────────────────────────────────────────────────────────────┘</code></pre><hr /><h3 id="Message-Structure">Message Structure</h3><p>Every message after startup has the same format:</p><pre class="language-none"><code class="language-none">┌─────────────────────────────────────────────────────────────┐│ Message Format                                              │├─────────────────────────────────────────────────────────────┤│ ┌─────────────┬─────────────────────────────────────────┐   ││ │ Type (1B)   │ Length (4B, includes itself)            │   ││ ├─────────────┴─────────────────────────────────────────┤   ││ │ Payload (variable)                                     │   ││ └─────────────────────────────────────────────────────────┘   │└─────────────────────────────────────────────────────────────┘Example: SimpleQuery (&#39;Q&#39;)┌─────────────────────────────────────────────────────────────┐│ &#39;Q&#39; │ 0x00 0x00 0x00 0x18 │ &quot;SELECT * FROM users\0&quot;        ││  1B │      4B (24 bytes)   │ variable (null-terminated)     │└─────────────────────────────────────────────────────────────┘</code></pre><p><strong>Key insight:</strong> Length is <strong>big-endian</strong> (network byte order) and <strong>includes itself</strong> (not the type byte). In the example, the payload is 20 bytes (19 characters plus the null terminator), so the length field holds 4 + 20 = 24 (0x18).</p><hr /><h3 id="Message-Types">Message Types</h3><table><thead><tr><th>Type</th><th>Code</th><th>Direction</th><th>Purpose</th></tr></thead><tbody><tr><td><strong>StartupMessage</strong></td><td>(none)</td><td>F→B</td><td>Initial connection (no type byte)</td></tr><tr><td><strong>AuthenticationOk</strong></td><td>‘R’</td><td>B→F</td><td>Login successful</td></tr><tr><td><strong>Query</strong></td><td>‘Q’</td><td>F→B</td><td>Simple query (SQL string)</td></tr><tr><td><strong>RowDescription</strong></td><td>‘T’</td><td>B→F</td><td>Column metadata</td></tr><tr><td><strong>DataRow</strong></td><td>‘D’</td><td>B→F</td><td>Actual row 
data</td></tr><tr><td><strong>CommandComplete</strong></td><td>‘C’</td><td>B→F</td><td>Query finished</td></tr><tr><td><strong>ReadyForQuery</strong></td><td>‘Z’</td><td>B→F</td><td>Server ready for next query</td></tr><tr><td><strong>ErrorResponse</strong></td><td>‘E’</td><td>B→F</td><td>Something went wrong</td></tr><tr><td><strong>Parse</strong></td><td>‘P’</td><td>F→B</td><td>Extended query: prepare</td></tr><tr><td><strong>Bind</strong></td><td>‘B’</td><td>F→B</td><td>Extended query: bind parameters</td></tr><tr><td><strong>Execute</strong></td><td>‘E’</td><td>F→B</td><td>Extended query: run</td></tr><tr><td><strong>Sync</strong></td><td>‘S’</td><td>F→B</td><td>Extended query: finish batch</td></tr></tbody></table><p>F→B = Frontend to Backend, B→F = Backend to Frontend</p><hr /><h2 id="2-Connection-Startup">2 Connection Startup</h2><h3 id="The-Handshake-Flow">The Handshake Flow</h3><pre class="language-MERMAID_BASE64_624" data-language="MERMAID_BASE64_624"><code class="language-MERMAID_BASE64_624">c2VxdWVuY2VEaWFncmFtCiAgICBwYXJ0aWNpcGFudCBDbGllbnQKICAgIHBhcnRpY2lwYW50IFNlcnZlcgoKICAgIENsaWVudC0+PlNlcnZlcjogU3RhcnR1cE1lc3NhZ2UgKHVzZXIsIGRhdGFiYXNlLCBvcHRpb25zKQogICAgU2VydmVyLT4+Q2xpZW50OiBBdXRoZW50aWNhdGlvbk9rCiAgICBTZXJ2ZXItPj5DbGllbnQ6IFBhcmFtZXRlclN0YXR1cyAoc2VydmVyX3ZlcnNpb24sIGVuY29kaW5nLCAuLi4pCiAgICBTZXJ2ZXItPj5DbGllbnQ6IFJlYWR5Rm9yUXVlcnkgKGlkbGUpCiAgICAKICAgIENsaWVudC0+PlNlcnZlcjogUXVlcnkgLyBFeHRlbmRlZCBRdWVyeQogICAgU2VydmVyLT4+Q2xpZW50OiBSb3dEZXNjcmlwdGlvbiAoZm9yIFNFTEVDVCkKICAgIFNlcnZlci0+PkNsaWVudDogRGF0YVJvdyDDlyBOCiAgICBTZXJ2ZXItPj5DbGllbnQ6IENvbW1hbmRDb21wbGV0ZQogICAgU2VydmVyLT4+Q2xpZW50OiBSZWFkeUZvclF1ZXJ5IChpZGxlKQ&#x3D;&#x3D;</code></pre><hr /><h3 id="StartupMessage">StartupMessage</h3><p>The first message is special—<strong>no type byte</strong>, just length:</p><pre class="language-none"><code class="language-none">┌─────────────────────────────────────────────────────────────┐│ StartupMessage                                              
│├─────────────────────────────────────────────────────────────┤│ Length (4B): 8 + parameters                                 ││ Protocol Version (4B): 196608 (3.0)                         ││ Parameters (null-terminated key&#x3D;value pairs):               ││   &quot;user\0neo\0database\0vaultgres\0\0&quot;                      │└─────────────────────────────────────────────────────────────┘</code></pre><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wire_protocol/startup.rs</span><span class="token keyword">use</span> <span class="token namespace">std<span class="token punctuation">::</span>collections<span class="token punctuation">::</span></span><span class="token class-name">HashMap</span><span class="token punctuation">;</span><span class="token keyword">use</span> <span class="token namespace">tokio<span class="token punctuation">::</span>io<span class="token punctuation">::</span></span><span class="token punctuation">&#123;</span><span class="token class-name">AsyncReadExt</span><span class="token punctuation">,</span> <span class="token class-name">AsyncWriteExt</span><span class="token punctuation">&#125;</span><span class="token punctuation">;</span><span class="token keyword">use</span> <span class="token namespace">tokio<span class="token punctuation">::</span>net<span class="token punctuation">::</span></span><span class="token class-name">TcpStream</span><span class="token punctuation">;</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">StartupMessage</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> user<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> database<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> options<span class="token punctuation">:</span> <span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token 
class-name">String</span><span class="token punctuation">,</span> <span class="token class-name">String</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">StartupMessage</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">async</span> <span class="token keyword">fn</span> <span class="token function-definition function">read_from</span><span class="token punctuation">(</span>stream<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">TcpStream</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token keyword">Self</span><span class="token punctuation">,</span> <span class="token class-name">ProtocolError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Read length (4 bytes, big-endian)</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> len_buf <span class="token operator">=</span> <span class="token punctuation">[</span><span class="token number">0u8</span><span class="token punctuation">;</span> <span class="token number">4</span><span class="token punctuation">]</span><span class="token punctuation">;</span>        stream<span class="token punctuation">.</span><span class="token function">read_exact</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> len_buf<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token 
punctuation">;</span>        <span class="token keyword">let</span> len <span class="token operator">=</span> <span class="token keyword">u32</span><span class="token punctuation">::</span><span class="token function">from_be_bytes</span><span class="token punctuation">(</span>len_buf<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token comment">// Read protocol version</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> version_buf <span class="token operator">=</span> <span class="token punctuation">[</span><span class="token number">0u8</span><span class="token punctuation">;</span> <span class="token number">4</span><span class="token punctuation">]</span><span class="token punctuation">;</span>        stream<span class="token punctuation">.</span><span class="token function">read_exact</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> version_buf<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> version <span class="token operator">=</span> <span class="token keyword">u32</span><span class="token punctuation">::</span><span class="token function">from_be_bytes</span><span class="token punctuation">(</span>version_buf<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">if</span> version <span class="token operator">!=</span> <span class="token number">196608</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">return</span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">ProtocolError</span><span class="token punctuation">::</span><span 
class="token class-name">UnsupportedVersion</span><span class="token punctuation">(</span>version<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token comment">// Read parameters (null-terminated key=value pairs)</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> params <span class="token operator">=</span> <span class="token class-name">HashMap</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> remaining <span class="token operator">=</span> len<span class="token punctuation">.</span><span class="token function">saturating_sub</span><span class="token punctuation">(</span><span class="token number">8</span><span class="token punctuation">)</span><span class="token punctuation">;</span>  <span class="token comment">// Subtract length and version bytes (never underflows)</span>        <span class="token comment">// Loop until the trailing \0 terminator has been read as well</span>        <span class="token keyword">while</span> remaining <span class="token operator">></span> <span class="token number">0</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> <span class="token keyword">mut</span> key <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token keyword">let</span> <span class="token keyword">mut</span> byte <span class="token operator">=</span> <span class="token punctuation">[</span><span class="token number">0u8</span><span class="token punctuation">;</span> <span class="token number">1</span><span class="token punctuation">]</span><span class="token punctuation">;</span>                  
      <span class="token keyword">loop</span> <span class="token punctuation">&#123;</span>                stream<span class="token punctuation">.</span><span class="token function">read_exact</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> byte<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                remaining <span class="token operator">-=</span> <span class="token number">1</span><span class="token punctuation">;</span>                <span class="token keyword">if</span> byte<span class="token punctuation">[</span><span class="token number">0</span><span class="token punctuation">]</span> <span class="token operator">==</span> <span class="token number">0</span> <span class="token punctuation">&#123;</span> <span class="token keyword">break</span><span class="token punctuation">;</span> <span class="token punctuation">&#125;</span>                key<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span>byte<span class="token punctuation">[</span><span class="token number">0</span><span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>            <span class="token keyword">if</span> key<span class="token punctuation">.</span><span class="token function">is_empty</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span> <span class="token keyword">break</span><span class="token punctuation">;</span> <span class="token punctuation">&#125;</span>  <span class="token comment">// Empty key = end of parameters</span>            <span class="token keyword">let</span> <span 
class="token keyword">mut</span> value <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token keyword">loop</span> <span class="token punctuation">&#123;</span>                stream<span class="token punctuation">.</span><span class="token function">read_exact</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> byte<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                remaining <span class="token operator">-=</span> <span class="token number">1</span><span class="token punctuation">;</span>                <span class="token keyword">if</span> byte<span class="token punctuation">[</span><span class="token number">0</span><span class="token punctuation">]</span> <span class="token operator">==</span> <span class="token number">0</span> <span class="token punctuation">&#123;</span> <span class="token keyword">break</span><span class="token punctuation">;</span> <span class="token punctuation">&#125;</span>                value<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span>byte<span class="token punctuation">[</span><span class="token number">0</span><span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>            <span class="token keyword">let</span> key <span class="token operator">=</span> <span class="token class-name">String</span><span class="token punctuation">::</span><span class="token 
function">from_utf8</span><span class="token punctuation">(</span>key<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>            <span class="token keyword">let</span> value <span class="token operator">=</span> <span class="token class-name">String</span><span class="token punctuation">::</span><span class="token function">from_utf8</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>            params<span class="token punctuation">.</span><span class="token function">insert</span><span class="token punctuation">(</span>key<span class="token punctuation">,</span> value<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>            user<span class="token punctuation">:</span> params<span class="token punctuation">.</span><span class="token function">remove</span><span class="token punctuation">(</span><span class="token string">"user"</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">unwrap_or_default</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            database<span class="token punctuation">:</span> params<span class="token punctuation">.</span><span class="token function">remove</span><span class="token punctuation">(</span><span class="token string">"database"</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">unwrap_or_default</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span 
class="token punctuation">,</span>            options<span class="token punctuation">:</span> params<span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="Authentication-and-ParameterStatus">Authentication and ParameterStatus</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wire_protocol/messages.rs</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">MessageBuilder</span> <span class="token punctuation">&#123;</span>    buffer<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">MessageBuilder</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span> buffer<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span 
class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">authentication_ok</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token keyword">u8</span><span class="token punctuation">]</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// 'R' (1B) + Length (4B) + Auth Type (4B = 0 for Ok)</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">clear</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token char">b'R'</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token number">8u32</span><span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>  <span class="token comment">// Length: 4 (itself) + 4 (auth type)</span>        <span class="token keyword">self</span><span class="token 
punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token number">0u32</span><span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>   <span class="token comment">// AuthOk</span>        <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">.</span>buffer    <span class="token punctuation">&#125;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">parameter_status</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">,</span> name<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">,</span> value<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token keyword">u8</span><span class="token punctuation">]</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// 'S' (1B) + Length (4B) + name\0 + value\0</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">clear</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span 
class="token punctuation">;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token char">b'S'</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">let</span> payload_len <span class="token operator">=</span> <span class="token number">4</span> <span class="token operator">+</span> name<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">+</span> <span class="token number">1</span> <span class="token operator">+</span> value<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token punctuation">(</span>payload_len <span class="token keyword">as</span> <span class="token keyword">u32</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token 
punctuation">(</span>name<span class="token punctuation">.</span><span class="token function">as_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span>value<span class="token punctuation">.</span><span class="token function">as_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">.</span>buffer    <span class="token punctuation">&#125;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">ready_for_query</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">,</span> status<span class="token punctuation">:</span> <span 
class="token class-name">TransactionStatus</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token keyword">u8</span><span class="token punctuation">]</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// 'Z' (1B) + Length (4B) + Status (1B)</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">clear</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token char">b'Z'</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token number">5u32</span><span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span>status <span class="token keyword">as</span> <span class="token keyword">u8</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token 
operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">.</span>buffer    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone, Copy)]</span><span class="token attribute attr-name">#[repr(u8)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">TransactionStatus</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">Idle</span> <span class="token operator">=</span> <span class="token char">b'I'</span><span class="token punctuation">,</span>    <span class="token class-name">InTransaction</span> <span class="token operator">=</span> <span class="token char">b'T'</span><span class="token punctuation">,</span>    <span class="token class-name">InFailedTransaction</span> <span class="token operator">=</span> <span class="token char">b'E'</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Server sends these parameters:</strong></p><table><thead><tr><th>Parameter</th><th>Value</th><th>Purpose</th></tr></thead><tbody><tr><td><code>server_version</code></td><td><code>16.0</code></td><td>PostgreSQL version we’re emulating</td></tr><tr><td><code>server_encoding</code></td><td><code>UTF8</code></td><td>Character encoding</td></tr><tr><td><code>client_encoding</code></td><td><code>UTF8</code></td><td>Client’s encoding</td></tr><tr><td><code>integer_datetimes</code></td><td><code>on</code></td><td>64-bit integer timestamps</td></tr></tbody></table><hr /><h2 id="3-Simple-Query-Protocol">3 Simple Query Protocol</h2><h3 id="Query-Flow">Query Flow</h3><pre class="language-none"><code class="language-none">Client: Query(&quot;SELECT id, name FROM users WHERE id &#x3D; 1&quot;)Server: RowDescription (column metadata)Server: DataRow (row 1)Server: DataRow (row 2)...Server: 
CommandComplete (&quot;SELECT 2&quot;)Server: ReadyForQuery (&#39;I&#39;)</code></pre><hr /><h3 id="RowDescription-Telling-Clients-About-Columns">RowDescription: Telling Clients About Columns</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wire_protocol/row_description.rs</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">FieldDescription</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> name<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> table_oid<span class="token punctuation">:</span> <span class="token keyword">u32</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> column_attr_num<span class="token punctuation">:</span> <span class="token keyword">i16</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> type_oid<span class="token punctuation">:</span> <span class="token keyword">u32</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> type_size<span class="token punctuation">:</span> <span class="token keyword">i16</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> type_modifier<span class="token punctuation">:</span> <span class="token keyword">i32</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> format_code<span class="token punctuation">:</span> <span class="token keyword">i16</span><span class="token punctuation">,</span>  <span class="token comment">// 0 = text, 1 = binary</span><span class="token punctuation">&#125;</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition 
class-name">RowDescription</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> fields<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">FieldDescription</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">RowDescription</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">serialize</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> builder<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">MessageBuilder</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token keyword">u8</span><span class="token punctuation">]</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// 'T' (1B) + Length (4B) + Num Fields (2B) + Fields...</span>        builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">clear</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token char">b'T'</span><span class="token punctuation">)</span><span class="token punctuation">;</span>     
           <span class="token comment">// Message length: Length field itself (4B) + field count (2B) + per field: name + NUL + 18 fixed bytes (4+2+4+2+4+2)</span>        <span class="token keyword">let</span> payload_len <span class="token operator">=</span> <span class="token number">4</span> <span class="token operator">+</span> <span class="token number">2</span> <span class="token operator">+</span> <span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>fields<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">*</span> <span class="token number">18</span><span class="token punctuation">)</span> <span class="token operator">+</span>             <span class="token keyword">self</span><span class="token punctuation">.</span>fields<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">map</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>f<span class="token closure-punctuation punctuation">|</span></span> f<span class="token punctuation">.</span>name<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">sum</span><span class="token punctuation">::</span><span class="token operator">&lt;</span><span class="token keyword">usize</span><span class="token operator">></span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                builder<span class="token 
punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token punctuation">(</span>payload_len <span class="token keyword">as</span> <span class="token keyword">u32</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>fields<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">i16</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">for</span> field <span class="token keyword">in</span> <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">.</span>fields <span class="token punctuation">&#123;</span>            builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span>field<span class="token punctuation">.</span>name<span class="token 
punctuation">.</span><span class="token function">as_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span><span class="token punctuation">;</span>  <span class="token comment">// Null terminator</span>            builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>field<span class="token punctuation">.</span>table_oid<span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>field<span class="token punctuation">.</span>column_attr_num<span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>field<span class="token punctuation">.</span>type_oid<span class="token 
punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>field<span class="token punctuation">.</span>type_size<span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>field<span class="token punctuation">.</span>type_modifier<span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>field<span class="token punctuation">.</span>format_code<span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token operator">&amp;</span>builder<span class="token 
punctuation">.</span>buffer    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Example output:</strong></p><pre class="language-none"><code class="language-none">SELECT id, name FROM usersRowDescription:┌─────────────────────────────────────────────────────────────┐│ &#39;T&#39; │ Length │ 2 fields                                     │├─────────────────────────────────────────────────────────────┤│ Field 1: &quot;id&quot;                                               ││   table_oid: 16384                                          ││   column_attr_num: 1                                        ││   type_oid: 23 (INT4)                                       ││   type_size: 4                                              ││   type_modifier: -1                                         ││   format_code: 0 (text)                                     │├─────────────────────────────────────────────────────────────┤│ Field 2: &quot;name&quot;                                             ││   table_oid: 16384                                          ││   column_attr_num: 2                                        ││   type_oid: 25 (TEXT)                                       ││   type_size: -1 (variable)                                  ││   type_modifier: -1                                         ││   format_code: 0 (text)                                     │└─────────────────────────────────────────────────────────────┘</code></pre><hr /><h3 id="DataRow-Serializing-Actual-Rows">DataRow: Serializing Actual Rows</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wire_protocol/data_row.rs</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">DataRow</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> values<span class="token 
punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">>></span><span class="token operator">></span><span class="token punctuation">,</span>  <span class="token comment">// None = NULL</span>    <span class="token keyword">pub</span> format_codes<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">i16</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">DataRow</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">serialize</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> builder<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">MessageBuilder</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token keyword">u8</span><span class="token punctuation">]</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// 'D' (1B) + Length (4B) + Num Values (2B) + Values...</span>        builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">clear</span><span class="token punctuation">(</span><span 
class="token punctuation">)</span><span class="token punctuation">;</span>        builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token char">b'D'</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// Calculate message length (counts the 4-byte Length field, excludes the type byte)</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> payload_len <span class="token operator">=</span> <span class="token number">6u32</span><span class="token punctuation">;</span>  <span class="token comment">// Length field (4B) + num values (2B)</span>        <span class="token keyword">for</span> value <span class="token keyword">in</span> <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">.</span>values <span class="token punctuation">&#123;</span>            payload_len <span class="token operator">+=</span> <span class="token number">4</span><span class="token punctuation">;</span>  <span class="token comment">// Length prefix</span>            <span class="token keyword">if</span> <span class="token keyword">let</span> <span class="token class-name">Some</span><span class="token punctuation">(</span>data<span class="token punctuation">)</span> <span class="token operator">=</span> value <span class="token punctuation">&#123;</span>                payload_len <span class="token operator">+=</span> data<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u32</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>                builder<span class="token 
punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>payload_len<span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>values<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">i16</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">for</span> value <span class="token keyword">in</span> <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">.</span>values <span class="token punctuation">&#123;</span>            <span class="token keyword">match</span> value <span class="token punctuation">&#123;</span>                <span class="token class-name">None</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                    <span class="token comment">// NULL: length = -1</span>                    builder<span class="token 
punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token punctuation">(</span><span class="token operator">-</span><span class="token number">1i32</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token punctuation">&#125;</span>                <span class="token class-name">Some</span><span class="token punctuation">(</span>data<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                    <span class="token comment">// Non-NULL: length + data</span>                    builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token punctuation">(</span>data<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">i32</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                    builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token 
punctuation">(</span>data<span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token punctuation">&#125;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>                <span class="token operator">&amp;</span>builder<span class="token punctuation">.</span>buffer    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Example (text format):</strong></p><pre class="language-none"><code class="language-none">Row: id&#x3D;1, name&#x3D;&quot;Alice&quot;, email&#x3D;NULLDataRow:┌─────────────────────────────────────────────────────────────┐│ &#39;D&#39; │ Length │ 3 values                                     │├─────────────────────────────────────────────────────────────┤│ Value 1: 1 byte  │ &quot;1&quot;                                      ││ Value 2: 5 bytes │ &quot;Alice&quot;                                  ││ Value 3: -1 (NULL)                                          │└─────────────────────────────────────────────────────────────┘</code></pre><hr /><h3 id="Text-vs-Binary-Format">Text vs. 
Binary Format</h3><p><strong>Text format (format_code = 0):</strong> Every value is sent as a human-readable string</p><pre class="language-none"><code class="language-none">INT4: &quot;42&quot;TEXT: &quot;Alice&quot;TIMESTAMP: &quot;2026-03-29 14:30:00.123456+00&quot;</code></pre><p><strong>Binary format (format_code = 1):</strong> A type-specific encoding; integers and timestamps are fixed-width and big-endian (network byte order)</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wire_protocol/type_encoding.rs</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">encode_int4</span><span class="token punctuation">(</span>value<span class="token punctuation">:</span> <span class="token keyword">i32</span><span class="token punctuation">,</span> format<span class="token punctuation">:</span> <span class="token keyword">i16</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token keyword">match</span> format <span class="token punctuation">&#123;</span>        <span class="token number">0</span> <span class="token operator">=></span> value<span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">into_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>  <span class="token comment">// Text</span>        <span class="token number">1</span> <span class="token operator">=></span> value<span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token 
punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_vec</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token comment">// Binary</span>        _ <span class="token operator">=></span> <span class="token macro property">panic!</span><span class="token punctuation">(</span><span class="token string">"Invalid format code"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">encode_text</span><span class="token punctuation">(</span>value<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">,</span> format<span class="token punctuation">:</span> <span class="token keyword">i16</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token keyword">match</span> format <span class="token punctuation">&#123;</span>        <span class="token number">0</span> <span class="token operator">=></span> value<span class="token punctuation">.</span><span class="token function">as_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_vec</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>       <span class="token comment">// Text (UTF-8)</span>        <span class="token 
number">1</span> <span class="token operator">=></span> value<span class="token punctuation">.</span><span class="token function">as_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_vec</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>       <span class="token comment">// Binary: the same raw bytes; DataRow already writes the 4-byte length prefix</span>        _ <span class="token operator">=></span> <span class="token macro property">panic!</span><span class="token punctuation">(</span><span class="token string">"Invalid format code"</span><span class="token 
punctuation">)</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">encode_timestamp</span><span class="token punctuation">(</span>value<span class="token punctuation">:</span> <span class="token namespace">chrono<span class="token punctuation">::</span></span><span class="token class-name">DateTime</span><span class="token operator">&lt;</span><span class="token namespace">chrono<span class="token punctuation">::</span></span><span class="token class-name">Utc</span><span class="token operator">></span><span class="token punctuation">,</span> format<span class="token punctuation">:</span> <span class="token keyword">i16</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token keyword">match</span> format <span class="token punctuation">&#123;</span>        <span class="token comment">/* Text: the value is always UTC, so the offset is written as the literal +00 (chrono's %z would print +0000) */</span>        <span class="token number">0</span> <span class="token operator">=></span> value<span class="token punctuation">.</span><span class="token function">format</span><span class="token punctuation">(</span><span class="token string">"%Y-%m-%d %H:%M:%S%.6f+00"</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">into_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token number">1</span> <span class="token operator">=></span> <span 
class="token punctuation">&#123;</span>            <span class="token comment">// PostgreSQL epoch: 2000-01-01 00:00:00 UTC</span>            <span class="token keyword">let</span> epoch <span class="token operator">=</span> <span class="token namespace">chrono<span class="token punctuation">::</span></span><span class="token class-name">DateTime</span><span class="token punctuation">::</span><span class="token function">from_timestamp</span><span class="token punctuation">(</span><span class="token number">946684800</span><span class="token punctuation">,</span> <span class="token number">0</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">unwrap</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token keyword">let</span> micros <span class="token operator">=</span> value<span class="token punctuation">.</span><span class="token function">signed_duration_since</span><span class="token punctuation">(</span>epoch<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">num_microseconds</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">unwrap</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            micros<span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_vec</span><span class="token punctuation">(</span><span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span>        _ <span class="token operator">=></span> <span class="token macro 
property">panic!</span><span class="token punctuation">(</span><span class="token string">"Invalid format code"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="4-Extended-Query-Protocol">4 Extended Query Protocol</h2><h3 id="Why-Extended-Query">Why Extended Query?</h3><p><strong>Simple Query:</strong> The whole statement travels as a single SQL string, so the client has to splice values into the text itself. There are no prepared statements, and careless concatenation invites SQL injection.</p><pre class="language-none"><code class="language-none">Client: Query(&quot;SELECT * FROM users WHERE id &#x3D; &quot; + user_input)→ SQL injection vulnerability!</code></pre><p><strong>Extended Query:</strong> The statement is parsed once as a prepared statement, and parameter values are sent separately in Bind, so they are never interpreted as SQL.</p><pre class="language-none"><code class="language-none">Client: Parse(&quot;SELECT * FROM users WHERE id &#x3D; $1&quot;)Client: Bind([42])Client: Execute()→ Safe from SQL injection!</code></pre><hr /><h3 id="Extended-Query-Flow">Extended Query Flow</h3><pre class="language-MERMAID_BASE64_625" data-language="MERMAID_BASE64_625"><code class="language-MERMAID_BASE64_625">c2VxdWVuY2VEaWFncmFtCiAgICBwYXJ0aWNpcGFudCBDbGllbnQKICAgIHBhcnRpY2lwYW50IFNlcnZlcgoKICAgIENsaWVudC0+PlNlcnZlcjogUGFyc2UgKFNRTCwgcGFyYW1ldGVyIHR5cGVzKQogICAgU2VydmVyLT4+Q2xpZW50OiBQYXJzZUNvbXBsZXRlCgogICAgQ2xpZW50LT4+U2VydmVyOiBCaW5kIChwYXJhbWV0ZXIgdmFsdWVzKQogICAgU2VydmVyLT4+Q2xpZW50OiBCaW5kQ29tcGxldGUKCiAgICBsb29wIE11bHRpcGxlIGV4ZWN1dGlvbnMKICAgICAgICBDbGllbnQtPj5TZXJ2ZXI6IEV4ZWN1dGUgKG1heF9yb3dzKQogICAgICAgIFNlcnZlci0+PkNsaWVudDogRGF0YVJvdyDDlyBOCiAgICBlbmQKCiAgICBDbGllbnQtPj5TZXJ2ZXI6IFN5bmMKICAgIFNlcnZlci0+PkNsaWVudDogQ29tbWFuZENvbXBsZXRlCiAgICBTZXJ2ZXItPj5DbGllbnQ6IFJlYWR5Rm9yUXVlcnk&#x3D;</code></pre><hr /><h3 id="Parse-Preparing-a-Statement">Parse: Preparing a Statement</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wire_protocol/parse.rs</span><span class="token keyword">pub</span> 
<span class="token keyword">struct</span> <span class="token type-definition class-name">ParseMessage</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> statement_name<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> query<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> parameter_types<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u32</span><span class="token operator">></span><span class="token punctuation">,</span>  <span class="token comment">// OID for each parameter</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">ParseMessage</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">async</span> <span class="token keyword">fn</span> <span class="token function-definition function">read_from</span><span class="token punctuation">(</span>stream<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">TcpStream</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token keyword">Self</span><span class="token punctuation">,</span> <span class="token class-name">ProtocolError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// statement_name (null-terminated)</span>        <span class="token keyword">let</span> statement_name <span class="token 
operator">=</span> <span class="token function">read_null_terminated</span><span class="token punctuation">(</span>stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// query (null-terminated)</span>        <span class="token keyword">let</span> query <span class="token operator">=</span> <span class="token function">read_null_terminated</span><span class="token punctuation">(</span>stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// num_parameter_types (2B)</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> num_types_buf <span class="token operator">=</span> <span class="token punctuation">[</span><span class="token number">0u8</span><span class="token punctuation">;</span> <span class="token number">2</span><span class="token punctuation">]</span><span class="token punctuation">;</span>        stream<span class="token punctuation">.</span><span class="token function">read_exact</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> num_types_buf<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> num_types <span class="token operator">=</span> <span class="token keyword">i16</span><span class="token punctuation">::</span><span class="token function">from_be_bytes</span><span class="token punctuation">(</span>num_types_buf<span class="token punctuation">)</span><span class="token 
punctuation">;</span>                <span class="token comment">// parameter_types (4B each)</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> parameter_types <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> _ <span class="token keyword">in</span> <span class="token number">0</span><span class="token punctuation">..</span>num_types <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> <span class="token keyword">mut</span> type_buf <span class="token operator">=</span> <span class="token punctuation">[</span><span class="token number">0u8</span><span class="token punctuation">;</span> <span class="token number">4</span><span class="token punctuation">]</span><span class="token punctuation">;</span>            stream<span class="token punctuation">.</span><span class="token function">read_exact</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> type_buf<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>            parameter_types<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token keyword">u32</span><span class="token punctuation">::</span><span class="token function">from_be_bytes</span><span class="token punctuation">(</span>type_buf<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>      
          <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>            statement_name<span class="token punctuation">,</span>            query<span class="token punctuation">,</span>            parameter_types<span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span><span class="token comment">// Server response</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">parse_complete</span><span class="token punctuation">(</span>builder<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">MessageBuilder</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token keyword">u8</span><span class="token punctuation">]</span> <span class="token punctuation">&#123;</span>    <span class="token comment">// '1' (1B) + Length (4B = 4)</span>    builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">clear</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token char">b'1'</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token 
function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token number">4u32</span><span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token operator">&amp;</span>builder<span class="token punctuation">.</span>buffer<span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="Bind-Creating-a-Portal">Bind: Creating a Portal</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wire_protocol/bind.rs</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">BindMessage</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> portal_name<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> statement_name<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> parameter_format_codes<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">i16</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> parameter_values<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token 
keyword">u8</span><span class="token operator">>></span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> result_format_codes<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">i16</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">BindMessage</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">async</span> <span class="token keyword">fn</span> <span class="token function-definition function">read_from</span><span class="token punctuation">(</span>stream<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">TcpStream</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token keyword">Self</span><span class="token punctuation">,</span> <span class="token class-name">ProtocolError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// portal_name (null-terminated)</span>        <span class="token keyword">let</span> portal_name <span class="token operator">=</span> <span class="token function">read_null_terminated</span><span class="token punctuation">(</span>stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// statement_name (null-terminated)</span>        <span class="token keyword">let</span> 
statement_name <span class="token operator">=</span> <span class="token function">read_null_terminated</span><span class="token punctuation">(</span>stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// num_parameter_format_codes (2B)</span>        <span class="token keyword">let</span> num_formats <span class="token operator">=</span> <span class="token function">read_i16</span><span class="token punctuation">(</span>stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// parameter_format_codes</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> parameter_format_codes <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> _ <span class="token keyword">in</span> <span class="token number">0</span><span class="token punctuation">..</span>num_formats <span class="token punctuation">&#123;</span>            parameter_format_codes<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token function">read_i16</span><span class="token punctuation">(</span>stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span 
class="token punctuation">&#125;</span>                <span class="token comment">// num_parameter_values (2B)</span>        <span class="token keyword">let</span> num_values <span class="token operator">=</span> <span class="token function">read_i16</span><span class="token punctuation">(</span>stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// parameter_values</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> parameter_values <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> _ <span class="token keyword">in</span> <span class="token number">0</span><span class="token punctuation">..</span>num_values <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> len <span class="token operator">=</span> <span class="token function">read_i32</span><span class="token punctuation">(</span>stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>            <span class="token keyword">if</span> len <span class="token operator">==</span> <span class="token operator">-</span><span class="token number">1</span> <span class="token punctuation">&#123;</span>                parameter_values<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token class-name">None</span><span class="token 
punctuation">)</span><span class="token punctuation">;</span>  <span class="token comment">// NULL</span>            <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">let</span> <span class="token keyword">mut</span> data <span class="token operator">=</span> <span class="token macro property">vec!</span><span class="token punctuation">[</span><span class="token number">0u8</span><span class="token punctuation">;</span> len <span class="token keyword">as</span> <span class="token keyword">usize</span><span class="token punctuation">]</span><span class="token punctuation">;</span>                stream<span class="token punctuation">.</span><span class="token function">read_exact</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> data<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                parameter_values<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span>data<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>                <span class="token comment">// num_result_format_codes (2B)</span>        <span class="token keyword">let</span> num_result_formats <span class="token operator">=</span> <span class="token function">read_i16</span><span class="token punctuation">(</span>stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token 
keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// result_format_codes</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> result_format_codes <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> _ <span class="token keyword">in</span> <span class="token number">0</span><span class="token punctuation">..</span>num_result_formats <span class="token punctuation">&#123;</span>            result_format_codes<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token function">read_i16</span><span class="token punctuation">(</span>stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>            portal_name<span class="token punctuation">,</span>            statement_name<span class="token punctuation">,</span>            parameter_format_codes<span class="token punctuation">,</span>            parameter_values<span class="token punctuation">,</span>            result_format_codes<span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token 
punctuation">&#125;</span><span class="token comment">// Server response</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">bind_complete</span><span class="token punctuation">(</span>builder<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">MessageBuilder</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token keyword">u8</span><span class="token punctuation">]</span> <span class="token punctuation">&#123;</span>    <span class="token comment">// '2' (1B) + Length (4B = 4)</span>    builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">clear</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token char">b'2'</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token number">4u32</span><span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token operator">&amp;</span>builder<span class="token punctuation">.</span>buffer<span class="token 
punctuation">&#125;</span></code></pre><hr /><h3 id="Execute-Running-the-Prepared-Statement">Execute: Running the Prepared Statement</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wire_protocol/execute.rs</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">ExecuteMessage</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> portal_name<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> max_rows<span class="token punctuation">:</span> <span class="token keyword">i32</span><span class="token punctuation">,</span>  <span class="token comment">// 0 = all rows</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">ExecuteMessage</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">async</span> <span class="token keyword">fn</span> <span class="token function-definition function">read_from</span><span class="token punctuation">(</span>stream<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">TcpStream</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token keyword">Self</span><span class="token punctuation">,</span> <span class="token class-name">ProtocolError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> portal_name <span class="token operator">=</span> <span class="token function">read_null_terminated</span><span 
class="token punctuation">(</span>stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> max_rows <span class="token operator">=</span> <span class="token function">read_i32</span><span class="token punctuation">(</span>stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token keyword">Self</span> <span class="token punctuation">&#123;</span> portal_name<span class="token punctuation">,</span> max_rows <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Server response:</strong> Zero or more DataRow messages. There is no “ExecuteComplete” message: in the PostgreSQL protocol, Execute ends with CommandComplete (or PortalSuspended when the max_rows limit is reached), which this implementation emits while handling Sync.</p><hr /><h3 id="Sync-Finishing-the-Batch">Sync: Finishing the Batch</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wire_protocol/sync.rs</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">SyncMessage</span><span class="token punctuation">;</span><span class="token keyword">impl</span> <span class="token class-name">SyncMessage</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">async</span> <span class="token keyword">fn</span> <span class="token function-definition function">read_from</span><span class="token punctuation">(</span>_stream<span class="token punctuation">:</span> <span class="token 
operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">TcpStream</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token keyword">Self</span><span class="token punctuation">,</span> <span class="token class-name">ProtocolError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Sync has no body, just the message header</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">SyncMessage</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span><span class="token comment">// Server response</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">sync_complete</span><span class="token punctuation">(</span>builder<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">MessageBuilder</span><span class="token punctuation">,</span> status<span class="token punctuation">:</span> <span class="token class-name">TransactionStatus</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token keyword">u8</span><span class="token punctuation">]</span> <span class="token punctuation">&#123;</span>    <span class="token comment">// Builds CommandComplete only; ReadyForQuery comes from ready_for_query(status)</span>    builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">clear</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span 
class="token punctuation">;</span>        <span class="token comment">// CommandComplete: 'C' + Length (4B, counts itself) + "SELECT 2\0" (tag hard-coded here)</span>    builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token char">b'C'</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token keyword">let</span> cmd <span class="token operator">=</span> <span class="token string">b"SELECT 2"</span><span class="token punctuation">;</span>    builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token number">4</span> <span class="token operator">+</span> cmd<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u32</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_be_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span>cmd<span class="token punctuation">)</span><span class="token punctuation">;</span>    builder<span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span 
class="token number">0</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token operator">&amp;</span>builder<span class="token punctuation">.</span>buffer<span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="5-Complete-Query-Execution-Flow">5 Complete Query Execution Flow</h2><h3 id="Putting-It-All-Together">Putting It All Together</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wire_protocol/handler.rs</span><span class="token keyword">use</span> <span class="token namespace">std<span class="token punctuation">::</span>collections<span class="token punctuation">::</span></span><span class="token class-name">HashMap</span><span class="token punctuation">;</span><span class="token keyword">use</span> <span class="token namespace">tokio<span class="token punctuation">::</span>io<span class="token punctuation">::</span></span><span class="token class-name">AsyncWriteExt</span><span class="token punctuation">;</span><span class="token keyword">use</span> <span class="token namespace">tokio<span class="token punctuation">::</span>net<span class="token punctuation">::</span></span><span class="token class-name">TcpStream</span><span class="token punctuation">;</span><span class="token keyword">use</span> <span class="token keyword">crate</span><span class="token module-declaration namespace"><span class="token punctuation">::</span>query_executor<span class="token punctuation">::</span></span><span class="token class-name">QueryExecutor</span><span class="token punctuation">;</span><span class="token keyword">use</span> <span class="token keyword">crate</span><span class="token module-declaration namespace"><span class="token punctuation">::</span>storage<span class="token punctuation">::</span>buffer_pool<span class="token punctuation">::</span></span><span class="token class-name">BufferPool</span><span class="token punctuation">;</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">ProtocolHandler</span> <span class="token punctuation">&#123;</span>    stream<span class="token punctuation">:</span> <span class="token class-name">TcpStream</span><span class="token punctuation">,</span>    executor<span class="token punctuation">:</span> <span class="token class-name">QueryExecutor</span><span class="token punctuation">,</span>    builder<span 
class="token punctuation">:</span> <span class="token class-name">MessageBuilder</span><span class="token punctuation">,</span>    prepared_statements<span class="token punctuation">:</span> <span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token punctuation">,</span> <span class="token class-name">PreparedStatement</span><span class="token operator">></span><span class="token punctuation">,</span>    portals<span class="token punctuation">:</span> <span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token punctuation">,</span> <span class="token class-name">Portal</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">ProtocolHandler</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">async</span> <span class="token keyword">fn</span> <span class="token function-definition function">handle_connection</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">,</span> <span class="token keyword">mut</span> stream<span class="token punctuation">:</span> <span class="token class-name">TcpStream</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">ProtocolError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// 1. 
Read startup message</span>        <span class="token keyword">let</span> startup <span class="token operator">=</span> <span class="token class-name">StartupMessage</span><span class="token punctuation">::</span><span class="token function">read_from</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// 2. Send authentication</span>        stream<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>builder<span class="token punctuation">.</span><span class="token function">authentication_ok</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// 3. 
Send parameter status</span>        stream<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>builder<span class="token punctuation">.</span><span class="token function">parameter_status</span><span class="token punctuation">(</span><span class="token string">"server_version"</span><span class="token punctuation">,</span> <span class="token string">"16.0"</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>        stream<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>builder<span class="token punctuation">.</span><span class="token function">parameter_status</span><span class="token punctuation">(</span><span class="token string">"server_encoding"</span><span class="token punctuation">,</span> <span class="token string">"UTF8"</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>        stream<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>builder<span class="token punctuation">.</span><span class="token function">parameter_status</span><span class="token punctuation">(</span><span class="token string">"client_encoding"</span><span class="token punctuation">,</span> <span class="token string">"UTF8"</span><span class="token 
punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// 4. Send ready for query</span>        stream<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>builder<span class="token punctuation">.</span><span class="token function">ready_for_query</span><span class="token punctuation">(</span><span class="token class-name">TransactionStatus</span><span class="token punctuation">::</span><span class="token class-name">Idle</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// 5. 
Main message loop</span>        <span class="token keyword">loop</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> <span class="token keyword">mut</span> type_buf <span class="token operator">=</span> <span class="token punctuation">[</span><span class="token number">0u8</span><span class="token punctuation">;</span> <span class="token number">1</span><span class="token punctuation">]</span><span class="token punctuation">;</span>            stream<span class="token punctuation">.</span><span class="token function">read_exact</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> type_buf<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                        <span class="token keyword">match</span> type_buf<span class="token punctuation">[</span><span class="token number">0</span><span class="token punctuation">]</span> <span class="token keyword">as</span> <span class="token keyword">char</span> <span class="token punctuation">&#123;</span>                <span class="token char">'Q'</span> <span class="token operator">=></span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">handle_simple_query</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">,</span>                <span class="token char">'P'</span> <span class="token operator">=></span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">handle_parse</span><span 
class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">,</span>                <span class="token char">'B'</span> <span class="token operator">=></span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">handle_bind</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">,</span>                <span class="token char">'E'</span> <span class="token operator">=></span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">handle_execute</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">,</span>                <span class="token char">'S'</span> <span class="token operator">=></span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">handle_sync</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">,</span>                <span class="token char">'X'</span> <span 
class="token operator">=></span> <span class="token punctuation">&#123;</span>                    <span class="token comment">// Terminate</span>                    <span class="token keyword">return</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token punctuation">&#125;</span>                _ <span class="token operator">=></span> <span class="token keyword">return</span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">ProtocolError</span><span class="token punctuation">::</span><span class="token class-name">UnknownMessage</span><span class="token punctuation">(</span>type_buf<span class="token punctuation">[</span><span class="token number">0</span><span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>        <span class="token keyword">async</span> <span class="token keyword">fn</span> <span class="token function-definition function">handle_simple_query</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">,</span> stream<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">TcpStream</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token 
punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">ProtocolError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Consume the Int32 length field that precedes the body (assumes read_null_terminated does not do so itself)</span>        <span class="token keyword">let</span> _len <span class="token operator">=</span> stream<span class="token punctuation">.</span><span class="token function">read_i32</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// Read query string</span>        <span class="token keyword">let</span> query <span class="token operator">=</span> <span class="token function">read_null_terminated</span><span class="token punctuation">(</span>stream<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// Execute query</span>        <span class="token keyword">let</span> result <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>executor<span class="token punctuation">.</span><span class="token function">execute</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>query<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// Send RowDescription (if SELECT)</span>        <span class="token keyword">if</span> <span class="token keyword">let</span> <span class="token class-name">Some</span><span class="token punctuation">(</span>columns<span class="token punctuation">)</span> <span class="token operator">=</span> result<span class="token punctuation">.</span>columns <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> row_desc <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">create_row_description</span><span class="token 
punctuation">(</span><span class="token operator">&amp;</span>columns<span class="token punctuation">)</span><span class="token punctuation">;</span>            stream<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span>row_desc<span class="token punctuation">.</span><span class="token function">serialize</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">.</span>builder<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                        <span class="token comment">// Send DataRows</span>            <span class="token keyword">for</span> row <span class="token keyword">in</span> result<span class="token punctuation">.</span>rows <span class="token punctuation">&#123;</span>                <span class="token keyword">let</span> data_row <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">create_data_row</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>row<span class="token punctuation">)</span><span class="token punctuation">;</span>                stream<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span>data_row<span class="token punctuation">.</span><span class="token function">serialize</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">.</span>builder<span class="token punctuation">)</span><span class="token 
punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>                <span class="token comment">// Send CommandComplete</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>builder<span class="token punctuation">.</span><span class="token function">command_complete</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>result<span class="token punctuation">.</span>command_tag<span class="token punctuation">)</span><span class="token punctuation">;</span>        stream<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">.</span>builder<span class="token punctuation">.</span>buffer<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// Send ReadyForQuery</span>        stream<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>builder<span class="token punctuation">.</span><span class="token function">ready_for_query</span><span class="token punctuation">(</span><span class="token class-name">TransactionStatus</span><span class="token punctuation">::</span><span class="token class-name">Idle</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token 
keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="Result-Set-Serialization-Example">Result Set Serialization Example</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wire_protocol/result_set.rs</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">ResultSet</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> columns<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Column</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> rows<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Row</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> command_tag<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Column</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> name<span class="token punctuation">:</span> <span class="token class-name">String</span><span class="token 
punctuation">,</span>    <span class="token keyword">pub</span> type_oid<span class="token punctuation">:</span> <span class="token keyword">u32</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> type_size<span class="token punctuation">:</span> <span class="token keyword">i16</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Row</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> values<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">String</span><span class="token operator">>></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">ResultSet</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">send_to</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> stream<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">TcpStream</span><span class="token punctuation">,</span> builder<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">MessageBuilder</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token 
operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token namespace">io<span class="token punctuation">::</span></span><span class="token class-name">Error</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// RowDescription</span>        <span class="token keyword">let</span> fields<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">FieldDescription</span><span class="token operator">></span> <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>columns<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">map</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>col<span class="token closure-punctuation punctuation">|</span></span> <span class="token punctuation">&#123;</span>            <span class="token class-name">FieldDescription</span> <span class="token punctuation">&#123;</span>                name<span class="token punctuation">:</span> col<span class="token punctuation">.</span>name<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>                table_oid<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>                column_attr_num<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>  
              type_oid<span class="token punctuation">:</span> col<span class="token punctuation">.</span>type_oid<span class="token punctuation">,</span>                type_size<span class="token punctuation">:</span> col<span class="token punctuation">.</span>type_size<span class="token punctuation">,</span>                type_modifier<span class="token punctuation">:</span> <span class="token operator">-</span><span class="token number">1</span><span class="token punctuation">,</span>                format_code<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>  <span class="token comment">// Text format</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">let</span> row_desc <span class="token operator">=</span> <span class="token class-name">RowDescription</span> <span class="token punctuation">&#123;</span> fields <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        stream<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span>row_desc<span class="token punctuation">.</span><span class="token function">serialize</span><span class="token punctuation">(</span>builder<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// DataRows</span>        <span class="token keyword">for</span> row <span class="token keyword">in</span> <span class="token operator">&amp;</span><span class="token 
keyword">self</span><span class="token punctuation">.</span>rows <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> values<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">>></span><span class="token operator">></span> <span class="token operator">=</span> row<span class="token punctuation">.</span>values<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span>                <span class="token punctuation">.</span><span class="token function">map</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>v<span class="token closure-punctuation punctuation">|</span></span> v<span class="token punctuation">.</span><span class="token function">as_ref</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">map</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>s<span class="token closure-punctuation punctuation">|</span></span> s<span class="token punctuation">.</span><span class="token function">as_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_vec</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">)</span>                <span 
class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                        <span class="token keyword">let</span> data_row <span class="token operator">=</span> <span class="token class-name">DataRow</span> <span class="token punctuation">&#123;</span>                values<span class="token punctuation">,</span>                format_codes<span class="token punctuation">:</span> <span class="token macro property">vec!</span><span class="token punctuation">[</span><span class="token number">0</span><span class="token punctuation">;</span> <span class="token keyword">self</span><span class="token punctuation">.</span>columns<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">]</span><span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>            stream<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span>data_row<span class="token punctuation">.</span><span class="token function">serialize</span><span class="token punctuation">(</span>builder<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>                <span class="token comment">// CommandComplete</span>        builder<span class="token punctuation">.</span><span class="token function">command_complete</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">.</span>command_tag<span class="token punctuation">)</span><span 
class="token punctuation">;</span>        stream<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>builder<span class="token punctuation">.</span>buffer<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span><span class="token comment">// Usage example</span><span class="token keyword">let</span> result <span class="token operator">=</span> <span class="token class-name">ResultSet</span> <span class="token punctuation">&#123;</span>    columns<span class="token punctuation">:</span> <span class="token macro property">vec!</span><span class="token punctuation">[</span>        <span class="token class-name">Column</span> <span class="token punctuation">&#123;</span> name<span class="token punctuation">:</span> <span class="token string">"id"</span><span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> type_oid<span class="token punctuation">:</span> <span class="token number">23</span><span class="token punctuation">,</span> type_size<span class="token punctuation">:</span> <span class="token number">4</span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token class-name">Column</span> <span class="token punctuation">&#123;</span> name<span class="token punctuation">:</span> <span class="token string">"name"</span><span class="token punctuation">.</span><span class="token 
function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> type_oid<span class="token punctuation">:</span> <span class="token number">25</span><span class="token punctuation">,</span> type_size<span class="token punctuation">:</span> <span class="token operator">-</span><span class="token number">1</span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>    <span class="token punctuation">]</span><span class="token punctuation">,</span>    rows<span class="token punctuation">:</span> <span class="token macro property">vec!</span><span class="token punctuation">[</span>        <span class="token class-name">Row</span> <span class="token punctuation">&#123;</span> values<span class="token punctuation">:</span> <span class="token macro property">vec!</span><span class="token punctuation">[</span><span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token string">"1"</span><span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token string">"Alice"</span><span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">]</span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>        <span class="token class-name">Row</span> <span class="token punctuation">&#123;</span> values<span class="token punctuation">:</span> <span class="token macro property">vec!</span><span class="token punctuation">[</span><span class="token 
class-name">Some</span><span class="token punctuation">(</span><span class="token string">"2"</span><span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token string">"Bob"</span><span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">]</span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>
    <span class="token punctuation">]</span><span class="token punctuation">,</span>
    command_tag<span class="token punctuation">:</span> <span class="token string">"SELECT 2"</span><span class="token punctuation">.</span><span class="token function">to_string</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
<span class="token punctuation">&#125;</span><span class="token punctuation">;</span>

result<span class="token punctuation">.</span><span class="token function">send_to</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> stream<span class="token punctuation">,</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> builder<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span></code></pre><p><strong>What psql receives:</strong></p><pre class="language-none"><code class="language-none">┌─────────────────────────────────────────────────────────────┐
│ T (RowDescription)                                          │
│   2 columns: id (INT4), name (TEXT)                         │
├─────────────────────────────────────────────────────────────┤
│ D (DataRow)                                                 │
│   id&#x3D;1, name&#x3D;&quot;Alice&quot;                                        │
├─────────────────────────────────────────────────────────────┤
│ D (DataRow)                                                 │
│   id&#x3D;2, name&#x3D;&quot;Bob&quot;                                          │
├─────────────────────────────────────────────────────────────┤
│ C (CommandComplete)                                         │
│   &quot;SELECT 2&quot;                                                │
├─────────────────────────────────────────────────────────────┤
│ Z (ReadyForQuery)                                           │
│   Status: Idle                                              │
└─────────────────────────────────────────────────────────────┘</code></pre><hr /><h2 id="6-PostgreSQL-Type-OIDs">6 PostgreSQL Type OIDs</h2><h3 id="Common-Types">Common Types</h3><table><thead><tr><th>Type Name</th><th>OID</th><th>Size (bytes, -1 = variable)</th><th>Description</th></tr></thead><tbody><tr><td><code>BOOL</code></td><td>16</td><td>1</td><td>Boolean</td></tr><tr><td><code>INT2</code> (SMALLINT)</td><td>21</td><td>2</td><td>2-byte integer</td></tr><tr><td><code>INT4</code> (INTEGER)</td><td>23</td><td>4</td><td>4-byte integer</td></tr><tr><td><code>INT8</code> (BIGINT)</td><td>20</td><td>8</td><td>8-byte integer</td></tr><tr><td><code>TEXT</code></td><td>25</td><td>-1</td><td>Variable-length text</td></tr><tr><td><code>VARCHAR</code></td><td>1043</td><td>-1</td><td>Variable-length character string</td></tr><tr><td><code>TIMESTAMP</code></td><td>1114</td><td>8</td><td>Timestamp without time zone</td></tr><tr><td><code>TIMESTAMPTZ</code></td><td>1184</td><td>8</td><td>Timestamp with time zone</td></tr><tr><td><code>FLOAT4</code> (REAL)</td><td>700</td><td>4</td><td>4-byte float</td></tr><tr><td><code>FLOAT8</code> (DOUBLE PRECISION)</td><td>701</td><td>8</td><td>8-byte 
float</td></tr><tr><td><code>NUMERIC</code></td><td>1700</td><td>-1</td><td>Arbitrary precision</td></tr><tr><td><code>BYTEA</code></td><td>17</td><td>-1</td><td>Binary data</td></tr><tr><td><code>OID</code></td><td>26</td><td>4</td><td>Object identifier</td></tr></tbody></table><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wire_protocol/oids.rs</span>
<span class="token keyword">pub</span> <span class="token keyword">mod</span> <span class="token module-declaration namespace">oid</span> <span class="token punctuation">&#123;</span>
    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">BOOL</span><span class="token punctuation">:</span> <span class="token keyword">u32</span> <span class="token operator">=</span> <span class="token number">16</span><span class="token punctuation">;</span>
    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">INT2</span><span class="token punctuation">:</span> <span class="token keyword">u32</span> <span class="token operator">=</span> <span class="token number">21</span><span class="token punctuation">;</span>
    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">INT4</span><span class="token punctuation">:</span> <span class="token keyword">u32</span> <span class="token operator">=</span> <span class="token number">23</span><span class="token punctuation">;</span>
    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">INT8</span><span class="token punctuation">:</span> <span class="token keyword">u32</span> <span class="token operator">=</span> <span class="token number">20</span><span class="token punctuation">;</span>
    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token 
constant">TEXT</span><span class="token punctuation">:</span> <span class="token keyword">u32</span> <span class="token operator">=</span> <span class="token number">25</span><span class="token punctuation">;</span>
    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">VARCHAR</span><span class="token punctuation">:</span> <span class="token keyword">u32</span> <span class="token operator">=</span> <span class="token number">1043</span><span class="token punctuation">;</span>
    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">TIMESTAMP</span><span class="token punctuation">:</span> <span class="token keyword">u32</span> <span class="token operator">=</span> <span class="token number">1114</span><span class="token punctuation">;</span>
    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">TIMESTAMPTZ</span><span class="token punctuation">:</span> <span class="token keyword">u32</span> <span class="token operator">=</span> <span class="token number">1184</span><span class="token punctuation">;</span>
    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">FLOAT4</span><span class="token punctuation">:</span> <span class="token keyword">u32</span> <span class="token operator">=</span> <span class="token number">700</span><span class="token punctuation">;</span>
    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">FLOAT8</span><span class="token punctuation">:</span> <span class="token keyword">u32</span> <span class="token operator">=</span> <span class="token number">701</span><span class="token punctuation">;</span>
    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">NUMERIC</span><span class="token punctuation">:</span> <span class="token keyword">u32</span> <span class="token operator">=</span> <span class="token number">1700</span><span class="token punctuation">;</span>
    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">BYTEA</span><span class="token punctuation">:</span> <span class="token keyword">u32</span> <span class="token operator">=</span> <span class="token number">17</span><span class="token punctuation">;</span>
    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">OID</span><span class="token punctuation">:</span> <span class="token keyword">u32</span> <span class="token operator">=</span> <span class="token number">26</span><span class="token punctuation">;</span>
<span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="7-Challenges-Building-in-Rust">7 Challenges Building in Rust</h2><h3 id="Challenge-1-Async-I-O-and-Borrowing">Challenge 1: Async I/O and Borrowing</h3><p><strong>Problem:</strong> Async I/O with tokio needs <code>&amp;mut self</code> on the handler, which conflicts with any value that is still borrowed from <code>self</code>.</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ Doesn't compile</span>
<span class="token keyword">impl</span> <span class="token class-name">ProtocolHandler</span> <span class="token punctuation">&#123;</span>
    <span class="token keyword">pub</span> <span class="token keyword">async</span> <span class="token keyword">fn</span> <span class="token function-definition function">handle_query</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token 
punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Error</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>
        <span class="token keyword">let</span> query <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">read_query</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>  <span class="token comment">// Borrows self for as long as query is alive</span>
        <span class="token keyword">let</span> result <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>executor<span class="token punctuation">.</span><span class="token function">execute</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>query<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>  <span class="token comment">// Also borrows self!</span>
        <span class="token comment">// Error: cannot borrow as mutable more than once</span>
    <span class="token punctuation">&#125;</span>
<span class="token punctuation">&#125;</span></code></pre><p><strong>Solution: Restructure to avoid concurrent borrows</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ✅ Works</span>
<span class="token keyword">impl</span> <span class="token class-name">ProtocolHandler</span> <span class="token punctuation">&#123;</span>
    <span class="token keyword">pub</span> <span class="token keyword">async</span> <span class="token keyword">fn</span> <span class="token function-definition 
function">handle_query</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Error</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>
        <span class="token keyword">let</span> query <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">read_query</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>

        <span class="token comment">// Release borrow before next operation; this relies on query being an owned value</span>
        <span class="token keyword">let</span> result <span class="token operator">=</span> <span class="token punctuation">&#123;</span>
            <span class="token keyword">self</span><span class="token punctuation">.</span>executor<span class="token punctuation">.</span><span class="token function">execute</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>query<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span>
        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>

        <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">send_result</span><span class="token punctuation">(</span>result<span 
class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>
        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>
    <span class="token punctuation">&#125;</span>
<span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="Challenge-2-Zero-Copy-vs-Allocation">Challenge 2: Zero-Copy vs. Allocation</h3><p><strong>Problem:</strong> Every wire protocol message has to be serialized, and allocating a fresh buffer for each one is expensive.</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ Allocates on every message</span>
<span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">serialize_row</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>
    <span class="token keyword">let</span> <span class="token keyword">mut</span> buffer <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
    buffer<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token char">b'D'</span><span class="token punctuation">)</span><span class="token 
punctuation">;</span>
    <span class="token comment">// ... lots of allocations ...</span>
    buffer
<span class="token punctuation">&#125;</span></code></pre><p><strong>Solution: Reuse buffers</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ✅ Reuses allocated buffer</span>
<span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">MessageBuilder</span> <span class="token punctuation">&#123;</span>
    buffer<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">></span><span class="token punctuation">,</span>  <span class="token comment">// Pre-allocated, reused</span>
<span class="token punctuation">&#125;</span>

<span class="token keyword">impl</span> <span class="token class-name">MessageBuilder</span> <span class="token punctuation">&#123;</span>
    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">with_capacity</span><span class="token punctuation">(</span>capacity<span class="token punctuation">:</span> <span class="token keyword">usize</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>
        <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>
            buffer<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">with_capacity</span><span class="token punctuation">(</span>capacity<span class="token punctuation">)</span><span class="token punctuation">,</span>
        <span class="token punctuation">&#125;</span>
    <span class="token punctuation">&#125;</span>

    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">data_row</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">,</span> row<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Row</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token keyword">u8</span><span class="token punctuation">]</span> <span class="token punctuation">&#123;</span>
        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">clear</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>  <span class="token comment">// clear() keeps the allocated capacity</span>
        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer<span class="token punctuation">.</span><span class="token function">push</span><span class="token punctuation">(</span><span class="token char">b'D'</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token comment">// ... write to buffer ...</span>
        <span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">.</span>buffer  <span class="token comment">// Return a borrowed slice, not an owned Vec</span>
    <span class="token punctuation">&#125;</span>
<span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="Challenge-3-Error-Handling-Across-Layers">Challenge 3: Error Handling Across Layers</h3><p><strong>Problem:</strong> Wire protocol errors, query errors, and storage errors all have different types that must be reconciled at layer boundaries.</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ Error type explosion</span>
<span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">Error</span> <span class="token punctuation">&#123;</span>
    <span class="token class-name">Io</span><span class="token punctuation">(</span><span class="token namespace">io<span class="token punctuation">::</span></span><span class="token class-name">Error</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    <span class="token class-name">Protocol</span><span class="token punctuation">(</span><span class="token class-name">ProtocolError</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    <span class="token class-name">Query</span><span class="token punctuation">(</span><span class="token class-name">QueryError</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    <span class="token class-name">Storage</span><span class="token punctuation">(</span><span class="token class-name">StorageError</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    <span class="token class-name">Transaction</span><span class="token punctuation">(</span><span class="token class-name">TransactionError</span><span class="token punctuation">)</span><span 
class="token punctuation">,</span>
    <span class="token comment">// ... 20 more variants ...</span>
<span class="token punctuation">&#125;</span></code></pre><p><strong>Solution: Use <code>thiserror</code> and conversion traits</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ✅ Clean error handling</span>
<span class="token attribute attr-name">#[derive(Debug, thiserror::Error)]</span>
<span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">ProtocolError</span> <span class="token punctuation">&#123;</span>
    <span class="token attribute attr-name">#[error(<span class="token string">"IO error: &#123;0&#125;"</span>)]</span>
    <span class="token class-name">Io</span><span class="token punctuation">(</span><span class="token attribute attr-name">#[from]</span> <span class="token namespace">io<span class="token punctuation">::</span></span><span class="token class-name">Error</span><span class="token punctuation">)</span><span class="token punctuation">,</span>

    <span class="token attribute attr-name">#[error(<span class="token string">"Invalid message type: &#123;0&#125;"</span>)]</span>
    <span class="token class-name">UnknownMessage</span><span class="token punctuation">(</span><span class="token keyword">u8</span><span class="token punctuation">)</span><span class="token punctuation">,</span>

    <span class="token attribute attr-name">#[error(<span class="token string">"Query error: &#123;0&#125;"</span>)]</span>
    <span class="token class-name">Query</span><span class="token punctuation">(</span><span class="token attribute attr-name">#[from]</span> <span class="token class-name">QueryError</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
<span class="token punctuation">&#125;</span>

<span class="token comment">// The ? operator converts errors automatically through the From impls generated by #[from]</span>
<span class="token keyword">pub</span> <span class="token keyword">async</span> <span class="token keyword">fn</span> <span class="token function-definition function">handle_query</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">ProtocolError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>
    <span class="token keyword">let</span> query <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">read_query</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>  <span class="token comment">// io::Error → ProtocolError</span>
    <span class="token keyword">let</span> result <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>executor<span class="token punctuation">.</span><span class="token function">execute</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>query<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>  <span class="token comment">// QueryError → ProtocolError</span>
    <span class="token class-name">Ok</span><span class="token 
punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>
<span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="8-How-AI-Accelerated-This">8 How AI Accelerated This</h2><h3 id="What-AI-Got-Right">What AI Got Right</h3><table><thead><tr><th>Task</th><th>AI Contribution</th></tr></thead><tbody><tr><td><strong>Message format</strong></td><td>Correct big-endian encoding</td></tr><tr><td><strong>Extended query flow</strong></td><td>Parse → Bind → Execute sequence</td></tr><tr><td><strong>Type OIDs</strong></td><td>Accurate PostgreSQL type OIDs</td></tr><tr><td><strong>Null handling</strong></td><td>-1 length prefix for NULLs</td></tr></tbody></table><hr /><h3 id="What-AI-Got-Wrong">What AI Got Wrong</h3><table><thead><tr><th>Issue</th><th>What Happened</th></tr></thead><tbody><tr><td><strong>Length calculation</strong></td><td>First draft left the 4 length bytes out of the length value</td></tr><tr><td><strong>Startup message</strong></td><td>Tried to add a type byte (StartupMessage has none)</td></tr><tr><td><strong>Binary format</strong></td><td>Suggested little-endian (PostgreSQL uses big-endian)</td></tr><tr><td><strong>Portal lifetime</strong></td><td>Got it wrong: a named portal lasts until it is closed or the transaction ends, and the unnamed portal until the next Bind that targets it</td></tr></tbody></table><p><strong>Pattern:</strong> The wire protocol is exact. A single off-by-one error in a length or offset breaks the whole exchange.</p><hr /><h3 id="Example-Debugging-psql-Connection">Example: Debugging psql Connection</h3><p><strong>My question to AI:</strong></p><blockquote><p>“psql connects but immediately disconnects. 
What’s wrong?”</p></blockquote><p><strong>What I learned:</strong></p><ol><li>psql expects specific ParameterStatus messages during startup</li><li>A missing <code>server_version</code> causes a silent disconnect</li><li>ReadyForQuery must be sent after authentication, once the ParameterStatus messages are out</li></ol><p><strong>Result:</strong> Added the required parameters:</p><pre class="language-rust" data-language="rust"><code class="language-rust">stream<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>builder<span class="token punctuation">.</span><span class="token function">parameter_status</span><span class="token punctuation">(</span><span class="token string">"server_version"</span><span class="token punctuation">,</span> <span class="token string">"16.0"</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>
stream<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>builder<span class="token punctuation">.</span><span class="token function">parameter_status</span><span class="token punctuation">(</span><span class="token string">"server_encoding"</span><span class="token punctuation">,</span> <span class="token string">"UTF8"</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>
stream<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span><span class="token 
keyword">self</span><span class="token punctuation">.</span>builder<span class="token punctuation">.</span><span class="token function">parameter_status</span><span class="token punctuation">(</span><span class="token string">"client_encoding"</span><span class="token punctuation">,</span> <span class="token string">"UTF8"</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span>
stream<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>builder<span class="token punctuation">.</span><span class="token function">ready_for_query</span><span class="token punctuation">(</span><span class="token class-name">TransactionStatus</span><span class="token punctuation">::</span><span class="token class-name">Idle</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token keyword">await</span><span class="token operator">?</span><span class="token punctuation">;</span></code></pre><p><strong>Now psql connects successfully!</strong></p><pre class="language-none"><code class="language-none">$ psql -h localhost -p 5432 -U neo vaultgres
psql (16.0, server 16.0 (Vaultgres))
Type &quot;help&quot; for help.

vaultgres&#x3D;&gt; SELECT 1;
 ?column? 
----------        1(1 row)</code></pre><hr /><h2 id="Summary-Wire-Protocol-in-One-Diagram">Summary: Wire Protocol in One Diagram</h2><pre class="language-MERMAID_BASE64_626" data-language="MERMAID_BASE64_626"><code class="language-MERMAID_BASE64_626">Zmxvd2NoYXJ0IFRECiAgICBzdWJncmFwaCAiQ29ubmVjdGlvbiBTdGFydHVwIgogICAgICAgIEFbQ2xpZW50IGNvbm5lY3RzXSAtLT4gQltTdGFydHVwTWVzc2FnZV0KICAgICAgICBCIC0tPiBDW0F1dGhlbnRpY2F0aW9uT2tdCiAgICAgICAgQyAtLT4gRFtQYXJhbWV0ZXJTdGF0dXNdCiAgICAgICAgRCAtLT4gRVtSZWFkeUZvclF1ZXJ5XQogICAgZW5kCgogICAgc3ViZ3JhcGggIlNpbXBsZSBRdWVyeSIKICAgICAgICBGW1F1ZXJ5ICdRJ10gLS0+IEdbUm93RGVzY3JpcHRpb24gJ1QnXQogICAgICAgIEcgLS0+IEhbRGF0YVJvdyAnRCcgw5cgTl0KICAgICAgICBIIC0tPiBJW0NvbW1hbmRDb21wbGV0ZSAnQyddCiAgICAgICAgSSAtLT4gRQogICAgZW5kCgogICAgc3ViZ3JhcGggIkV4dGVuZGVkIFF1ZXJ5IgogICAgICAgIEpbUGFyc2UgJ1AnXSAtLT4gS1tQYXJzZUNvbXBsZXRlICcxJ10KICAgICAgICBLIC0tPiBMW0JpbmQgJ0InXQogICAgICAgIEwgLS0+IE1bQmluZENvbXBsZXRlICcyJ10KICAgICAgICBNIC0tPiBOW0V4ZWN1dGUgJ0UnXQogICAgICAgIE4gLS0+IEgKICAgICAgICBPW1N5bmMgJ1MnXSAtLT4gSQogICAgZW5kCgogICAgc3ViZ3JhcGggIk1lc3NhZ2UgRm9ybWF0IgogICAgICAgIFBbVHlwZSAxQl0gLS0+IFFbTGVuZ3RoIDRCIEJFXQogICAgICAgIFEgLS0+IFJbUGF5bG9hZF0KICAgIGVuZAoKICAgIHN1YmdyYXBoICJSZXN1bHQgU2VyaWFsaXphdGlvbiIKICAgICAgICBTW1JvdzogaWQ9MSwgbmFtZT0nQWxpY2UnXSAtLT4gVFtEYXRhUm93OiAnRCcgKyBsZW4gKyB2YWx1ZXNdCiAgICBlbmQKCiAgICBzdHlsZSBCIGZpbGw6I2UzZjJmZCxzdHJva2U6IzE5NzZkMgogICAgc3R5bGUgRiBmaWxsOiNlOGY1ZTksc3Ryb2tlOiMzODhlM2MKICAgIHN0eWxlIEogZmlsbDojZThmNWU5LHN0cm9rZTojMzg4ZTNjCiAgICBzdHlsZSBQIGZpbGw6I2ZmZjNlMCxzdHJva2U6I2Y1N2MwMA&#x3D;&#x3D;</code></pre><p><strong>Key Takeaways:</strong></p><table><thead><tr><th>Concept</th><th>Why It Matters</th></tr></thead><tbody><tr><td><strong>Wire protocol</strong></td><td>Compatibility with existing PostgreSQL tools</td></tr><tr><td><strong>Message framing</strong></td><td>Length-prefixed binary protocol</td></tr><tr><td><strong>Simple vs. Extended</strong></td><td>Quick queries vs. 
prepared statements</td></tr><tr><td><strong>RowDescription</strong></td><td>Column metadata for clients</td></tr><tr><td><strong>DataRow</strong></td><td>Actual row data (text or binary)</td></tr><tr><td><strong>Type OIDs</strong></td><td>PostgreSQL type identification</td></tr><tr><td><strong>Null encoding</strong></td><td>-1 length prefix</td></tr></tbody></table><hr /><p><strong>Further Reading:</strong></p><ul><li>PostgreSQL Source: <a href="https://github.com/postgres/postgres/blob/master/src/backend/tcop/postgres.c"><code>src/backend/tcop/postgres.c</code></a></li><li>PostgreSQL Source: <a href="https://github.com/postgres/postgres/blob/master/src/include/libpq/pqformat.h"><code>src/include/libpq/pqformat.h</code></a></li><li>“PostgreSQL Wire Protocol” documentation: <a href="https://www.postgresql.org/docs/current/protocol.html">https://www.postgresql.org/docs/current/protocol.html</a></li><li>libpq source: <a href="https://github.com/postgres/postgres/tree/master/src/interfaces/libpq"><code>src/interfaces/libpq/</code></a></li><li>Vaultgres Repository: <a href="https://github.com/neoalienson/Vaultgres">github.com/neoalienson/Vaultgres</a></li></ul>    ]]></content>
    
    
    <summary type="html">Part 5 of the Vaultgres journey: implementing the PostgreSQL wire protocol. Deep dive into message framing, startup handshake, extended query protocol, and serializing result sets that psql and drivers can understand.</summary>
    
    
    
    <category term="Development" scheme="https://neo01.com/categories/Development/"/>
    
    
    <category term="PostgreSQL" scheme="https://neo01.com/tags/PostgreSQL/"/>
    
    <category term="Database Internals" scheme="https://neo01.com/tags/Database-Internals/"/>
    
    <category term="Rust" scheme="https://neo01.com/tags/Rust/"/>
    
  </entry>
  
  <entry>
    <title>Building a PostgreSQL-Compatible Database in Rust: WAL and ARIES Crash Recovery</title>
    <link href="https://neo01.com/zh-CN/2026/03/Database-Rust-WAL-Crash-Recovery-ARIES/"/>
    <id>https://neo01.com/zh-CN/2026/03/Database-Rust-WAL-Crash-Recovery-ARIES/</id>
    <published>2026-03-03T16:00:00.000Z</published>
    <updated>2026-03-14T03:05:53.122Z</updated>
    
    <content type="html"><![CDATA[<p>In <a href="/zh-CN/2026/03/Database-Rust-MVCC-Transaction-Manager/">Part 3</a>, we built MVCC for concurrent transactions. But there is one frightening question we have not answered yet.</p><p><strong>What happens when the power goes out?</strong></p><pre class="language-none"><code class="language-none">Transaction: UPDATE accounts SET balance &#x3D; 1000 WHERE id &#x3D; 1

1. Read page into buffer pool
2. Modify page in memory (balance &#x3D; 1000)
3. Mark page as dirty
4. ACK to client: &quot;Done!&quot;
5. ⚡ POWER FAILURE ⚡
6. Dirty page never written to disk
7. Client&#39;s money: GONE 💸</code></pre><p>This is why databases use <strong>WAL: Write-Ahead Logging</strong>.</p><p>Today: implementing WAL and the ARIES recovery algorithm in Rust. This is the code that makes sure your data survives crashes, power failures, and kernel panics.</p><hr /><h2 id="1-WAL-原则">1 The WAL Principle</h2><h3 id="基本规则">The Basic Rule</h3><p><strong>Write-ahead logging:</strong> before modifying any data page, you <strong>must</strong> write the change to the WAL.</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ WRONG - data modification before WAL</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">update</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> row_id<span class="token punctuation">:</span> <span class="token class-name">RowId</span><span class="token punctuation">,</span> new_data<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token keyword">u8</span><span class="token punctuation">]</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Error</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span 
class="token keyword">let</span> <span class="token keyword">mut</span> page <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>buffer_pool<span class="token punctuation">.</span><span class="token function">get_page</span><span class="token punctuation">(</span>row_id<span class="token punctuation">.</span>page_id<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    page<span class="token punctuation">.</span><span class="token function">write</span><span class="token punctuation">(</span>row_id<span class="token punctuation">.</span>offset<span class="token punctuation">,</span> new_data<span class="token punctuation">)</span><span class="token punctuation">;</span>  <span class="token comment">// Modified in memory!</span>    page<span class="token punctuation">.</span><span class="token function">mark_dirty</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// WAL comes too late</span>    <span class="token keyword">self</span><span class="token punctuation">.</span>wal<span class="token punctuation">.</span><span class="token function">log_update</span><span class="token punctuation">(</span>row_id<span class="token punctuation">,</span> new_data<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>  <span class="token comment">// Too late!</span>    <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">&#125;</span><span class="token comment">// ✅ CORRECT - WAL first</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition 
function">update</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> row_id<span class="token punctuation">:</span> <span class="token class-name">RowId</span><span class="token punctuation">,</span> new_data<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token keyword">u8</span><span class="token punctuation">]</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Error</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token comment">// 1. Generate LSN (Log Sequence Number)</span>    <span class="token keyword">let</span> lsn <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>wal<span class="token punctuation">.</span><span class="token function">log_update</span><span class="token punctuation">(</span>row_id<span class="token punctuation">,</span> new_data<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token comment">// 2. Flush WAL to disk (fsync)</span>    <span class="token keyword">self</span><span class="token punctuation">.</span>wal<span class="token punctuation">.</span><span class="token function">flush</span><span class="token punctuation">(</span>lsn<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token comment">// 3. 
NOW safe to modify page</span>    <span class="token keyword">let</span> <span class="token keyword">mut</span> page <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>buffer_pool<span class="token punctuation">.</span><span class="token function">get_page</span><span class="token punctuation">(</span>row_id<span class="token punctuation">.</span>page_id<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    page<span class="token punctuation">.</span><span class="token function">write</span><span class="token punctuation">(</span>row_id<span class="token punctuation">.</span>offset<span class="token punctuation">,</span> new_data<span class="token punctuation">)</span><span class="token punctuation">;</span>    page<span class="token punctuation">.</span><span class="token function">set_lsn</span><span class="token punctuation">(</span>lsn<span class="token punctuation">)</span><span class="token punctuation">;</span>  <span class="token comment">// Track which LSN modified this page</span>    page<span class="token punctuation">.</span><span class="token function">mark_dirty</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Why this works:</strong></p><pre class="language-none"><code class="language-none">Crash at different points:

After WAL write, before page modify:
→ Recovery replays WAL, data is restored ✓

After page modify, before flush:
→ WAL on disk, recovery ensures durability ✓

After flush to disk:
→ Data is durable ✓</code></pre><hr /><h3 id="日志序列号-LSN">Log Sequence Numbers (LSN)</h3><p>Each WAL record gets a unique, monotonically increasing identifier:</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wal/lsn.rs</span><span class="token attribute attr-name">#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Lsn</span><span class="token punctuation">(</span><span class="token keyword">u64</span><span class="token punctuation">)</span><span class="token punctuation">;</span><span class="token keyword">impl</span> <span class="token class-name">Lsn</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">INVALID</span><span class="token punctuation">:</span> <span class="token class-name">Lsn</span> <span class="token operator">=</span> <span class="token class-name">Lsn</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token keyword">fn</span> <span class="token function-definition function">new</span><span class="token punctuation">(</span>value<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">Self</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token keyword">fn</span> <span class="token function-definition function">invalid</span><span class="token 
punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">bool</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token number">0</span> <span class="token operator">==</span> <span class="token number">0</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">// LSN encoding: segment_id (high 32) + offset (low 32)</span>    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token keyword">fn</span> <span class="token function-definition function">from_segment_offset</span><span class="token punctuation">(</span>segment<span class="token punctuation">:</span> <span class="token keyword">u32</span><span class="token punctuation">,</span> offset<span class="token punctuation">:</span> <span class="token keyword">u32</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">Self</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">(</span>segment <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">)</span> <span class="token operator">&lt;&lt;</span> <span class="token number">32</span><span class="token punctuation">)</span> <span class="token operator">|</span> <span class="token punctuation">(</span>offset <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">)</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">pub</span> <span class="token 
keyword">const</span> <span class="token keyword">fn</span> <span class="token function-definition function">segment</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">u32</span> <span class="token punctuation">&#123;</span>        <span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span><span class="token number">0</span> <span class="token operator">>></span> <span class="token number">32</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u32</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token keyword">fn</span> <span class="token function-definition function">offset</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">u32</span> <span class="token punctuation">&#123;</span>        <span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span><span class="token number">0</span> <span class="token operator">&amp;</span> <span class="token number">0xFFFFFFFF</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u32</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><p><strong>LSN ordering guarantee:</strong></p><pre class="language-none"><code class="language-none">LSN 100: BEGIN txn 1
LSN 101: INSERT row A (txn 1)
LSN 102: INSERT row B (txn 1)
LSN 103: COMMIT txn 1
LSN 104: BEGIN txn 2
LSN 105: UPDATE row A (txn 2)
...

LSN 100 &lt; LSN 101 &lt; LSN 102 &lt; ...  (strictly increasing)</code></pre><hr /><h2 id="2-WAL-记录格式">2 WAL Record Format</h2><h3 id="记录结构">Record Structure</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wal/record.rs</span><span class="token keyword">use</span> <span class="token keyword">crate</span><span class="token module-declaration namespace"><span class="token punctuation">::</span>storage<span class="token punctuation">::</span></span><span class="token class-name">PageId</span><span class="token punctuation">;</span><span class="token keyword">use</span> <span class="token keyword">crate</span><span class="token module-declaration namespace"><span class="token punctuation">::</span>transaction<span class="token punctuation">::</span></span><span class="token class-name">TransactionId</span><span class="token punctuation">;</span><span class="token keyword">use</span> <span class="token keyword">crate</span><span class="token module-declaration namespace"><span class="token punctuation">::</span>wal<span class="token punctuation">::</span>lsn<span class="token punctuation">::</span></span><span class="token class-name">Lsn</span><span class="token punctuation">;</span><span class="token attribute attr-name">#[derive(Debug, Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">WalRecord</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> lsn<span class="token punctuation">:</span> <span class="token class-name">Lsn</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> prev_lsn<span class="token punctuation">:</span> <span class="token class-name">Lsn</span><span class="token punctuation">,</span>  <span class="token comment">// Link to previous record from same transaction</span>    <span class="token keyword">pub</span> transaction_id<span class="token punctuation">:</span> <span class="token class-name">TransactionId</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> record_type<span class="token punctuation">:</span> <span class="token class-name">WalRecordType</span><span class="token punctuation">,</span>    <span class="token 
keyword">pub</span> page_id<span class="token punctuation">:</span> <span class="token class-name">PageId</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> offset<span class="token punctuation">:</span> <span class="token keyword">u16</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> data<span class="token punctuation">:</span> <span class="token class-name">WalData</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> checksum<span class="token punctuation">:</span> <span class="token keyword">u32</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">WalRecordType</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">Begin</span><span class="token punctuation">,</span>    <span class="token class-name">Insert</span><span class="token punctuation">,</span>    <span class="token class-name">Update</span> <span class="token punctuation">&#123;</span> before_image<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">></span><span class="token punctuation">,</span> after_image<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">></span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>    <span class="token class-name">Delete</span> <span class="token punctuation">&#123;</span> before_image<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token 
operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">></span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>    <span class="token class-name">Commit</span><span class="token punctuation">,</span>    <span class="token class-name">Abort</span><span class="token punctuation">,</span>    <span class="token class-name">CheckpointBegin</span><span class="token punctuation">,</span>    <span class="token class-name">CheckpointEnd</span><span class="token punctuation">,</span>    <span class="token class-name">Compensation</span> <span class="token punctuation">&#123;</span> undo_next_lsn<span class="token punctuation">:</span> <span class="token class-name">Lsn</span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>  <span class="token comment">// For aborts</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">WalData</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">PageImage</span><span class="token punctuation">(</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">></span><span class="token punctuation">)</span><span class="token punctuation">,</span>      <span class="token comment">// Full page image (for checkpoints)</span>    <span class="token class-name">RowData</span><span class="token punctuation">(</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">></span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token comment">// Row-level change</span>    <span class="token 
class-name">IndexEntry</span> <span class="token punctuation">&#123;</span> key<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">></span><span class="token punctuation">,</span> value<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">></span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Physical layout on disk:</strong></p><pre class="language-none"><code class="language-none">┌─────────────────────────────────────────────────────────────┐
│ WAL Segment File (e.g., 000000010000000000000001)           │
├─────────────────────────────────────────────────────────────┤
│ PageHeader (16 bytes)                                       │
├─────────────────────────────────────────────────────────────┤
│ Record 1:                                                   │
│   ├─ Length (4 bytes)                                       │
│   ├─ LSN (8 bytes)                                          │
│   ├─ Prev LSN (8 bytes)                                     │
│   ├─ Transaction ID (4 bytes)                               │
│   ├─ Record Type (1 byte)                                   │
│   ├─ Page ID (8 bytes)                                      │
│   ├─ Offset (2 bytes)                                       │
│   ├─ Data Length (4 bytes)                                  │
│   ├─ Data (variable)                                        │
│   └─ Checksum (4 bytes)                                     │
├─────────────────────────────────────────────────────────────┤
│ Record 2:                                                   │
│   ...                                                       │
└─────────────────────────────────────────────────────────────┘</code></pre><hr /><h3 id="实体-vs-逻辑-WAL">Physical vs. Logical WAL</h3><p><strong>PostgreSQL uses physical WAL</strong> (page-level changes):</p><table><thead><tr><th>Type</th><th>What It Records</th><th>Pros</th><th>Cons</th></tr></thead><tbody><tr><td><strong>Physical</strong></td><td>Byte ranges on a page</td><td>Simple replay, exact changes</td><td>Larger logs, tied to the page format</td></tr><tr><td><strong>Logical</strong></td><td>SQL operations (INSERT/UPDATE)</td><td>Smaller logs, format-independent</td><td>Complex replay, operations must be re-executed</td></tr></tbody></table><p><strong>Vaultgres approach:</strong> physical WAL for simplicity (matching PostgreSQL):</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">PhysicalWalRecord</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> page_id<span class="token punctuation">:</span> <span class="token class-name">PageId</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> offset<span class="token punctuation">:</span> <span class="token keyword">u16</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> length<span class="token punctuation">:</span> <span class="token keyword">u16</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> before_image<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">>></span><span class="token punctuation">,</span>  <span class="token comment">// For undo</span>    <span class="token keyword">pub</span> after_image<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token 
operator">></span><span class="token punctuation">,</span>            <span class="token comment">// For redo</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="3-WAL-管理器实现">3 WAL Manager Implementation</h2><h3 id="写入-WAL-记录">Writing WAL Records</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wal/manager.rs</span><span class="token keyword">use</span> <span class="token namespace">std<span class="token punctuation">::</span>fs<span class="token punctuation">::</span></span><span class="token punctuation">&#123;</span><span class="token class-name">File</span><span class="token punctuation">,</span> <span class="token class-name">OpenOptions</span><span class="token punctuation">&#125;</span><span class="token punctuation">;</span><span class="token keyword">use</span> <span class="token namespace">std<span class="token punctuation">::</span>io<span class="token punctuation">::</span></span><span class="token punctuation">&#123;</span><span class="token class-name">Write</span><span class="token punctuation">,</span> <span class="token class-name">Seek</span><span class="token punctuation">,</span> <span class="token class-name">SeekFrom</span><span class="token punctuation">&#125;</span><span class="token punctuation">;</span><span class="token keyword">use</span> <span class="token namespace">std<span class="token punctuation">::</span>path<span class="token punctuation">::</span></span><span class="token class-name">PathBuf</span><span class="token punctuation">;</span><span class="token keyword">use</span> <span class="token namespace">std<span class="token punctuation">::</span>sync<span class="token punctuation">::</span>atomic<span class="token punctuation">::</span></span><span class="token class-name">AtomicU64</span><span class="token punctuation">;</span><span class="token keyword">use</span> <span class="token namespace">parking_lot<span class="token punctuation">::</span></span><span class="token class-name">Mutex</span><span class="token punctuation">;</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">WalManager</span> <span class="token punctuation">&#123;</span>    wal_dir<span class="token 
punctuation">:</span> <span class="token class-name">PathBuf</span><span class="token punctuation">,</span>    current_segment<span class="token punctuation">:</span> <span class="token keyword">u32</span><span class="token punctuation">,</span>    current_file<span class="token punctuation">:</span> <span class="token class-name">Mutex</span><span class="token operator">&lt;</span><span class="token class-name">File</span><span class="token operator">></span><span class="token punctuation">,</span>    current_offset<span class="token punctuation">:</span> <span class="token class-name">Mutex</span><span class="token operator">&lt;</span><span class="token keyword">u32</span><span class="token operator">></span><span class="token punctuation">,</span>    flush_lsn<span class="token punctuation">:</span> <span class="token class-name">Mutex</span><span class="token operator">&lt;</span><span class="token class-name">Lsn</span><span class="token operator">></span><span class="token punctuation">,</span>    last_lsn<span class="token punctuation">:</span> <span class="token class-name">AtomicU64</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">WalManager</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">new</span><span class="token punctuation">(</span>wal_dir<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token keyword">Self</span><span class="token punctuation">,</span> <span class="token class-name">WalError</span><span class="token operator">></span> <span class="token 
punctuation">&#123;</span>        <span class="token namespace">std<span class="token punctuation">::</span>fs<span class="token punctuation">::</span></span><span class="token function">create_dir_all</span><span class="token punctuation">(</span>wal_dir<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> manager <span class="token operator">=</span> <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>            wal_dir<span class="token punctuation">:</span> <span class="token class-name">PathBuf</span><span class="token punctuation">::</span><span class="token function">from</span><span class="token punctuation">(</span>wal_dir<span class="token punctuation">)</span><span class="token punctuation">,</span>            current_segment<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>            current_file<span class="token punctuation">:</span> <span class="token class-name">Mutex</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token class-name">File</span><span class="token punctuation">::</span><span class="token function">create</span><span class="token punctuation">(</span><span class="token string">""</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">)</span><span class="token punctuation">,</span>  <span class="token comment">// Placeholder</span>            current_offset<span class="token punctuation">:</span> <span class="token class-name">Mutex</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span><span 
class="token punctuation">,</span>            flush_lsn<span class="token punctuation">:</span> <span class="token class-name">Mutex</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token class-name">Lsn</span><span class="token punctuation">::</span><span class="token constant">INVALID</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            last_lsn<span class="token punctuation">:</span> <span class="token class-name">AtomicU64</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        manager<span class="token punctuation">.</span><span class="token function">open_or_create_segment</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span>manager<span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">open_or_create_segment</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">,</span> segment<span class="token punctuation">:</span> <span class="token keyword">u32</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token 
punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">WalError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> path <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>wal_dir<span class="token punctuation">.</span><span class="token function">join</span><span class="token punctuation">(</span><span class="token macro property">format!</span><span class="token punctuation">(</span><span class="token string">"&#123;:024X&#125;"</span><span class="token punctuation">,</span> segment<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> file <span class="token operator">=</span> <span class="token class-name">OpenOptions</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span>            <span class="token punctuation">.</span><span class="token function">read</span><span class="token punctuation">(</span><span class="token boolean">true</span><span class="token punctuation">)</span>            <span class="token punctuation">.</span><span class="token function">write</span><span class="token punctuation">(</span><span class="token boolean">true</span><span class="token punctuation">)</span>            <span class="token punctuation">.</span><span class="token function">create</span><span class="token punctuation">(</span><span class="token boolean">true</span><span class="token punctuation">)</span>            <span class="token punctuation">.</span><span class="token function">open</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>path<span class="token punctuation">)</span><span 
class="token operator">?</span><span class="token punctuation">;</span>        <span class="token operator">*</span><span class="token keyword">self</span><span class="token punctuation">.</span>current_file<span class="token punctuation">.</span><span class="token function">lock</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">=</span> file<span class="token punctuation">;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>current_segment <span class="token operator">=</span> segment<span class="token punctuation">;</span>        <span class="token operator">*</span><span class="token keyword">self</span><span class="token punctuation">.</span>current_offset<span class="token punctuation">.</span><span class="token function">lock</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">=</span> <span class="token number">0</span><span class="token punctuation">;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">append</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> record<span class="token punctuation">:</span> <span class="token class-name">WalRecord</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Lsn</span><span class="token punctuation">,</span> <span class="token 
class-name">WalError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Assign new LSN</span>        <span class="token keyword">let</span> lsn <span class="token operator">=</span> <span class="token class-name">Lsn</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>last_lsn<span class="token punctuation">.</span><span class="token function">fetch_add</span><span class="token punctuation">(</span><span class="token number">1</span><span class="token punctuation">,</span> <span class="token class-name">Ordering</span><span class="token punctuation">::</span><span class="token class-name">SeqCst</span><span class="token punctuation">)</span> <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token comment">// Serialize record</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> buffer <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">serialize_record</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>record<span class="token punctuation">,</span> lsn<span class="token punctuation">,</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> buffer<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span 
class="token comment">// Check if we need a new segment</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> offset <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>current_offset<span class="token punctuation">.</span><span class="token function">lock</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">if</span> <span class="token operator">*</span>offset <span class="token operator">+</span> buffer<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u32</span> <span class="token operator">></span> <span class="token constant">SEGMENT_SIZE</span> <span class="token punctuation">&#123;</span>            <span class="token function">drop</span><span class="token punctuation">(</span>offset<span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">rotate_segment</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>            offset <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>current_offset<span class="token punctuation">.</span><span class="token function">lock</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token comment">// Write to file</span>        <span class="token keyword">let</span> <span class="token 
keyword">mut</span> file <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>current_file<span class="token punctuation">.</span><span class="token function">lock</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        file<span class="token punctuation">.</span><span class="token function">seek</span><span class="token punctuation">(</span><span class="token class-name">SeekFrom</span><span class="token punctuation">::</span><span class="token class-name">Start</span><span class="token punctuation">(</span><span class="token operator">*</span>offset <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        file<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>buffer<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token operator">*</span>offset <span class="token operator">+=</span> buffer<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u32</span><span class="token punctuation">;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span>lsn<span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">flush</span><span class="token punctuation">(</span><span class="token 
operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> target_lsn<span class="token punctuation">:</span> <span class="token class-name">Lsn</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">WalError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> flush_lsn <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>flush_lsn<span class="token punctuation">.</span><span class="token function">lock</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token comment">// Already flushed?</span>        <span class="token keyword">if</span> <span class="token operator">*</span>flush_lsn <span class="token operator">>=</span> target_lsn <span class="token punctuation">&#123;</span>            <span class="token keyword">return</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token comment">// Sync to disk</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>current_file<span class="token punctuation">.</span><span class="token function">lock</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span 
class="token function">sync_all</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token operator">*</span>flush_lsn <span class="token operator">=</span> target_lsn<span class="token punctuation">;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="读取-WAL-记录">Reading WAL Records</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token keyword">impl</span> <span class="token class-name">WalManager</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">read_from</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> start_lsn<span class="token punctuation">:</span> <span class="token class-name">Lsn</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">WalIterator</span><span class="token punctuation">,</span> <span class="token class-name">WalError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> segment <span class="token operator">=</span> start_lsn<span class="token punctuation">.</span><span class="token function">segment</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        
<span class="token keyword">let</span> offset <span class="token operator">=</span> start_lsn<span class="token punctuation">.</span><span class="token function">offset</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> path <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>wal_dir<span class="token punctuation">.</span><span class="token function">join</span><span class="token punctuation">(</span><span class="token macro property">format!</span><span class="token punctuation">(</span><span class="token string">"&#123;:024X&#125;"</span><span class="token punctuation">,</span> segment<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> file <span class="token operator">=</span> <span class="token class-name">File</span><span class="token punctuation">::</span><span class="token function">open</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>path<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">WalIterator</span> <span class="token punctuation">&#123;</span>            current_file<span class="token punctuation">:</span> file<span class="token punctuation">,</span>            current_segment<span class="token punctuation">:</span> segment<span class="token punctuation">,</span>            current_offset<span class="token punctuation">:</span> offset<span class="token punctuation">,</span>            wal_dir<span class="token punctuation">:</span> <span class="token keyword">self</span><span class="token punctuation">.</span>wal_dir<span 
class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">WalIterator</span> <span class="token punctuation">&#123;</span>    current_file<span class="token punctuation">:</span> <span class="token class-name">File</span><span class="token punctuation">,</span>    current_segment<span class="token punctuation">:</span> <span class="token keyword">u32</span><span class="token punctuation">,</span>    current_offset<span class="token punctuation">:</span> <span class="token keyword">u32</span><span class="token punctuation">,</span>    wal_dir<span class="token punctuation">:</span> <span class="token class-name">PathBuf</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">Iterator</span> <span class="token keyword">for</span> <span class="token class-name">WalIterator</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">type</span> <span class="token type-definition class-name">Item</span> <span class="token operator">=</span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">WalRecord</span><span class="token punctuation">,</span> <span class="token class-name">WalError</span><span class="token operator">></span><span class="token punctuation">;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">next</span><span class="token punctuation">(</span><span class="token 
operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token keyword">Self</span><span class="token punctuation">::</span><span class="token class-name">Item</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Try to read record at current position</span>        <span class="token keyword">match</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">read_record</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span>record<span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Ok</span><span class="token punctuation">(</span>record<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">None</span><span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token comment">// End of segment, try next segment</span>                <span class="token keyword">self</span><span class="token punctuation">.</span>current_segment <span class="token operator">+=</span> <span class="token number">1</span><span class="token 
punctuation">;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span>current_offset <span class="token operator">=</span> <span class="token number">0</span><span class="token punctuation">;</span>                <span class="token keyword">let</span> path <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>wal_dir<span class="token punctuation">.</span><span class="token function">join</span><span class="token punctuation">(</span><span class="token macro property">format!</span><span class="token punctuation">(</span><span class="token string">"&#123;:024X&#125;"</span><span class="token punctuation">,</span> <span class="token keyword">self</span><span class="token punctuation">.</span>current_segment<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span>current_file <span class="token operator">=</span> <span class="token class-name">File</span><span class="token punctuation">::</span><span class="token function">open</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>path<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">ok</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">read_record</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">transpose</span><span class="token punctuation">(</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Err</span><span class="token punctuation">(</span>e<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Err</span><span class="token punctuation">(</span>e<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="4-检查点：限制恢复时间">4 Checkpoints: Bounding Recovery Time</h2><h3 id="问题：无边界的-replay">The Problem: Unbounded Replay</h3><p><strong>Without checkpoints:</strong></p><pre class="language-none"><code class="language-none">Day 1: Database created
Day 30: Crash!
Recovery: Replay 30 days of WAL records 😱</code></pre><p><strong>Solution:</strong> take checkpoints periodically.</p><hr /><h3 id="检查点流程">Checkpoint Flow</h3><pre class="language-MERMAID_BASE64_612" data-language="MERMAID_BASE64_612"><code 
class="language-MERMAID_BASE64_612">c2VxdWVuY2VEaWFncmFtCiAgICBwYXJ0aWNpcGFudCBEQiBhcyBEYXRhYmFzZQogICAgcGFydGljaXBhbnQgV0FMIGFzIFdBTCBNYW5hZ2VyCiAgICBwYXJ0aWNpcGFudCBCUCBhcyBCdWZmZXIgUG9vbAogICAgcGFydGljaXBhbnQgQ0tQVCBhcyBDaGVja3BvaW50IEZpbGUKCiAgICBEQi0+PkRCOiAxLiBXcml0ZSBDSEVDS1BPSU5UX0JFR0lOIHJlY29yZAogICAgREItPj5CUDogMi4gRmx1c2ggYWxsIGRpcnR5IHBhZ2VzCiAgICBCUC0tPj5EQjogUGFnZXMgd3JpdHRlbiB0byBkaXNrCiAgICBEQi0+PkRCOiAzLiBXcml0ZSBDSEVDS1BPSU5UX0VORCByZWNvcmQKICAgIERCLT4+V0FMOiA0LiBGbHVzaCBXQUwgdG8gZGlzawogICAgREItPj5DS1BUOiA1LiBTYXZlIGNoZWNrcG9pbnQgTFNOCiAgICBDS1BULS0+PkRCOiBDaGVja3BvaW50IGNvbXBsZXRl</code></pre><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wal/checkpoint.rs</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Checkpoint</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> checkpoint_lsn<span class="token punctuation">:</span> <span class="token class-name">Lsn</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> active_transactions<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">TransactionId</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> dirty_pages<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token class-name">PageId</span><span class="token punctuation">,</span> <span class="token class-name">Lsn</span><span class="token punctuation">)</span><span class="token operator">></span><span class="token punctuation">,</span>  <span class="token comment">// page_id → page_lsn</span><span class="token 
punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">WalManager</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">create_checkpoint</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> buffer_pool<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">BufferPool</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Checkpoint</span><span class="token punctuation">,</span> <span class="token class-name">WalError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// 1. Log checkpoint begin</span>        <span class="token keyword">let</span> begin_lsn <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">append</span><span class="token punctuation">(</span><span class="token class-name">WalRecord</span><span class="token punctuation">::</span><span class="token function">checkpoint_begin</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token comment">// 2. 
Flush all dirty pages (this is the expensive part!)</span>        buffer_pool<span class="token punctuation">.</span><span class="token function">flush_all</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token comment">// 3. Get current state</span>        <span class="token keyword">let</span> checkpoint <span class="token operator">=</span> <span class="token class-name">Checkpoint</span> <span class="token punctuation">&#123;</span>            checkpoint_lsn<span class="token punctuation">:</span> begin_lsn<span class="token punctuation">,</span>            active_transactions<span class="token punctuation">:</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">get_active_transactions</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            dirty_pages<span class="token punctuation">:</span> buffer_pool<span class="token punctuation">.</span><span class="token function">get_dirty_pages</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token comment">// 4. 
Log checkpoint end</span>        <span class="token keyword">let</span> end_lsn <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">append</span><span class="token punctuation">(</span><span class="token class-name">WalRecord</span><span class="token punctuation">::</span><span class="token function">checkpoint_end</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>checkpoint<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token comment">// 5. Flush WAL</span>        <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">flush</span><span class="token punctuation">(</span>end_lsn<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token comment">// 6. 
Save checkpoint to known location</span>        <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">save_checkpoint_record</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>checkpoint<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span>checkpoint<span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="Fuzzy-检查点-PostgreSQL-风格">Fuzzy Checkpoints (PostgreSQL Style)</h3><p><strong>Sharp checkpoint:</strong> blocks all writes for the duration of the checkpoint. Simple, but it causes stalls.</p><p><strong>Fuzzy checkpoint:</strong> lets writes continue while the checkpoint runs. More complex, but no stalls.</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// Fuzzy checkpoint approach</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">create_fuzzy_checkpoint</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> buffer_pool<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">BufferPool</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Checkpoint</span><span class="token punctuation">,</span> <span class="token class-name">WalError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token comment">// 1. 
Record checkpoint start LSN</span>    <span class="token keyword">let</span> start_lsn <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">current_lsn</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// 2. Write CHECKPOINT_BEGIN (don't block)</span>    <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">append</span><span class="token punctuation">(</span><span class="token class-name">WalRecord</span><span class="token punctuation">::</span><span class="token function">checkpoint_begin</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token comment">// 3. Get list of dirty pages (snapshot)</span>    <span class="token keyword">let</span> dirty_pages <span class="token operator">=</span> buffer_pool<span class="token punctuation">.</span><span class="token function">get_dirty_pages_snapshot</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// 4. 
Flush dirty pages in background (don't block new writes)</span>    <span class="token keyword">let</span> checkpoint_lsn <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">current_lsn</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token keyword">for</span> <span class="token punctuation">(</span>page_id<span class="token punctuation">,</span> page_lsn<span class="token punctuation">)</span> <span class="token keyword">in</span> dirty_pages <span class="token punctuation">&#123;</span>        <span class="token keyword">if</span> page_lsn <span class="token operator">&lt;</span> checkpoint_lsn <span class="token punctuation">&#123;</span>            buffer_pool<span class="token punctuation">.</span><span class="token function">flush_page</span><span class="token punctuation">(</span>page_id<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">// 5. 
Write CHECKPOINT_END with final LSN</span>    <span class="token keyword">let</span> end_lsn <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">current_lsn</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">append</span><span class="token punctuation">(</span><span class="token class-name">WalRecord</span><span class="token punctuation">::</span><span class="token function">checkpoint_end_at</span><span class="token punctuation">(</span>end_lsn<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">flush</span><span class="token punctuation">(</span>end_lsn<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Checkpoint</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Recovery must scan from the checkpoint's begin LSN: pages dirtied while the checkpoint ran may not be flushed yet</span>        checkpoint_lsn<span class="token punctuation">:</span> start_lsn<span class="token punctuation">,</span>        <span class="token comment">// ...</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="5-ARIES：恢复算法">5 ARIES: The Recovery Algorithm</h2><h3 id="三个阶段">The Three Phases</h3><p>ARIES = <strong>A</strong>lgorithms for <strong>R</strong>ecovery and <strong>I</strong>solation <strong>E</strong>xploiting <strong>S</strong>emantics</p><pre class="language-none"><code 
class="language-none">┌─────────────────────────────────────────────────────────────┐│                    ARIES Recovery                           │├─────────────────────────────────────────────────────────────┤│ Phase 1: ANALYSIS                                           ││ - Scan WAL from last checkpoint                             ││ - Determine which transactions were active at crash         ││ - Build dirty page table                                    │├─────────────────────────────────────────────────────────────┤│ Phase 2: REDO                                               ││ - Replay ALL logged changes from oldest dirty-page LSN      ││ - Bring database to exact crash state                       ││ - Skip pages already on disk (using page LSN)               │├─────────────────────────────────────────────────────────────┤│ Phase 3: UNDO                                               ││ - Roll back all uncommitted transactions                    ││ - Write Compensation Log Records (CLRs)                     ││ - Database is now consistent                                │└─────────────────────────────────────────────────────────────┘</code></pre><hr /><h3 id="第一阶段：Analysis">Phase 1: Analysis</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/recovery/analysis.rs</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">AnalysisResult</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> transactions_at_crash<span class="token punctuation">:</span> <span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token class-name">TransactionId</span><span class="token punctuation">,</span> <span class="token class-name">TxnStatus</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> 
dirty_page_table<span class="token punctuation">:</span> <span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token class-name">PageId</span><span class="token punctuation">,</span> <span class="token class-name">Lsn</span><span class="token operator">></span><span class="token punctuation">,</span>  <span class="token comment">// page → first LSN that dirtied it</span>    <span class="token keyword">pub</span> redo_start_lsn<span class="token punctuation">:</span> <span class="token class-name">Lsn</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">analyze</span><span class="token punctuation">(</span>wal<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">WalManager</span><span class="token punctuation">,</span> checkpoint<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Checkpoint</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">AnalysisResult</span><span class="token punctuation">,</span> <span class="token class-name">RecoveryError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token keyword">let</span> <span class="token keyword">mut</span> txn_status<span class="token punctuation">:</span> <span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token class-name">TransactionId</span><span class="token punctuation">,</span> <span class="token class-name">TxnStatus</span><span class="token operator">></span> <span class="token operator">=</span> <span class="token 
class-name">HashMap</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token keyword">let</span> <span class="token keyword">mut</span> dirty_page_table<span class="token punctuation">:</span> <span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token class-name">PageId</span><span class="token punctuation">,</span> <span class="token class-name">Lsn</span><span class="token operator">></span> <span class="token operator">=</span> <span class="token class-name">HashMap</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// Initialize from checkpoint</span>    <span class="token keyword">for</span> txn <span class="token keyword">in</span> <span class="token operator">&amp;</span>checkpoint<span class="token punctuation">.</span>active_transactions <span class="token punctuation">&#123;</span>        txn_status<span class="token punctuation">.</span><span class="token function">insert</span><span class="token punctuation">(</span><span class="token operator">*</span>txn<span class="token punctuation">,</span> <span class="token class-name">TxnStatus</span><span class="token punctuation">::</span><span class="token class-name">Active</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">for</span> <span class="token punctuation">(</span>page_id<span class="token punctuation">,</span> page_lsn<span class="token punctuation">)</span> <span class="token keyword">in</span> <span class="token operator">&amp;</span>checkpoint<span class="token 
punctuation">.</span>dirty_pages <span class="token punctuation">&#123;</span>        dirty_page_table<span class="token punctuation">.</span><span class="token function">insert</span><span class="token punctuation">(</span><span class="token operator">*</span>page_id<span class="token punctuation">,</span> <span class="token operator">*</span>page_lsn<span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">// Scan WAL from checkpoint</span>    <span class="token keyword">let</span> <span class="token keyword">mut</span> iterator <span class="token operator">=</span> wal<span class="token punctuation">.</span><span class="token function">read_from</span><span class="token punctuation">(</span>checkpoint<span class="token punctuation">.</span>checkpoint_lsn<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token keyword">for</span> record_result <span class="token keyword">in</span> iterator <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> record <span class="token operator">=</span> record_result<span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token keyword">match</span> record<span class="token punctuation">.</span>record_type <span class="token punctuation">&#123;</span>            <span class="token class-name">WalRecordType</span><span class="token punctuation">::</span><span class="token class-name">Begin</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                txn_status<span class="token punctuation">.</span><span class="token function">insert</span><span class="token punctuation">(</span>record<span class="token punctuation">.</span>transaction_id<span class="token punctuation">,</span> <span class="token 
class-name">TxnStatus</span><span class="token punctuation">::</span><span class="token class-name">Active</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">WalRecordType</span><span class="token punctuation">::</span><span class="token class-name">Commit</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                txn_status<span class="token punctuation">.</span><span class="token function">insert</span><span class="token punctuation">(</span>record<span class="token punctuation">.</span>transaction_id<span class="token punctuation">,</span> <span class="token class-name">TxnStatus</span><span class="token punctuation">::</span><span class="token class-name">Committed</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">WalRecordType</span><span class="token punctuation">::</span><span class="token class-name">Abort</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                txn_status<span class="token punctuation">.</span><span class="token function">insert</span><span class="token punctuation">(</span>record<span class="token punctuation">.</span>transaction_id<span class="token punctuation">,</span> <span class="token class-name">TxnStatus</span><span class="token punctuation">::</span><span class="token class-name">Aborted</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">WalRecordType</span><span class="token punctuation">::</span><span class="token class-name">Insert</span> <span class="token operator">|</span> <span class="token class-name">WalRecordType</span><span 
class="token punctuation">::</span><span class="token class-name">Update</span> <span class="token operator">|</span> <span class="token class-name">WalRecordType</span><span class="token punctuation">::</span><span class="token class-name">Delete</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token comment">// Track first LSN that dirtied each page</span>                dirty_page_table<span class="token punctuation">.</span><span class="token function">entry</span><span class="token punctuation">(</span>record<span class="token punctuation">.</span>page_id<span class="token punctuation">)</span>                    <span class="token punctuation">.</span><span class="token function">or_insert</span><span class="token punctuation">(</span>record<span class="token punctuation">.</span>lsn<span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>            _ <span class="token operator">=></span> <span class="token punctuation">&#123;</span><span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">// Find minimum redo LSN</span>    <span class="token keyword">let</span> redo_start_lsn <span class="token operator">=</span> dirty_page_table<span class="token punctuation">.</span><span class="token function">values</span><span class="token punctuation">(</span><span class="token punctuation">)</span>        <span class="token punctuation">.</span><span class="token function">min</span><span class="token punctuation">(</span><span class="token punctuation">)</span>        <span class="token punctuation">.</span><span class="token function">copied</span><span class="token punctuation">(</span><span class="token punctuation">)</span>        <span class="token punctuation">.</span><span class="token 
function">unwrap_or</span><span class="token punctuation">(</span>checkpoint<span class="token punctuation">.</span>checkpoint_lsn<span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">AnalysisResult</span> <span class="token punctuation">&#123;</span>        transactions_at_crash<span class="token punctuation">:</span> txn_status<span class="token punctuation">,</span>        dirty_page_table<span class="token punctuation">,</span>        redo_start_lsn<span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="第二阶段：Redo">Phase 2: Redo</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/recovery/redo.rs</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">redo</span><span class="token punctuation">(</span>    wal<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">WalManager</span><span class="token punctuation">,</span>    buffer_pool<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">BufferPool</span><span class="token punctuation">,</span>    analysis<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">AnalysisResult</span><span class="token punctuation">,</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token 
class-name">RecoveryError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token keyword">let</span> <span class="token keyword">mut</span> iterator <span class="token operator">=</span> wal<span class="token punctuation">.</span><span class="token function">read_from</span><span class="token punctuation">(</span>analysis<span class="token punctuation">.</span>redo_start_lsn<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token keyword">for</span> record_result <span class="token keyword">in</span> iterator <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> record <span class="token operator">=</span> record_result<span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token comment">// Repeat history: redo ALL changes, committed or not</span>        <span class="token comment">// (uncommitted transactions are rolled back later, in the undo phase)</span>        <span class="token keyword">match</span> <span class="token operator">&amp;</span>record<span class="token punctuation">.</span>record_type <span class="token punctuation">&#123;</span>            <span class="token class-name">WalRecordType</span><span class="token punctuation">::</span><span class="token class-name">Insert</span> <span class="token operator">|</span> <span class="token class-name">WalRecordType</span><span class="token punctuation">::</span><span class="token class-name">Update</span> <span class="token operator">|</span> <span class="token class-name">WalRecordType</span><span class="token punctuation">::</span><span class="token class-name">Delete</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token comment">// Check if page needs redo</span>                <span class="token keyword">let</span> page <span class="token 
operator">=</span> buffer_pool<span class="token punctuation">.</span><span class="token function">get_page</span><span class="token punctuation">(</span>record<span class="token punctuation">.</span>page_id<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token keyword">let</span> page_lsn <span class="token operator">=</span> page<span class="token punctuation">.</span><span class="token function">get_lsn</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// Only apply if page is older than this LSN</span>                <span class="token keyword">if</span> page_lsn <span class="token operator">&lt;</span> record<span class="token punctuation">.</span>lsn <span class="token punctuation">&#123;</span>                    <span class="token function">apply_redo</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>record<span class="token punctuation">,</span> <span class="token operator">&amp;</span>page<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                    page<span class="token punctuation">.</span><span class="token function">set_lsn</span><span class="token punctuation">(</span>record<span class="token punctuation">.</span>lsn<span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token punctuation">&#125;</span>                <span class="token comment">// Else: page already has this change (written before crash)</span>            <span class="token punctuation">&#125;</span>            _ <span class="token operator">=></span> <span class="token punctuation">&#123;</span><span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>    <span 
class="token punctuation">&#125;</span>    <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">&#125;</span><span class="token keyword">fn</span> <span class="token function-definition function">apply_redo</span><span class="token punctuation">(</span>record<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">WalRecord</span><span class="token punctuation">,</span> page<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Page</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">RecoveryError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token keyword">match</span> <span class="token operator">&amp;</span>record<span class="token punctuation">.</span>data <span class="token punctuation">&#123;</span>        <span class="token class-name">WalData</span><span class="token punctuation">::</span><span class="token class-name">PageImage</span><span class="token punctuation">(</span>image<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>            <span class="token comment">// Full page image (from checkpoint)</span>            page<span class="token punctuation">.</span><span class="token function">copy_from_slice</span><span class="token punctuation">(</span>image<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token 
punctuation">&#125;</span>        <span class="token class-name">WalData</span><span class="token punctuation">::</span><span class="token class-name">RowData</span><span class="token punctuation">(</span>data<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>            <span class="token comment">// Partial page update</span>            page<span class="token punctuation">.</span><span class="token function">write</span><span class="token punctuation">(</span>record<span class="token punctuation">.</span>offset<span class="token punctuation">,</span> data<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        _ <span class="token operator">=></span> <span class="token punctuation">&#123;</span><span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Key insight:</strong> Redo is <strong>idempotent</strong>. A record is applied only when the page LSN is older than the record's LSN, so running redo again (for example, after a crash during recovery) produces the same result.</p><hr /><h3 id="第三阶段：Undo">Phase 3: Undo</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/recovery/undo.rs</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">undo</span><span class="token punctuation">(</span>    wal<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">WalManager</span><span class="token punctuation">,</span>    buffer_pool<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token 
class-name">BufferPool</span><span class="token punctuation">,</span>    analysis<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">AnalysisResult</span><span class="token punctuation">,</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">RecoveryError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token comment">// Find transactions to undo (active at crash, not committed)</span>    <span class="token keyword">let</span> losers<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">TransactionId</span><span class="token operator">></span> <span class="token operator">=</span> analysis<span class="token punctuation">.</span>transactions_at_crash        <span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span>        <span class="token punctuation">.</span><span class="token function">filter</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span><span class="token punctuation">(</span>_<span class="token punctuation">,</span> status<span class="token punctuation">)</span><span class="token closure-punctuation punctuation">|</span></span> <span class="token operator">*</span><span class="token operator">*</span>status <span class="token operator">==</span> <span class="token class-name">TxnStatus</span><span class="token punctuation">::</span><span class="token class-name">Active</span><span 
class="token punctuation">)</span>        <span class="token punctuation">.</span><span class="token function">map</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span><span class="token punctuation">(</span>txn<span class="token punctuation">,</span> _<span class="token punctuation">)</span><span class="token closure-punctuation punctuation">|</span></span> <span class="token operator">*</span>txn<span class="token punctuation">)</span>        <span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// Undo every loser transaction (none of them committed). HashMap iteration order is arbitrary; within each transaction, records are undone newest-first</span>    <span class="token keyword">for</span> txn_id <span class="token keyword">in</span> losers<span class="token punctuation">.</span><span class="token function">into_iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">rev</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>        <span class="token function">undo_transaction</span><span class="token punctuation">(</span>wal<span class="token punctuation">,</span> buffer_pool<span class="token punctuation">,</span> txn_id<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token punctuation">&#125;</span>    <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">&#125;</span><span class="token keyword">fn</span> <span class="token function-definition 
function">undo_transaction</span><span class="token punctuation">(</span>    wal<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">WalManager</span><span class="token punctuation">,</span>    buffer_pool<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">BufferPool</span><span class="token punctuation">,</span>    txn_id<span class="token punctuation">:</span> <span class="token class-name">TransactionId</span><span class="token punctuation">,</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">RecoveryError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token comment">// Find all records for this transaction (in reverse)</span>    <span class="token keyword">let</span> records <span class="token operator">=</span> wal<span class="token punctuation">.</span><span class="token function">get_transaction_records</span><span class="token punctuation">(</span>txn_id<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token keyword">for</span> record <span class="token keyword">in</span> records<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">rev</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Write 
Compensation Log Record (CLR)</span>        <span class="token keyword">let</span> clr <span class="token operator">=</span> <span class="token class-name">WalRecord</span> <span class="token punctuation">&#123;</span>            lsn<span class="token punctuation">:</span> wal<span class="token punctuation">.</span><span class="token function">next_lsn</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            prev_lsn<span class="token punctuation">:</span> record<span class="token punctuation">.</span>lsn<span class="token punctuation">,</span>            transaction_id<span class="token punctuation">:</span> txn_id<span class="token punctuation">,</span>            record_type<span class="token punctuation">:</span> <span class="token class-name">WalRecordType</span><span class="token punctuation">::</span><span class="token class-name">Compensation</span> <span class="token punctuation">&#123;</span>                undo_next_lsn<span class="token punctuation">:</span> record<span class="token punctuation">.</span>prev_lsn<span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>            page_id<span class="token punctuation">:</span> record<span class="token punctuation">.</span>page_id<span class="token punctuation">,</span>            offset<span class="token punctuation">:</span> record<span class="token punctuation">.</span>offset<span class="token punctuation">,</span>            data<span class="token punctuation">:</span> <span class="token class-name">WalData</span><span class="token punctuation">::</span><span class="token class-name">RowData</span><span class="token punctuation">(</span>record<span class="token punctuation">.</span>before_image<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token 
punctuation">)</span><span class="token punctuation">.</span><span class="token function">unwrap_or_default</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            checksum<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>  <span class="token comment">// Calculate checksum</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> clr_lsn <span class="token operator">=</span> wal<span class="token punctuation">.</span><span class="token function">append</span><span class="token punctuation">(</span>clr<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token comment">// Apply undo (restore before_image)</span>        <span class="token keyword">if</span> <span class="token keyword">let</span> <span class="token class-name">Some</span><span class="token punctuation">(</span>before_image<span class="token punctuation">)</span> <span class="token operator">=</span> <span class="token operator">&amp;</span>record<span class="token punctuation">.</span>before_image <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> page <span class="token operator">=</span> buffer_pool<span class="token punctuation">.</span><span class="token function">get_page</span><span class="token punctuation">(</span>record<span class="token punctuation">.</span>page_id<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>            page<span class="token punctuation">.</span><span class="token function">write</span><span class="token punctuation">(</span>record<span class="token punctuation">.</span>offset<span class="token 
punctuation">,</span> before_image<span class="token punctuation">)</span><span class="token punctuation">;</span>
            page<span class="token punctuation">.</span><span class="token function">set_lsn</span><span class="token punctuation">(</span>clr_lsn<span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token punctuation">&#125;</span>
    <span class="token punctuation">&#125;</span>
    <span class="token comment">// Log transaction abort</span>
    wal<span class="token punctuation">.</span><span class="token function">append</span><span class="token punctuation">(</span><span class="token class-name">WalRecord</span><span class="token punctuation">::</span><span class="token function">abort</span><span class="token punctuation">(</span>txn_id<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>
    wal<span class="token punctuation">.</span><span class="token function">flush_all</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>
    <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>
<span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="完整的恢复流程">Complete Recovery Flow</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/recovery/mod.rs</span>
<span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">recover</span><span class="token punctuation">(</span>
    wal<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token 
class-name">WalManager</span><span class="token punctuation">,</span>    buffer_pool<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">BufferPool</span><span class="token punctuation">,</span>    data_dir<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">,</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">RecoveryStats</span><span class="token punctuation">,</span> <span class="token class-name">RecoveryError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token comment">// 1. Find last checkpoint</span>    <span class="token keyword">let</span> checkpoint <span class="token operator">=</span> <span class="token function">find_last_checkpoint</span><span class="token punctuation">(</span>wal<span class="token punctuation">,</span> data_dir<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token macro property">println!</span><span class="token punctuation">(</span><span class="token string">"Starting recovery from checkpoint LSN &#123;&#125;"</span><span class="token punctuation">,</span> checkpoint<span class="token punctuation">.</span>checkpoint_lsn<span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// 2. 
Phase 1: Analysis</span>    <span class="token macro property">println!</span><span class="token punctuation">(</span><span class="token string">"Phase 1: Analysis..."</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token keyword">let</span> analysis <span class="token operator">=</span> <span class="token function">analyze</span><span class="token punctuation">(</span>wal<span class="token punctuation">,</span> <span class="token operator">&amp;</span>checkpoint<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token macro property">println!</span><span class="token punctuation">(</span><span class="token string">"  Found &#123;&#125; active transactions at crash"</span><span class="token punctuation">,</span>             analysis<span class="token punctuation">.</span>transactions_at_crash<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token macro property">println!</span><span class="token punctuation">(</span><span class="token string">"  Redo will start from LSN &#123;&#125;"</span><span class="token punctuation">,</span> analysis<span class="token punctuation">.</span>redo_start_lsn<span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// 3. 
Phase 2: Redo</span>    <span class="token macro property">println!</span><span class="token punctuation">(</span><span class="token string">"Phase 2: Redo..."</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token function">redo</span><span class="token punctuation">(</span>wal<span class="token punctuation">,</span> buffer_pool<span class="token punctuation">,</span> <span class="token operator">&amp;</span>analysis<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token macro property">println!</span><span class="token punctuation">(</span><span class="token string">"  Redo complete"</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// 4. Phase 3: Undo</span>    <span class="token macro property">println!</span><span class="token punctuation">(</span><span class="token string">"Phase 3: Undo..."</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token function">undo</span><span class="token punctuation">(</span>wal<span class="token punctuation">,</span> buffer_pool<span class="token punctuation">,</span> <span class="token operator">&amp;</span>analysis<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token macro property">println!</span><span class="token punctuation">(</span><span class="token string">"  Undo complete"</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// 5. 
Truncate old WAL (optional)</span>    <span class="token function">truncate_wal_before</span><span class="token punctuation">(</span>wal<span class="token punctuation">,</span> checkpoint<span class="token punctuation">.</span>checkpoint_lsn<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">RecoveryStats</span> <span class="token punctuation">&#123;</span>        checkpoint_lsn<span class="token punctuation">:</span> checkpoint<span class="token punctuation">.</span>checkpoint_lsn<span class="token punctuation">,</span>        redo_start_lsn<span class="token punctuation">:</span> analysis<span class="token punctuation">.</span>redo_start_lsn<span class="token punctuation">,</span>        transactions_aborted<span class="token punctuation">:</span> analysis<span class="token punctuation">.</span>transactions_at_crash            <span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span>            <span class="token punctuation">.</span><span class="token function">filter</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span><span class="token punctuation">(</span>_<span class="token punctuation">,</span> s<span class="token punctuation">)</span><span class="token closure-punctuation punctuation">|</span></span> <span class="token operator">*</span><span class="token operator">*</span>s <span class="token operator">==</span> <span class="token class-name">TxnStatus</span><span class="token punctuation">::</span><span class="token class-name">Active</span><span class="token punctuation">)</span>            <span class="token punctuation">.</span><span class="token function">count</span><span class="token 
punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>
<span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="6-恢复示例：逐步说明">6 Recovery Example: Step by Step</h2><h3 id="崩溃情景">Crash Scenario</h3><pre class="language-none"><code class="language-none">Time    LSN    Transaction    Action
─────────────────────────────────────────────────────
10:00   100    CKPT           Checkpoint created
10:01   101    T1 (xid&#x3D;1)     BEGIN
10:02   102    T1             INSERT row A (balance&#x3D;100)
10:03   103    T2 (xid&#x3D;2)     BEGIN
10:04   104    T2             INSERT row B (balance&#x3D;200)
10:05   105    T1             COMMIT
10:06   106    T2             UPDATE row B (balance&#x3D;250)
10:07   ─────  ⚡ CRASH ⚡</code></pre><p><strong>State at the time of the crash:</strong></p><ul><li>T1: Committed (LSN 105)</li><li>T2: Active (not committed)</li><li>Dirty pages: A (LSN 102), B (LSN 106)</li></ul><hr /><h3 id="恢复执行">Recovery Execution</h3><pre class="language-none"><code class="language-none">┌─────────────────────────────────────────────────────────────┐
│ Phase 1: ANALYSIS                                           │
├─────────────────────────────────────────────────────────────┤
│ Start from checkpoint LSN 100                               │
│ Scan records 100-106                                        │
│                                                             │
│ Result:                                                     │
│   - T1: Committed (LSN 105)                                 │
│   - T2: Active (loser!)                                     
│
│   - Dirty pages: A→102, B→106                               │
│   - Redo start: LSN 102                                     │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ Phase 2: REDO                                               │
├─────────────────────────────────────────────────────────────┤
│ Replay from LSN 102:                                        │
│                                                             │
│ LSN 102: INSERT row A                                       │
│   → Check page A LSN                                        │
│   → If page LSN &lt; 102: apply insert                         │
│   → Else: skip (already on disk)                            │
│                                                             │
│ LSN 104: INSERT row B                                       │
│   → Apply if needed                                         │
│                                                             │
│ LSN 106: UPDATE row B                                       │
│   → Apply if needed                                         │
│                                                             │
│ Result: Database at exact crash state                       │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ Phase 3: UNDO                                               │
├─────────────────────────────────────────────────────────────┤
│ Loser transactions: T2                                      │
│                                                             │
│ Undo T2 (in reverse order):                                 │
│   1. Undo LSN 106 (UPDATE B: 200→250)                       │
│      → Write CLR: undo_next_lsn &#x3D; 104                       │
│      → Restore B to balance&#x3D;200                             │
│                                                             │
│   2. 
Undo LSN 104 (INSERT B)                                │
│      → Write CLR: undo_next_lsn &#x3D; 103                       │
│      → Delete row B                                         │
│                                                             │
│   3. Log T2 ABORT                                           │
│                                                             │
│ Result: T2&#39;s changes rolled back, T1&#39;s changes preserved    │
└─────────────────────────────────────────────────────────────┘</code></pre><hr /><h2 id="7-WAL-归档和时间点恢复">7 WAL Archiving and Point-in-Time Recovery</h2><h3 id="WAL-归档">WAL Archiving</h3><p><strong>Continuous archiving:</strong> copy each WAL segment to safe storage before the segment is reused.</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wal/archiver.rs</span>
<span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">WalArchiver</span> <span class="token punctuation">&#123;</span>
    wal_dir<span class="token punctuation">:</span> <span class="token class-name">PathBuf</span><span class="token punctuation">,</span>
    archive_dir<span class="token punctuation">:</span> <span class="token class-name">PathBuf</span><span class="token punctuation">,</span>
    archive_timeout<span class="token punctuation">:</span> <span class="token class-name">Duration</span><span class="token punctuation">,</span>
    last_archived_segment<span class="token punctuation">:</span> <span class="token keyword">u32</span><span class="token punctuation">,</span>
<span class="token punctuation">&#125;</span>

<span class="token keyword">impl</span> <span class="token class-name">WalArchiver</span> <span class="token punctuation">&#123;</span>
    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">archive_ready_segments</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> 
<span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">WalError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Find completed segments (can't be overwritten)</span>        <span class="token keyword">for</span> segment <span class="token keyword">in</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">get_ready_segments</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> src <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>wal_dir<span class="token punctuation">.</span><span class="token function">join</span><span class="token punctuation">(</span><span class="token macro property">format!</span><span class="token punctuation">(</span><span class="token string">"&#123;:024X&#125;"</span><span class="token punctuation">,</span> segment<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token keyword">let</span> dst <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>archive_dir<span class="token punctuation">.</span><span class="token function">join</span><span class="token punctuation">(</span><span class="token macro property">format!</span><span class="token punctuation">(</span><span class="token string">"&#123;:024X&#125;.backup"</span><span class="token 
punctuation">,</span> segment<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
            <span class="token comment">// Copy to archive (could be remote storage like S3)</span>
            <span class="token namespace">std<span class="token punctuation">::</span>fs<span class="token punctuation">::</span></span><span class="token function">copy</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>src<span class="token punctuation">,</span> <span class="token operator">&amp;</span>dst<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>
            <span class="token keyword">self</span><span class="token punctuation">.</span>last_archived_segment <span class="token operator">=</span> segment<span class="token punctuation">;</span>
        <span class="token punctuation">&#125;</span>
        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>
    <span class="token punctuation">&#125;</span>
<span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="时间点恢复-PITR">Point-in-Time Recovery (PITR)</h3><pre class="language-none"><code class="language-none">Goal: Restore database to state at 2026-03-22 14:30:00
1. Restore base backup from 2026-03-22 00:00:00
2. Replay WAL segments from archive
3. Stop replay at target time (14:30:00)
4. 
Database restored to exact point in time</code></pre><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/recovery/pitr.rs</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">recover_to_point_in_time</span><span class="token punctuation">(</span>    base_backup<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">,</span>    archive_dir<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">,</span>    target_time<span class="token punctuation">:</span> <span class="token namespace">chrono<span class="token punctuation">::</span></span><span class="token class-name">DateTime</span><span class="token operator">&lt;</span><span class="token namespace">chrono<span class="token punctuation">::</span></span><span class="token class-name">Utc</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">RecoveryError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token comment">// 1. Restore base backup</span>    <span class="token function">restore_base_backup</span><span class="token punctuation">(</span>base_backup<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token comment">// 2. 
Find WAL segments to replay</span>    <span class="token keyword">let</span> segments <span class="token operator">=</span> <span class="token function">find_wal_segments_for_time_range</span><span class="token punctuation">(</span>archive_dir<span class="token punctuation">,</span> target_time<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token comment">// 3. Replay WAL up to target time</span>    <span class="token keyword">for</span> segment <span class="token keyword">in</span> segments <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> records <span class="token operator">=</span> <span class="token function">read_wal_segment</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>segment<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> record <span class="token keyword">in</span> records <span class="token punctuation">&#123;</span>            <span class="token comment">// Check if we've passed target time</span>            <span class="token keyword">if</span> record<span class="token punctuation">.</span>timestamp <span class="token operator">></span> target_time <span class="token punctuation">&#123;</span>                <span class="token macro property">println!</span><span class="token punctuation">(</span><span class="token string">"Reached target time, stopping recovery"</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">return</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token 
punctuation">&#125;</span>
            <span class="token function">apply_redo_record</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>record<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>
        <span class="token punctuation">&#125;</span>
    <span class="token punctuation">&#125;</span>
    <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>
<span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="8-用-Rust-构建的挑战">8 Challenges of Building in Rust</h2><h3 id="挑战-1：fsync-和持久性">Challenge 1: fsync and Durability</h3><p><strong>Problem:</strong> In Rust, <code>File::sync_all()</code> does the right thing, but it is easy to forget to call it.</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ Missing fsync - data NOT durable!</span>
<span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">commit</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> txn_id<span class="token punctuation">:</span> <span class="token class-name">TransactionId</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Error</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>
    <span class="token keyword">let</span> record <span class="token operator">=</span> <span class="token class-name">WalRecord</span><span class="token punctuation">::</span><span class="token 
function">commit</span><span class="token punctuation">(</span>txn_id<span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token keyword">self</span><span class="token punctuation">.</span>wal<span class="token punctuation">.</span><span class="token function">append</span><span class="token punctuation">(</span>record<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token comment">// Forgot to flush!</span>    <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">&#125;</span><span class="token comment">// ✅ Correct</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">commit</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> txn_id<span class="token punctuation">:</span> <span class="token class-name">TransactionId</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Error</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token keyword">let</span> record <span class="token operator">=</span> <span class="token class-name">WalRecord</span><span class="token punctuation">::</span><span class="token function">commit</span><span class="token punctuation">(</span>txn_id<span class="token punctuation">)</span><span class="token punctuation">;</span>    <span 
class="token keyword">let</span> lsn <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>wal<span class="token punctuation">.</span><span class="token function">append</span><span class="token punctuation">(</span>record<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>
    <span class="token keyword">self</span><span class="token punctuation">.</span>wal<span class="token punctuation">.</span><span class="token function">flush</span><span class="token punctuation">(</span>lsn<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>  <span class="token comment">// fsync!</span>
    <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>
<span class="token punctuation">&#125;</span></code></pre><p><strong>Lesson:</strong> Wrap WAL operations in a safe abstraction that forces the flush, so a commit cannot return before its record is durable.</p><hr /><h3 id="挑战-2：LSN-顺序和并发">Challenge 2: LSN Ordering and Concurrency</h3><p><strong>Problem:</strong> When multiple threads append WAL records, each one must receive a unique, monotonically increasing LSN.</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ Race condition - LSNs not ordered!</span>
<span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">append</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> record<span class="token punctuation">:</span> <span class="token class-name">WalRecord</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Lsn</span> <span class="token punctuation">&#123;</span>
    <span class="token keyword">let</span> lsn <span class="token operator">=</span> <span 
class="token keyword">self</span><span class="token punctuation">.</span>current_lsn <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">;</span>  <span class="token comment">// Not atomic!</span>    <span class="token keyword">self</span><span class="token punctuation">.</span>current_lsn <span class="token operator">=</span> lsn<span class="token punctuation">;</span>    <span class="token comment">// ...</span><span class="token punctuation">&#125;</span><span class="token comment">// ✅ Correct - atomic LSN allocation</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">append</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> record<span class="token punctuation">:</span> <span class="token class-name">WalRecord</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Lsn</span><span class="token punctuation">,</span> <span class="token class-name">WalError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token keyword">let</span> lsn <span class="token operator">=</span> <span class="token class-name">Lsn</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>last_lsn<span class="token punctuation">.</span><span class="token function">fetch_add</span><span class="token punctuation">(</span><span class="token number">1</span><span class="token punctuation">,</span> <span class="token class-name">Ordering</span><span class="token punctuation">::</span><span class="token 
class-name">SeqCst</span><span class="token punctuation">)</span> <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token comment">// ...</span>
<span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="挑战-3：部分写入和校验和">Challenge 3: Partial Writes and Checksums</h3><p><strong>Problem:</strong> A crash in the middle of a WAL write leaves a partial record on disk.</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// Solution: Checksums + length prefix</span>
<span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">serialize_record</span><span class="token punctuation">(</span>record<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">WalRecord</span><span class="token punctuation">,</span> buffer<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">></span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>
    <span class="token keyword">let</span> data_start <span class="token operator">=</span> buffer<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token comment">// Write placeholder for length (fill in later)</span>
    buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token number">0u8</span><span class="token punctuation">;</span> <span 
class="token number">4</span><span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// Write record fields</span>    buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>record<span class="token punctuation">.</span>lsn<span class="token punctuation">.</span><span class="token function">to_le_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// ... more fields ...</span>    buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>record<span class="token punctuation">.</span>data<span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// Calculate checksum over everything except checksum field</span>    <span class="token keyword">let</span> checksum <span class="token operator">=</span> <span class="token namespace">crc32<span class="token punctuation">::</span></span><span class="token function">calculate</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>buffer<span class="token punctuation">[</span>data_start <span class="token operator">+</span> <span class="token number">4</span><span class="token punctuation">..</span><span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>checksum<span class="token punctuation">.</span><span 
class="token function">to_le_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// Fill in length</span>    <span class="token keyword">let</span> total_len <span class="token operator">=</span> buffer<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">-</span> data_start<span class="token punctuation">;</span>    buffer<span class="token punctuation">[</span>data_start<span class="token punctuation">..</span>data_start <span class="token operator">+</span> <span class="token number">4</span><span class="token punctuation">]</span>        <span class="token punctuation">.</span><span class="token function">copy_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token punctuation">(</span>total_len <span class="token keyword">as</span> <span class="token keyword">u32</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_le_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span><span class="token punctuation">&#125;</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">deserialize_record</span><span class="token punctuation">(</span>buffer<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token keyword">u8</span><span class="token punctuation">]</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span 
class="token operator">&lt;</span><span class="token class-name">WalRecord</span><span class="token punctuation">,</span> <span class="token class-name">WalError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token comment">// Read length</span>    <span class="token keyword">let</span> length <span class="token operator">=</span> <span class="token keyword">u32</span><span class="token punctuation">::</span><span class="token function">from_le_bytes</span><span class="token punctuation">(</span>buffer<span class="token punctuation">[</span><span class="token number">0</span><span class="token punctuation">..</span><span class="token number">4</span><span class="token punctuation">]</span><span class="token punctuation">.</span><span class="token function">try_into</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">usize</span><span class="token punctuation">;</span>    <span class="token comment">// Verify we have enough data</span>    <span class="token keyword">if</span> buffer<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">&lt;</span> length <span class="token punctuation">&#123;</span>        <span class="token keyword">return</span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">WalError</span><span class="token punctuation">::</span><span class="token class-name">PartialWrite</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">// Verify checksum</span>    <span class="token keyword">let</span> stored_checksum 
<span class="token operator">=</span> <span class="token keyword">u32</span><span class="token punctuation">::</span><span class="token function">from_le_bytes</span><span class="token punctuation">(</span>        buffer<span class="token punctuation">[</span>length <span class="token operator">-</span> <span class="token number">4</span><span class="token punctuation">..</span>length<span class="token punctuation">]</span><span class="token punctuation">.</span><span class="token function">try_into</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span>    <span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token keyword">let</span> calculated <span class="token operator">=</span> <span class="token namespace">crc32<span class="token punctuation">::</span></span><span class="token function">calculate</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>buffer<span class="token punctuation">[</span><span class="token number">4</span><span class="token punctuation">..</span>length <span class="token operator">-</span> <span class="token number">4</span><span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token keyword">if</span> stored_checksum <span class="token operator">!=</span> calculated <span class="token punctuation">&#123;</span>        <span class="token keyword">return</span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">WalError</span><span class="token punctuation">::</span><span class="token class-name">ChecksumMismatch</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">// ... 
parse record ...</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="9-AI-如何加速这项工作">9 AI 如何加速这项工作</h2><h3 id="AI-做对了什么">AI 做对了什么</h3><table><thead><tr><th>任务</th><th>AI 贡献</th></tr></thead><tbody><tr><td><strong>ARIES 阶段</strong></td><td>清楚解释 analysis/redo/undo</td></tr><tr><td><strong>LSN 结构</strong></td><td>建议段/偏移编码</td></tr><tr><td><strong>检查点设计</strong></td><td>概述 fuzzy vs. sharp 权衡</td></tr><tr><td><strong>CLR 记录</strong></td><td>解释补偿日志记录目的</td></tr></tbody></table><hr /><h3 id="AI-做错了什么">AI 做错了什么</h3><table><thead><tr><th>问题</th><th>发生了什么</th></tr></thead><tbody><tr><td><strong>Redo 逻辑</strong></td><td>初稿只 redo 已提交事务（错误！Redo ALL，然后 undo）</td></tr><tr><td><strong>Undo 顺序</strong></td><td>建议正向顺序而不是反向（LIFO）</td></tr><tr><td><strong>Page LSN</strong></td><td>忽略了 page LSN 用于跳过冗余 redo</td></tr></tbody></table><p><strong>模式：</strong> ARIES 很微妙。「redo all, undo some」的洞察是反直觉的。</p><hr /><h3 id="示例：理解-Redo-哲学">示例：理解 Redo 哲学</h3><p><strong>我问 AI 的问题：</strong></p><blockquote><p>“为什么 ARIES redo 未提交的事务？我们不应该只 redo 已提交的吗？”</p></blockquote><p><strong>我学到的：</strong></p><ol><li><strong>Redo 阶段：</strong> 将数据库带到精确的崩溃状态（包括未提交的变更）</li><li><strong>Undo 阶段：</strong> 回滚未提交事务</li><li><strong>为什么？</strong> 比在 redo 期间追踪依赖关系更简单</li><li><strong>关键洞察：</strong> Redo 是幂等的，undo 必须记录（CLRs）</li></ol><p><strong>结果：</strong> 正确的 redo 实现：</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// Redo ALL records, not just committed</span><span class="token keyword">if</span> page_lsn <span class="token operator">&lt;</span> record<span class="token punctuation">.</span>lsn <span class="token punctuation">&#123;</span>    <span class="token function">apply_redo</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>record<span class="token punctuation">,</span> <span class="token operator">&amp;</span>page<span class="token punctuation">)</span><span class="token operator">?</span><span 
class="token punctuation">;</span>  <span class="token comment">// Apply regardless of txn status</span><span class="token punctuation">&#125;</span><span class="token comment">// Undo phase will handle uncommitted transactions</span></code></pre><hr /><h2 id="总结：WAL-和-ARIES-一张图">总结：WAL 和 ARIES 一张图</h2><pre class="language-MERMAID_BASE64_613" data-language="MERMAID_BASE64_613"><code class="language-MERMAID_BASE64_613">Zmxvd2NoYXJ0IFRECiAgICBzdWJncmFwaCAiTm9ybWFsIE9wZXJhdGlvbiIKICAgICAgICBBW1RyYW5zYWN0aW9uXSAtLT4gQltXcml0ZSBXQUwgUmVjb3JkXQogICAgICAgIEIgLS0+IENbRmx1c2ggV0FMIGZzeW5jXQogICAgICAgIEMgLS0+IERbTW9kaWZ5IFBhZ2VdCiAgICAgICAgRCAtLT4gRVtNYXJrIERpcnR5XQogICAgICAgIEUgLS0+IEZbQUNLIHRvIENsaWVudF0KICAgICAgICBGIC0tPiBHW0NoZWNrcG9pbnQgTGF0ZXJdCiAgICBlbmQKCiAgICBzdWJncmFwaCAiQ3Jhc2ggUmVjb3ZlcnkiCiAgICAgICAgSFvimqEgQ1JBU0gg4pqhXSAtLT4gSVtSZXN0YXJ0IERhdGFiYXNlXQogICAgICAgIEkgLS0+IEpbUGhhc2UgMTogQW5hbHlzaXNdCiAgICAgICAgSiAtLT4gS1tGaW5kIEFjdGl2ZSBUcmFuc2FjdGlvbnNdCiAgICAgICAgSyAtLT4gTFtQaGFzZSAyOiBSZWRvXQogICAgICAgIEwgLS0+IE1bUmVwbGF5IEFsbCBXQUwgZnJvbSBDaGVja3BvaW50XQogICAgICAgIE0gLS0+IE5bUGhhc2UgMzogVW5kb10KICAgICAgICBOIC0tPiBPW1JvbGxiYWNrIExvc2VyIFRyYW5zYWN0aW9uc10KICAgICAgICBPIC0tPiBQW0RhdGFiYXNlIENvbnNpc3RlbnRdCiAgICBlbmQKCiAgICBzdWJncmFwaCAiV0FMIFN0cnVjdHVyZSIKICAgICAgICBRW1dBTCBTZWdtZW50IDFdIC0tPiBSW1dBTCBTZWdtZW50IDJdCiAgICAgICAgUiAtLT4gU1tXQUwgU2VnbWVudCAzXQogICAgICAgIFRbQ2hlY2twb2ludCBSZWNvcmRdIC0uLT4gUQogICAgZW5kCgogICAgc3R5bGUgQyBmaWxsOiNmZmYzZTAsc3Ryb2tlOiNmNTdjMDAKICAgIHN0eWxlIEogZmlsbDojZTNmMmZkLHN0cm9rZTojMTk3NmQyCiAgICBzdHlsZSBMIGZpbGw6I2UzZjJmZCxzdHJva2U6IzE5NzZkMgogICAgc3R5bGUgTiBmaWxsOiNlM2YyZmQsc3Ryb2tlOiMxOTc2ZDIKICAgIHN0eWxlIFAgZmlsbDojZThmNWU5LHN0cm9rZTojMzg4ZTNj</code></pre><p><strong>关键要点：</strong></p><table><thead><tr><th>概念</th><th>为什么重要</th></tr></thead><tbody><tr><td><strong>WAL</strong></td><td>不牺牲性能的持久性</td></tr><tr><td><strong>LSN</strong></td><td>所有变更的总顺序</td></tr><tr><td><strong>检查点</strong></td><td>限制恢复时间</td></tr><tr><td><strong>ARIES 
Analysis</strong></td><td>确定需要恢复的内容</td></tr><tr><td><strong>ARIES Redo</strong></td><td>replay 到精确崩溃状态</td></tr><tr><td><strong>ARIES Undo</strong></td><td>回滚未提交的工作</td></tr><tr><td><strong>CLRs</strong></td><td>幂等的 undo，防止重复 undo</td></tr></tbody></table><hr /><p><strong>进一步阅读：</strong></p><ul><li>“ARIES: A Transaction Recovery Method Supporting Fine Granularity Locking” by Mohan et al. (1992)</li><li>PostgreSQL Source: <a href="https://github.com/postgres/postgres/blob/master/src/backend/access/transam/xlog.c"><code>src/backend/access/transam/xlog.c</code></a></li><li>PostgreSQL Source: <a href="https://github.com/postgres/postgres/blob/master/src/backend/access/transam/xlogfuncs.c"><code>src/backend/access/transam/xlogfuncs.c</code></a></li><li>“Database Management Systems” by Ramakrishnan (Ch. 17: Recovery)</li><li>“Readings in Database Systems” (Red Book) - ARIES chapter</li><li>Vaultgres Repository: <a href="https://github.com/neoalienson/Vaultgres">github.com/neoalienson/Vaultgres</a></li></ul><!-- commentbox plugin begins -->    <div class="commentbox"></div>    <script src="https://unpkg.com/commentbox.io/dist/commentBox.min.js"></script>    <script>commentBox('5765834504929280-proj')</script>    <!-- commentbox plugin ends -->    ]]></content>
    
    
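    <summary type="html">Part 4 of the Vaultgres journey: implementing write-ahead logging and the ARIES recovery algorithm. A deep dive into durability, checkpoints, and the three-phase recovery that lets a database come back from a crash.</summary>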
    
    
    
    <category term="Development" scheme="https://neo01.com/categories/Development/"/>
    
    
    <category term="PostgreSQL" scheme="https://neo01.com/tags/PostgreSQL/"/>
    
    <category term="Database Internals" scheme="https://neo01.com/tags/Database-Internals/"/>
    
    <category term="Rust" scheme="https://neo01.com/tags/Rust/"/>
    
  </entry>
  
  <entry>
    <title>Building a PostgreSQL-Compatible Database in Rust: WAL and ARIES Crash Recovery</title>
    <link href="https://neo01.com/zh-TW/2026/03/Database-Rust-WAL-Crash-Recovery-ARIES/"/>
    <id>https://neo01.com/zh-TW/2026/03/Database-Rust-WAL-Crash-Recovery-ARIES/</id>
    <published>2026-03-03T16:00:00.000Z</published>
    <updated>2026-03-14T03:05:57.271Z</updated>
    
    <content type="html"><![CDATA[<p>在 <a href="/zh-TW/2026/03/Database-Rust-MVCC-Transaction-Manager/">第三部分</a> 中，我們建構了用於併發交易的 MVCC。但有一個可怕的問題我們還沒有回答。</p><p><strong>停電時會發生什麼？</strong></p><pre class="language-none"><code class="language-none">Transaction: UPDATE accounts SET balance &#x3D; 1000 WHERE id &#x3D; 11. Read page into buffer pool2. Modify page in memory (balance &#x3D; 1000)3. Mark page as dirty4. ACK to client: &quot;Done!&quot;5. ⚡ POWER FAILURE ⚡6. Dirty page never written to disk7. Client&#39;s money: GONE 💸</code></pre><p>這就是資料庫使用 <strong>WAL：預寫日誌</strong> 的原因。</p><p>今天：在 Rust 中實作 WAL 和 ARIES 恢復演算法。這就是確保你的資料在崩潰、停電和核心恐慌中存活的程式碼。</p><hr /><h2 id="1-WAL-原則">1 WAL 原則</h2><h3 id="基本規則">基本規則</h3><p><strong>預寫日誌：</strong> 在修改任何資料頁面前，你<strong>必須</strong>將變更寫入 WAL。</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ WRONG - data modification before WAL</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">update</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> row_id<span class="token punctuation">:</span> <span class="token class-name">RowId</span><span class="token punctuation">,</span> new_data<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token keyword">u8</span><span class="token punctuation">]</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Error</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span 
class="token keyword">let</span> <span class="token keyword">mut</span> page <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>buffer_pool<span class="token punctuation">.</span><span class="token function">get_page</span><span class="token punctuation">(</span>row_id<span class="token punctuation">.</span>page_id<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    page<span class="token punctuation">.</span><span class="token function">write</span><span class="token punctuation">(</span>row_id<span class="token punctuation">.</span>offset<span class="token punctuation">,</span> new_data<span class="token punctuation">)</span><span class="token punctuation">;</span>  <span class="token comment">// Modified in memory!</span>    page<span class="token punctuation">.</span><span class="token function">mark_dirty</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// WAL comes too late</span>    <span class="token keyword">self</span><span class="token punctuation">.</span>wal<span class="token punctuation">.</span><span class="token function">log_update</span><span class="token punctuation">(</span>row_id<span class="token punctuation">,</span> new_data<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>  <span class="token comment">// Too late!</span>    <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">&#125;</span><span class="token comment">// ✅ CORRECT - WAL first</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition 
function">update</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> row_id<span class="token punctuation">:</span> <span class="token class-name">RowId</span><span class="token punctuation">,</span> new_data<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token keyword">u8</span><span class="token punctuation">]</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Error</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token comment">// 1. Generate LSN (Log Sequence Number)</span>    <span class="token keyword">let</span> lsn <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>wal<span class="token punctuation">.</span><span class="token function">log_update</span><span class="token punctuation">(</span>row_id<span class="token punctuation">,</span> new_data<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token comment">// 2. Flush WAL to disk (fsync)</span>    <span class="token keyword">self</span><span class="token punctuation">.</span>wal<span class="token punctuation">.</span><span class="token function">flush</span><span class="token punctuation">(</span>lsn<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token comment">// 3. 
NOW safe to modify page</span>    <span class="token keyword">let</span> <span class="token keyword">mut</span> page <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>buffer_pool<span class="token punctuation">.</span><span class="token function">get_page</span><span class="token punctuation">(</span>row_id<span class="token punctuation">.</span>page_id<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    page<span class="token punctuation">.</span><span class="token function">write</span><span class="token punctuation">(</span>row_id<span class="token punctuation">.</span>offset<span class="token punctuation">,</span> new_data<span class="token punctuation">)</span><span class="token punctuation">;</span>    page<span class="token punctuation">.</span><span class="token function">set_lsn</span><span class="token punctuation">(</span>lsn<span class="token punctuation">)</span><span class="token punctuation">;</span>  <span class="token comment">// Track which LSN modified this page</span>    page<span class="token punctuation">.</span><span class="token function">mark_dirty</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">&#125;</span></code></pre><p><strong>為什麼這樣有效：</strong></p><pre class="language-none"><code class="language-none">Crash at different points:After WAL write, before page modify:→ Recovery replays WAL, data is restored ✓After page modify, before flush:→ WAL on disk, recovery ensures durability ✓After flush to disk:→ Data is durable ✓</code></pre><hr /><h3 id="日誌序列號-LSN">日誌序列號 (LSN)</h3><p>每個 WAL 
記錄獲得一個獨特的、單調遞增的識別符：</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wal/lsn.rs</span><span class="token attribute attr-name">#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Lsn</span><span class="token punctuation">(</span><span class="token keyword">u64</span><span class="token punctuation">)</span><span class="token punctuation">;</span><span class="token keyword">impl</span> <span class="token class-name">Lsn</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">INVALID</span><span class="token punctuation">:</span> <span class="token class-name">Lsn</span> <span class="token operator">=</span> <span class="token class-name">Lsn</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token keyword">fn</span> <span class="token function-definition function">new</span><span class="token punctuation">(</span>value<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">Self</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token keyword">fn</span> <span class="token function-definition function">invalid</span><span class="token 
punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">bool</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token number">0</span> <span class="token operator">==</span> <span class="token number">0</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">// LSN encoding: segment_id (high 32) + offset (low 32)</span>    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token keyword">fn</span> <span class="token function-definition function">from_segment_offset</span><span class="token punctuation">(</span>segment<span class="token punctuation">:</span> <span class="token keyword">u32</span><span class="token punctuation">,</span> offset<span class="token punctuation">:</span> <span class="token keyword">u32</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">Self</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">(</span>segment <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">)</span> <span class="token operator">&lt;&lt;</span> <span class="token number">32</span><span class="token punctuation">)</span> <span class="token operator">|</span> <span class="token punctuation">(</span>offset <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">)</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">pub</span> <span class="token 
keyword">const</span> <span class="token keyword">fn</span> <span class="token function-definition function">segment</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">u32</span> <span class="token punctuation">&#123;</span>        <span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span><span class="token number">0</span> <span class="token operator">>></span> <span class="token number">32</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u32</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token keyword">fn</span> <span class="token function-definition function">offset</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">u32</span> <span class="token punctuation">&#123;</span>        <span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span><span class="token number">0</span> <span class="token operator">&amp;</span> <span class="token number">0xFFFFFFFF</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u32</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><p><strong>LSN 順序保證：</strong></p><pre class="language-none"><code class="language-none">LSN 100: BEGIN txn 1LSN 101: INSERT row A (txn 1)LSN 102: INSERT row B (txn 1)LSN 103: COMMIT txn 1LSN 104: BEGIN txn 2LSN 105: UPDATE row A (txn 
2)...LSN 100 &lt; LSN 101 &lt; LSN 102 &lt; ...  (strictly increasing)</code></pre><hr /><h2 id="2-WAL-記錄格式">2 WAL 記錄格式</h2><h3 id="記錄結構">記錄結構</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wal/record.rs</span><span class="token keyword">use</span> <span class="token keyword">crate</span><span class="token module-declaration namespace"><span class="token punctuation">::</span>storage<span class="token punctuation">::</span></span><span class="token class-name">PageId</span><span class="token punctuation">;</span><span class="token keyword">use</span> <span class="token keyword">crate</span><span class="token module-declaration namespace"><span class="token punctuation">::</span>transaction<span class="token punctuation">::</span></span><span class="token class-name">TransactionId</span><span class="token punctuation">;</span><span class="token attribute attr-name">#[derive(Debug, Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">WalRecord</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> lsn<span class="token punctuation">:</span> <span class="token class-name">Lsn</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> prev_lsn<span class="token punctuation">:</span> <span class="token class-name">Lsn</span><span class="token punctuation">,</span>  <span class="token comment">// Link to previous record from same transaction</span>    <span class="token keyword">pub</span> transaction_id<span class="token punctuation">:</span> <span class="token class-name">TransactionId</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> record_type<span class="token punctuation">:</span> <span class="token class-name">WalRecordType</span><span class="token punctuation">,</span>    <span class="token 
keyword">pub</span> page_id<span class="token punctuation">:</span> <span class="token class-name">PageId</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> offset<span class="token punctuation">:</span> <span class="token keyword">u16</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> data<span class="token punctuation">:</span> <span class="token class-name">WalData</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> checksum<span class="token punctuation">:</span> <span class="token keyword">u32</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">WalRecordType</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">Begin</span><span class="token punctuation">,</span>    <span class="token class-name">Insert</span><span class="token punctuation">,</span>    <span class="token class-name">Update</span> <span class="token punctuation">&#123;</span> before_image<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">></span><span class="token punctuation">,</span> after_image<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">></span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>    <span class="token class-name">Delete</span> <span class="token punctuation">&#123;</span> before_image<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token 
operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">></span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>    <span class="token class-name">Commit</span><span class="token punctuation">,</span>    <span class="token class-name">Abort</span><span class="token punctuation">,</span>    <span class="token class-name">CheckpointBegin</span><span class="token punctuation">,</span>    <span class="token class-name">CheckpointEnd</span><span class="token punctuation">,</span>    <span class="token class-name">Compensation</span> <span class="token punctuation">&#123;</span> undo_next_lsn<span class="token punctuation">:</span> <span class="token class-name">Lsn</span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>  <span class="token comment">// For aborts</span><span class="token punctuation">&#125;</span><span class="token attribute attr-name">#[derive(Debug, Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">WalData</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">PageImage</span><span class="token punctuation">(</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">></span><span class="token punctuation">)</span><span class="token punctuation">,</span>      <span class="token comment">// Full page image (for checkpoints)</span>    <span class="token class-name">RowData</span><span class="token punctuation">(</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">></span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token comment">// Row-level change</span>    <span class="token 
class-name">IndexEntry</span> <span class="token punctuation">&#123;</span> key<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">></span><span class="token punctuation">,</span> value<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">></span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Physical layout on disk:</strong></p><pre class="language-none"><code class="language-none">┌─────────────────────────────────────────────────────────────┐
│ WAL Segment File (e.g., 000000010000000000000001)           │
├─────────────────────────────────────────────────────────────┤
│ PageHeader (16 bytes)                                       │
├─────────────────────────────────────────────────────────────┤
│ Record 1:                                                   │
│   ├─ Length (4 bytes)                                       │
│   ├─ LSN (8 bytes)                                          │
│   ├─ Prev LSN (8 bytes)                                     │
│   ├─ Transaction ID (4 bytes)                               │
│   ├─ Record Type (1 byte)                                   │
│   ├─ Page ID (8 bytes)                                      │
│   ├─ Offset (2 bytes)                                       │
│   ├─ Data Length (4 bytes)                                  │
│   ├─ Data (variable)                                        │
│   └─ Checksum (4 bytes)                                     │
├─────────────────────────────────────────────────────────────┤
│ Record 2:                                                   │
│   ...                                                       │
└─────────────────────────────────────────────────────────────┘</code></pre><hr /><h3 id="實體-vs-邏輯-WAL">Physical vs. Logical WAL</h3><p><strong>PostgreSQL uses physical WAL</strong> (page-level changes):</p><table><thead><tr><th>Type</th><th>What is logged</th><th>Pros</th><th>Cons</th></tr></thead><tbody><tr><td><strong>Physical</strong></td><td>Byte ranges on a page</td><td>Simple replay, exact changes</td><td>Larger log, tied to the page format</td></tr><tr><td><strong>Logical</strong></td><td>SQL operations (INSERT/UPDATE)</td><td>Smaller log, format-independent</td><td>Complex replay, operations must be re-executed</td></tr></tbody></table><p><strong>Vaultgres approach:</strong> physical WAL, for simplicity (matching PostgreSQL):</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">PhysicalWalRecord</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> page_id<span class="token punctuation">:</span> <span class="token class-name">PageId</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> offset<span class="token punctuation">:</span> <span class="token keyword">u16</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> length<span class="token punctuation">:</span> <span class="token keyword">u16</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> before_image<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">>></span><span class="token punctuation">,</span>  <span class="token comment">// For undo</span>    <span class="token keyword">pub</span> after_image<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token 
operator">></span><span class="token punctuation">,</span>            <span class="token comment">// For redo</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="3-WAL-管理員實作">3 WAL Manager Implementation</h2><h3 id="寫入-WAL-記錄">Writing WAL Records</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wal/manager.rs</span><span class="token keyword">use</span> <span class="token namespace">std<span class="token punctuation">::</span>fs<span class="token punctuation">::</span></span><span class="token punctuation">&#123;</span><span class="token class-name">File</span><span class="token punctuation">,</span> <span class="token class-name">OpenOptions</span><span class="token punctuation">&#125;</span><span class="token punctuation">;</span><span class="token keyword">use</span> <span class="token namespace">std<span class="token punctuation">::</span>io<span class="token punctuation">::</span></span><span class="token punctuation">&#123;</span><span class="token class-name">Write</span><span class="token punctuation">,</span> <span class="token class-name">Seek</span><span class="token punctuation">,</span> <span class="token class-name">SeekFrom</span><span class="token punctuation">&#125;</span><span class="token punctuation">;</span><span class="token keyword">use</span> <span class="token namespace">std<span class="token punctuation">::</span>path<span class="token punctuation">::</span></span><span class="token class-name">PathBuf</span><span class="token punctuation">;</span><span class="token keyword">use</span> <span class="token namespace">std<span class="token punctuation">::</span>sync<span class="token punctuation">::</span>atomic<span class="token punctuation">::</span></span><span class="token punctuation">&#123;</span><span class="token class-name">AtomicU64</span><span class="token punctuation">,</span> <span class="token class-name">Ordering</span><span class="token punctuation">&#125;</span><span class="token punctuation">;</span><span class="token keyword">use</span> <span class="token namespace">parking_lot<span class="token punctuation">::</span></span><span class="token class-name">Mutex</span><span class="token punctuation">;</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">WalManager</span> <span class="token punctuation">&#123;</span>    wal_dir<span class="token 
punctuation">:</span> <span class="token class-name">PathBuf</span><span class="token punctuation">,</span>    current_segment<span class="token punctuation">:</span> <span class="token keyword">u32</span><span class="token punctuation">,</span>    current_file<span class="token punctuation">:</span> <span class="token class-name">Mutex</span><span class="token operator">&lt;</span><span class="token class-name">File</span><span class="token operator">></span><span class="token punctuation">,</span>    current_offset<span class="token punctuation">:</span> <span class="token class-name">Mutex</span><span class="token operator">&lt;</span><span class="token keyword">u32</span><span class="token operator">></span><span class="token punctuation">,</span>    flush_lsn<span class="token punctuation">:</span> <span class="token class-name">Mutex</span><span class="token operator">&lt;</span><span class="token class-name">Lsn</span><span class="token operator">></span><span class="token punctuation">,</span>    last_lsn<span class="token punctuation">:</span> <span class="token class-name">AtomicU64</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">WalManager</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">new</span><span class="token punctuation">(</span>wal_dir<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token keyword">Self</span><span class="token punctuation">,</span> <span class="token class-name">WalError</span><span class="token operator">></span> <span class="token 
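punctuation">&#123;</span>        <span class="token namespace">std<span class="token punctuation">::</span>fs<span class="token punctuation">::</span></span><span class="token function">create_dir_all</span><span class="token punctuation">(</span>wal_dir<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> manager <span class="token operator">=</span> <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>            wal_dir<span class="token punctuation">:</span> <span class="token class-name">PathBuf</span><span class="token punctuation">::</span><span class="token function">from</span><span class="token punctuation">(</span>wal_dir<span class="token punctuation">)</span><span class="token punctuation">,</span>            current_segment<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>            current_file<span class="token punctuation">:</span> <span class="token class-name">Mutex</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token class-name">OpenOptions</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">read</span><span class="token punctuation">(</span><span class="token boolean">true</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">write</span><span class="token punctuation">(</span><span class="token boolean">true</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">create</span><span class="token punctuation">(</span><span class="token boolean">true</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">open</span><span class="token punctuation">(</span><span class="token class-name">PathBuf</span><span class="token punctuation">::</span><span class="token function">from</span><span class="token punctuation">(</span>wal_dir<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">join</span><span class="token punctuation">(</span><span class="token macro property">format!</span><span class="token punctuation">(</span><span class="token string">"&#123;:024X&#125;"</span><span class="token punctuation">,</span> <span class="token number">0</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">)</span><span class="token punctuation">,</span>  <span class="token comment">// Open segment 0 without truncating (an empty path would fail)</span>            current_offset<span class="token punctuation">:</span> <span class="token class-name">Mutex</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span><span 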
class="token punctuation">,</span>            flush_lsn<span class="token punctuation">:</span> <span class="token class-name">Mutex</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token class-name">Lsn</span><span class="token punctuation">::</span><span class="token constant">INVALID</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            last_lsn<span class="token punctuation">:</span> <span class="token class-name">AtomicU64</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        manager<span class="token punctuation">.</span><span class="token function">open_or_create_segment</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span>manager<span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">open_or_create_segment</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">,</span> segment<span class="token punctuation">:</span> <span class="token keyword">u32</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token 
punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">WalError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> path <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>wal_dir<span class="token punctuation">.</span><span class="token function">join</span><span class="token punctuation">(</span><span class="token macro property">format!</span><span class="token punctuation">(</span><span class="token string">"&#123;:024X&#125;"</span><span class="token punctuation">,</span> segment<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> file <span class="token operator">=</span> <span class="token class-name">OpenOptions</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span>            <span class="token punctuation">.</span><span class="token function">read</span><span class="token punctuation">(</span><span class="token boolean">true</span><span class="token punctuation">)</span>            <span class="token punctuation">.</span><span class="token function">write</span><span class="token punctuation">(</span><span class="token boolean">true</span><span class="token punctuation">)</span>            <span class="token punctuation">.</span><span class="token function">create</span><span class="token punctuation">(</span><span class="token boolean">true</span><span class="token punctuation">)</span>            <span class="token punctuation">.</span><span class="token function">open</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>path<span class="token punctuation">)</span><span 
class="token operator">?</span><span class="token punctuation">;</span>        <span class="token operator">*</span><span class="token keyword">self</span><span class="token punctuation">.</span>current_file<span class="token punctuation">.</span><span class="token function">lock</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">=</span> file<span class="token punctuation">;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>current_segment <span class="token operator">=</span> segment<span class="token punctuation">;</span>        <span class="token operator">*</span><span class="token keyword">self</span><span class="token punctuation">.</span>current_offset<span class="token punctuation">.</span><span class="token function">lock</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">=</span> <span class="token number">0</span><span class="token punctuation">;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">append</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> record<span class="token punctuation">:</span> <span class="token class-name">WalRecord</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Lsn</span><span class="token punctuation">,</span> <span class="token 
class-name">WalError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Assign new LSN</span>        <span class="token keyword">let</span> lsn <span class="token operator">=</span> <span class="token class-name">Lsn</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>last_lsn<span class="token punctuation">.</span><span class="token function">fetch_add</span><span class="token punctuation">(</span><span class="token number">1</span><span class="token punctuation">,</span> <span class="token class-name">Ordering</span><span class="token punctuation">::</span><span class="token class-name">SeqCst</span><span class="token punctuation">)</span> <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token comment">// Serialize record</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> buffer <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">serialize_record</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>record<span class="token punctuation">,</span> lsn<span class="token punctuation">,</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> buffer<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span 
class="token comment">// Check if we need a new segment</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> offset <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>current_offset<span class="token punctuation">.</span><span class="token function">lock</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">if</span> <span class="token operator">*</span>offset <span class="token operator">+</span> buffer<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u32</span> <span class="token operator">></span> <span class="token constant">SEGMENT_SIZE</span> <span class="token punctuation">&#123;</span>            <span class="token function">drop</span><span class="token punctuation">(</span>offset<span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">rotate_segment</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>            offset <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>current_offset<span class="token punctuation">.</span><span class="token function">lock</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token comment">// Write to file</span>        <span class="token keyword">let</span> <span class="token 
keyword">mut</span> file <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>current_file<span class="token punctuation">.</span><span class="token function">lock</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        file<span class="token punctuation">.</span><span class="token function">seek</span><span class="token punctuation">(</span><span class="token class-name">SeekFrom</span><span class="token punctuation">::</span><span class="token class-name">Start</span><span class="token punctuation">(</span><span class="token operator">*</span>offset <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        file<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>buffer<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token operator">*</span>offset <span class="token operator">+=</span> buffer<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u32</span><span class="token punctuation">;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span>lsn<span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">flush</span><span class="token punctuation">(</span><span class="token 
operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> target_lsn<span class="token punctuation">:</span> <span class="token class-name">Lsn</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">WalError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> flush_lsn <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>flush_lsn<span class="token punctuation">.</span><span class="token function">lock</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token comment">// Already flushed?</span>        <span class="token keyword">if</span> <span class="token operator">*</span>flush_lsn <span class="token operator">>=</span> target_lsn <span class="token punctuation">&#123;</span>            <span class="token keyword">return</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token comment">// Sync to disk</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>current_file<span class="token punctuation">.</span><span class="token function">lock</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span 
class="token function">sync_all</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token operator">*</span>flush_lsn <span class="token operator">=</span> target_lsn<span class="token punctuation">;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="讀取-WAL-記錄">Reading WAL Records</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token keyword">impl</span> <span class="token class-name">WalManager</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">read_from</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> start_lsn<span class="token punctuation">:</span> <span class="token class-name">Lsn</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">WalIterator</span><span class="token punctuation">,</span> <span class="token class-name">WalError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> segment <span class="token operator">=</span> start_lsn<span class="token punctuation">.</span><span class="token function">segment</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        
<span class="token keyword">let</span> offset <span class="token operator">=</span> start_lsn<span class="token punctuation">.</span><span class="token function">offset</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> path <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>wal_dir<span class="token punctuation">.</span><span class="token function">join</span><span class="token punctuation">(</span><span class="token macro property">format!</span><span class="token punctuation">(</span><span class="token string">"&#123;:024X&#125;"</span><span class="token punctuation">,</span> segment<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> file <span class="token operator">=</span> <span class="token class-name">File</span><span class="token punctuation">::</span><span class="token function">open</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>path<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">WalIterator</span> <span class="token punctuation">&#123;</span>            current_file<span class="token punctuation">:</span> file<span class="token punctuation">,</span>            current_segment<span class="token punctuation">:</span> segment<span class="token punctuation">,</span>            current_offset<span class="token punctuation">:</span> offset<span class="token punctuation">,</span>            wal_dir<span class="token punctuation">:</span> <span class="token keyword">self</span><span class="token punctuation">.</span>wal_dir<span 
class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">WalIterator</span> <span class="token punctuation">&#123;</span>    current_file<span class="token punctuation">:</span> <span class="token class-name">File</span><span class="token punctuation">,</span>    current_segment<span class="token punctuation">:</span> <span class="token keyword">u32</span><span class="token punctuation">,</span>    current_offset<span class="token punctuation">:</span> <span class="token keyword">u32</span><span class="token punctuation">,</span>    wal_dir<span class="token punctuation">:</span> <span class="token class-name">PathBuf</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">Iterator</span> <span class="token keyword">for</span> <span class="token class-name">WalIterator</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">type</span> <span class="token type-definition class-name">Item</span> <span class="token operator">=</span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">WalRecord</span><span class="token punctuation">,</span> <span class="token class-name">WalError</span><span class="token operator">></span><span class="token punctuation">;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">next</span><span class="token punctuation">(</span><span class="token 
operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token keyword">Self</span><span class="token punctuation">::</span><span class="token class-name">Item</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Try to read record at current position</span>        <span class="token keyword">match</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">read_record</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span>record<span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Ok</span><span class="token punctuation">(</span>record<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">None</span><span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token comment">// End of segment, try next segment</span>                <span class="token keyword">self</span><span class="token punctuation">.</span>current_segment <span class="token operator">+=</span> <span class="token number">1</span><span class="token 
punctuation">;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span>current_offset <span class="token operator">=</span> <span class="token number">0</span><span class="token punctuation">;</span>                <span class="token keyword">let</span> path <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>wal_dir<span class="token punctuation">.</span><span class="token function">join</span><span class="token punctuation">(</span><span class="token macro property">format!</span><span class="token punctuation">(</span><span class="token string">"&#123;:024X&#125;"</span><span class="token punctuation">,</span> <span class="token keyword">self</span><span class="token punctuation">.</span>current_segment<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span>current_file <span class="token operator">=</span> <span class="token class-name">File</span><span class="token punctuation">::</span><span class="token function">open</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>path<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">ok</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">read_record</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">transpose</span><span class="token punctuation">(</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Err</span><span class="token punctuation">(</span>e<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Err</span><span class="token punctuation">(</span>e<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="4-檢查點：限制恢復時間">4 Checkpoints: Bounding Recovery Time</h2><h3 id="問題：無邊界的-replay">The Problem: Unbounded Replay</h3><p><strong>Without checkpoints:</strong></p><pre class="language-none"><code class="language-none">Day 1: Database created
Day 30: Crash!
Recovery: Replay 30 days of WAL records 😱</code></pre><p><strong>Solution:</strong> periodic checkpoints.</p><hr /><h3 id="檢查點流程">Checkpoint Process</h3><pre class="language-MERMAID_BASE64_614" data-language="MERMAID_BASE64_614"><code 
class="language-MERMAID_BASE64_614">c2VxdWVuY2VEaWFncmFtCiAgICBwYXJ0aWNpcGFudCBEQiBhcyBEYXRhYmFzZQogICAgcGFydGljaXBhbnQgV0FMIGFzIFdBTCBNYW5hZ2VyCiAgICBwYXJ0aWNpcGFudCBCUCBhcyBCdWZmZXIgUG9vbAogICAgcGFydGljaXBhbnQgQ0tQVCBhcyBDaGVja3BvaW50IEZpbGUKCiAgICBEQi0+PkRCOiAxLiBXcml0ZSBDSEVDS1BPSU5UX0JFR0lOIHJlY29yZAogICAgREItPj5CUDogMi4gRmx1c2ggYWxsIGRpcnR5IHBhZ2VzCiAgICBCUC0tPj5EQjogUGFnZXMgd3JpdHRlbiB0byBkaXNrCiAgICBEQi0+PkRCOiAzLiBXcml0ZSBDSEVDS1BPSU5UX0VORCByZWNvcmQKICAgIERCLT4+V0FMOiA0LiBGbHVzaCBXQUwgdG8gZGlzawogICAgREItPj5DS1BUOiA1LiBTYXZlIGNoZWNrcG9pbnQgTFNOCiAgICBDS1BULS0+PkRCOiBDaGVja3BvaW50IGNvbXBsZXRl</code></pre><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wal/checkpoint.rs</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Checkpoint</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> checkpoint_lsn<span class="token punctuation">:</span> <span class="token class-name">Lsn</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> active_transactions<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">TransactionId</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> dirty_pages<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token class-name">PageId</span><span class="token punctuation">,</span> <span class="token class-name">Lsn</span><span class="token punctuation">)</span><span class="token operator">></span><span class="token punctuation">,</span>  <span class="token comment">// page_id → page_lsn</span><span class="token 
punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">WalManager</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">create_checkpoint</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> buffer_pool<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">BufferPool</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Checkpoint</span><span class="token punctuation">,</span> <span class="token class-name">WalError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// 1. Log checkpoint begin</span>        <span class="token keyword">let</span> begin_lsn <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">append</span><span class="token punctuation">(</span><span class="token class-name">WalRecord</span><span class="token punctuation">::</span><span class="token function">checkpoint_begin</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token comment">// 2. 
Flush all dirty pages (this is the expensive part!)</span>        buffer_pool<span class="token punctuation">.</span><span class="token function">flush_all</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token comment">// 3. Get current state</span>        <span class="token keyword">let</span> checkpoint <span class="token operator">=</span> <span class="token class-name">Checkpoint</span> <span class="token punctuation">&#123;</span>            checkpoint_lsn<span class="token punctuation">:</span> begin_lsn<span class="token punctuation">,</span>            active_transactions<span class="token punctuation">:</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">get_active_transactions</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            dirty_pages<span class="token punctuation">:</span> buffer_pool<span class="token punctuation">.</span><span class="token function">get_dirty_pages</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token comment">// 4. 
Log checkpoint end</span>        <span class="token keyword">let</span> end_lsn <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">append</span><span class="token punctuation">(</span><span class="token class-name">WalRecord</span><span class="token punctuation">::</span><span class="token function">checkpoint_end</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>checkpoint<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token comment">// 5. Flush WAL</span>        <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">flush</span><span class="token punctuation">(</span>end_lsn<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token comment">// 6. 
Save checkpoint to known location</span>        <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">save_checkpoint_record</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>checkpoint<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span>checkpoint<span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="Fuzzy-檢查點-PostgreSQL-風格">Fuzzy Checkpoints (PostgreSQL Style)</h3><p><strong>Sharp checkpoint:</strong> Blocks all writes for the duration of the checkpoint. Simple, but causes stalls.</p><p><strong>Fuzzy checkpoint:</strong> Lets writes continue while the checkpoint runs. More complex, but no stalls.</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// Fuzzy checkpoint approach</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">create_fuzzy_checkpoint</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> buffer_pool<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">BufferPool</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Checkpoint</span><span class="token punctuation">,</span> <span class="token class-name">WalError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token comment">// 1. 
Record checkpoint start LSN</span>    <span class="token keyword">let</span> start_lsn <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">current_lsn</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// 2. Write CHECKPOINT_BEGIN (don't block)</span>    <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">append</span><span class="token punctuation">(</span><span class="token class-name">WalRecord</span><span class="token punctuation">::</span><span class="token function">checkpoint_begin</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token comment">// 3. Get list of dirty pages (snapshot)</span>    <span class="token keyword">let</span> dirty_pages <span class="token operator">=</span> buffer_pool<span class="token punctuation">.</span><span class="token function">get_dirty_pages_snapshot</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// 4. 
Flush dirty pages in background (don't block new writes)</span>    <span class="token keyword">let</span> checkpoint_lsn <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">current_lsn</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token keyword">for</span> <span class="token punctuation">(</span>page_id<span class="token punctuation">,</span> page_lsn<span class="token punctuation">)</span> <span class="token keyword">in</span> dirty_pages <span class="token punctuation">&#123;</span>        <span class="token keyword">if</span> page_lsn <span class="token operator">&lt;</span> checkpoint_lsn <span class="token punctuation">&#123;</span>            buffer_pool<span class="token punctuation">.</span><span class="token function">flush_page</span><span class="token punctuation">(</span>page_id<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">// 5. 
Write CHECKPOINT_END with final LSN</span>    <span class="token keyword">let</span> end_lsn <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">current_lsn</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">append</span><span class="token punctuation">(</span><span class="token class-name">WalRecord</span><span class="token punctuation">::</span><span class="token function">checkpoint_end_at</span><span class="token punctuation">(</span>end_lsn<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">flush</span><span class="token punctuation">(</span>end_lsn<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Checkpoint</span> <span class="token punctuation">&#123;</span>        checkpoint_lsn<span class="token punctuation">:</span> start_lsn<span class="token punctuation">,</span>  <span class="token comment">// redo point: writes made while the checkpoint ran must still be replayed</span>        <span class="token comment">// ...</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="5-ARIES：恢復演算法">5 ARIES: The Recovery Algorithm</h2><h3 id="三個階段">The Three Phases</h3><p>ARIES = <strong>A</strong>lgorithm for <strong>R</strong>ecovery and <strong>I</strong>solation <strong>E</strong>xploiting <strong>S</strong>emantics</p><pre class="language-none"><code 
class="language-none">┌─────────────────────────────────────────────────────────────┐
│                    ARIES Recovery                           │
├─────────────────────────────────────────────────────────────┤
│ Phase 1: ANALYSIS                                           │
│ - Scan WAL from last checkpoint                             │
│ - Determine which transactions were active at crash         │
│ - Build dirty page table                                    │
├─────────────────────────────────────────────────────────────┤
│ Phase 2: REDO                                               │
│ - Replay ALL logged changes from redo_start_lsn             │
│ - Bring database to exact crash state                       │
│ - Skip pages already on disk (using page LSN)               │
├─────────────────────────────────────────────────────────────┤
│ Phase 3: UNDO                                               │
│ - Roll back all uncommitted transactions                    │
│ - Write Compensation Log Records (CLRs)                     │
│ - Database is now consistent                                │
└─────────────────────────────────────────────────────────────┘</code></pre><hr /><h3 id="第一階段：Analysis">Phase 1: Analysis</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/recovery/analysis.rs</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">AnalysisResult</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> transactions_at_crash<span class="token punctuation">:</span> <span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token class-name">TransactionId</span><span class="token punctuation">,</span> <span class="token class-name">TxnStatus</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> 
dirty_page_table<span class="token punctuation">:</span> <span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token class-name">PageId</span><span class="token punctuation">,</span> <span class="token class-name">Lsn</span><span class="token operator">></span><span class="token punctuation">,</span>  <span class="token comment">// page → first LSN that dirtied it</span>    <span class="token keyword">pub</span> redo_start_lsn<span class="token punctuation">:</span> <span class="token class-name">Lsn</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">analyze</span><span class="token punctuation">(</span>wal<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">WalManager</span><span class="token punctuation">,</span> checkpoint<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Checkpoint</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">AnalysisResult</span><span class="token punctuation">,</span> <span class="token class-name">RecoveryError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token keyword">let</span> <span class="token keyword">mut</span> txn_status<span class="token punctuation">:</span> <span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token class-name">TransactionId</span><span class="token punctuation">,</span> <span class="token class-name">TxnStatus</span><span class="token operator">></span> <span class="token operator">=</span> <span class="token 
class-name">HashMap</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token keyword">let</span> <span class="token keyword">mut</span> dirty_page_table<span class="token punctuation">:</span> <span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token class-name">PageId</span><span class="token punctuation">,</span> <span class="token class-name">Lsn</span><span class="token operator">></span> <span class="token operator">=</span> <span class="token class-name">HashMap</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// Initialize from checkpoint</span>    <span class="token keyword">for</span> txn <span class="token keyword">in</span> <span class="token operator">&amp;</span>checkpoint<span class="token punctuation">.</span>active_transactions <span class="token punctuation">&#123;</span>        txn_status<span class="token punctuation">.</span><span class="token function">insert</span><span class="token punctuation">(</span><span class="token operator">*</span>txn<span class="token punctuation">,</span> <span class="token class-name">TxnStatus</span><span class="token punctuation">::</span><span class="token class-name">Active</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">for</span> <span class="token punctuation">(</span>page_id<span class="token punctuation">,</span> page_lsn<span class="token punctuation">)</span> <span class="token keyword">in</span> <span class="token operator">&amp;</span>checkpoint<span class="token 
punctuation">.</span>dirty_pages <span class="token punctuation">&#123;</span>        dirty_page_table<span class="token punctuation">.</span><span class="token function">insert</span><span class="token punctuation">(</span><span class="token operator">*</span>page_id<span class="token punctuation">,</span> <span class="token operator">*</span>page_lsn<span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">// Scan WAL from checkpoint</span>    <span class="token keyword">let</span> <span class="token keyword">mut</span> iterator <span class="token operator">=</span> wal<span class="token punctuation">.</span><span class="token function">read_from</span><span class="token punctuation">(</span>checkpoint<span class="token punctuation">.</span>checkpoint_lsn<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token keyword">for</span> record_result <span class="token keyword">in</span> iterator <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> record <span class="token operator">=</span> record_result<span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token keyword">match</span> record<span class="token punctuation">.</span>record_type <span class="token punctuation">&#123;</span>            <span class="token class-name">WalRecordType</span><span class="token punctuation">::</span><span class="token class-name">Begin</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                txn_status<span class="token punctuation">.</span><span class="token function">insert</span><span class="token punctuation">(</span>record<span class="token punctuation">.</span>transaction_id<span class="token punctuation">,</span> <span class="token 
class-name">TxnStatus</span><span class="token punctuation">::</span><span class="token class-name">Active</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">WalRecordType</span><span class="token punctuation">::</span><span class="token class-name">Commit</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                txn_status<span class="token punctuation">.</span><span class="token function">insert</span><span class="token punctuation">(</span>record<span class="token punctuation">.</span>transaction_id<span class="token punctuation">,</span> <span class="token class-name">TxnStatus</span><span class="token punctuation">::</span><span class="token class-name">Committed</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">WalRecordType</span><span class="token punctuation">::</span><span class="token class-name">Abort</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                txn_status<span class="token punctuation">.</span><span class="token function">insert</span><span class="token punctuation">(</span>record<span class="token punctuation">.</span>transaction_id<span class="token punctuation">,</span> <span class="token class-name">TxnStatus</span><span class="token punctuation">::</span><span class="token class-name">Aborted</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">WalRecordType</span><span class="token punctuation">::</span><span class="token class-name">Insert</span> <span class="token operator">|</span> <span class="token class-name">WalRecordType</span><span 
class="token punctuation">::</span><span class="token class-name">Update</span> <span class="token operator">|</span> <span class="token class-name">WalRecordType</span><span class="token punctuation">::</span><span class="token class-name">Delete</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token comment">// Track first LSN that dirtied each page</span>                dirty_page_table<span class="token punctuation">.</span><span class="token function">entry</span><span class="token punctuation">(</span>record<span class="token punctuation">.</span>page_id<span class="token punctuation">)</span>                    <span class="token punctuation">.</span><span class="token function">or_insert</span><span class="token punctuation">(</span>record<span class="token punctuation">.</span>lsn<span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>            _ <span class="token operator">=></span> <span class="token punctuation">&#123;</span><span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">// Find minimum redo LSN</span>    <span class="token keyword">let</span> redo_start_lsn <span class="token operator">=</span> dirty_page_table<span class="token punctuation">.</span><span class="token function">values</span><span class="token punctuation">(</span><span class="token punctuation">)</span>        <span class="token punctuation">.</span><span class="token function">min</span><span class="token punctuation">(</span><span class="token punctuation">)</span>        <span class="token punctuation">.</span><span class="token function">copied</span><span class="token punctuation">(</span><span class="token punctuation">)</span>        <span class="token punctuation">.</span><span class="token 
function">unwrap_or</span><span class="token punctuation">(</span>checkpoint<span class="token punctuation">.</span>checkpoint_lsn<span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">AnalysisResult</span> <span class="token punctuation">&#123;</span>        transactions_at_crash<span class="token punctuation">:</span> txn_status<span class="token punctuation">,</span>        dirty_page_table<span class="token punctuation">,</span>        redo_start_lsn<span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="第二階段：Redo">Phase 2: Redo</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/recovery/redo.rs</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">redo</span><span class="token punctuation">(</span>    wal<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">WalManager</span><span class="token punctuation">,</span>    buffer_pool<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">BufferPool</span><span class="token punctuation">,</span>    analysis<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">AnalysisResult</span><span class="token punctuation">,</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token 
class-name">RecoveryError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token keyword">let</span> <span class="token keyword">mut</span> iterator <span class="token operator">=</span> wal<span class="token punctuation">.</span><span class="token function">read_from</span><span class="token punctuation">(</span>analysis<span class="token punctuation">.</span>redo_start_lsn<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token keyword">for</span> record_result <span class="token keyword">in</span> iterator <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> record <span class="token operator">=</span> record_result<span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token comment">// Repeat history: redo EVERY logged change, committed or not</span>        <span class="token comment">// (uncommitted transactions are rolled back later, in the undo phase)</span>        <span class="token keyword">match</span> <span class="token operator">&amp;</span>record<span class="token punctuation">.</span>record_type <span class="token punctuation">&#123;</span>            <span class="token class-name">WalRecordType</span><span class="token punctuation">::</span><span class="token class-name">Insert</span> <span class="token operator">|</span> <span class="token class-name">WalRecordType</span><span class="token punctuation">::</span><span class="token class-name">Update</span> <span class="token operator">|</span> <span class="token class-name">WalRecordType</span><span class="token punctuation">::</span><span class="token class-name">Delete</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token comment">// Check if page needs redo</span>                <span class="token keyword">let</span> page <span class="token 
operator">=</span> buffer_pool<span class="token punctuation">.</span><span class="token function">get_page</span><span class="token punctuation">(</span>record<span class="token punctuation">.</span>page_id<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token keyword">let</span> page_lsn <span class="token operator">=</span> page<span class="token punctuation">.</span><span class="token function">get_lsn</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// Only apply if page is older than this LSN</span>                <span class="token keyword">if</span> page_lsn <span class="token operator">&lt;</span> record<span class="token punctuation">.</span>lsn <span class="token punctuation">&#123;</span>                    <span class="token function">apply_redo</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>record<span class="token punctuation">,</span> <span class="token operator">&amp;</span>page<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                    page<span class="token punctuation">.</span><span class="token function">set_lsn</span><span class="token punctuation">(</span>record<span class="token punctuation">.</span>lsn<span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token punctuation">&#125;</span>                <span class="token comment">// Else: page already has this change (written before crash)</span>            <span class="token punctuation">&#125;</span>            _ <span class="token operator">=></span> <span class="token punctuation">&#123;</span><span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>    <span 
class="token punctuation">&#125;</span>    <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">&#125;</span><span class="token keyword">fn</span> <span class="token function-definition function">apply_redo</span><span class="token punctuation">(</span>record<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">WalRecord</span><span class="token punctuation">,</span> page<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Page</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">RecoveryError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token keyword">match</span> <span class="token operator">&amp;</span>record<span class="token punctuation">.</span>data <span class="token punctuation">&#123;</span>        <span class="token class-name">WalData</span><span class="token punctuation">::</span><span class="token class-name">PageImage</span><span class="token punctuation">(</span>image<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>            <span class="token comment">// Full page image (from checkpoint)</span>            page<span class="token punctuation">.</span><span class="token function">copy_from_slice</span><span class="token punctuation">(</span>image<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token 
punctuation">&#125;</span>        <span class="token class-name">WalData</span><span class="token punctuation">::</span><span class="token class-name">RowData</span><span class="token punctuation">(</span>data<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>            <span class="token comment">// Partial page update</span>            page<span class="token punctuation">.</span><span class="token function">write</span><span class="token punctuation">(</span>record<span class="token punctuation">.</span>offset<span class="token punctuation">,</span> data<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        _ <span class="token operator">=></span> <span class="token punctuation">&#123;</span><span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Key insight:</strong> Redo is <strong>idempotent</strong>. Because each page stores the LSN of the last change applied to it, records the page already reflects are skipped, so running redo any number of times produces the same result.</p><hr /><h3 id="第三階段：Undo">Phase 3: Undo</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/recovery/undo.rs</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">undo</span><span class="token punctuation">(</span>    wal<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">WalManager</span><span class="token punctuation">,</span>    buffer_pool<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token 
class-name">BufferPool</span><span class="token punctuation">,</span>    analysis<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">AnalysisResult</span><span class="token punctuation">,</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">RecoveryError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token comment">// Find transactions to undo (active at crash, not committed)</span>    <span class="token keyword">let</span> losers<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">TransactionId</span><span class="token operator">></span> <span class="token operator">=</span> analysis<span class="token punctuation">.</span>transactions_at_crash        <span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span>        <span class="token punctuation">.</span><span class="token function">filter</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span><span class="token punctuation">(</span>_<span class="token punctuation">,</span> status<span class="token punctuation">)</span><span class="token closure-punctuation punctuation">|</span></span> <span class="token operator">*</span><span class="token operator">*</span>status <span class="token operator">==</span> <span class="token class-name">TxnStatus</span><span class="token punctuation">::</span><span class="token class-name">Active</span><span 
class="token punctuation">)</span>        <span class="token punctuation">.</span><span class="token function">map</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span><span class="token punctuation">(</span>txn<span class="token punctuation">,</span> _<span class="token punctuation">)</span><span class="token closure-punctuation punctuation">|</span></span> <span class="token operator">*</span>txn<span class="token punctuation">)</span>        <span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// Undo loser transactions in reverse order (LIFO: last in, first out)</span>    <span class="token keyword">for</span> txn_id <span class="token keyword">in</span> losers<span class="token punctuation">.</span><span class="token function">into_iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">rev</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>        <span class="token function">undo_transaction</span><span class="token punctuation">(</span>wal<span class="token punctuation">,</span> buffer_pool<span class="token punctuation">,</span> txn_id<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token punctuation">&#125;</span>    <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">&#125;</span><span class="token keyword">fn</span> <span class="token function-definition 
function">undo_transaction</span><span class="token punctuation">(</span>    wal<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">WalManager</span><span class="token punctuation">,</span>    buffer_pool<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">BufferPool</span><span class="token punctuation">,</span>    txn_id<span class="token punctuation">:</span> <span class="token class-name">TransactionId</span><span class="token punctuation">,</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">RecoveryError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token comment">// Find all records for this transaction (in reverse)</span>    <span class="token keyword">let</span> records <span class="token operator">=</span> wal<span class="token punctuation">.</span><span class="token function">get_transaction_records</span><span class="token punctuation">(</span>txn_id<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token keyword">for</span> record <span class="token keyword">in</span> records<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">rev</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Write 
Compensation Log Record (CLR)</span>        <span class="token keyword">let</span> clr <span class="token operator">=</span> <span class="token class-name">WalRecord</span> <span class="token punctuation">&#123;</span>            lsn<span class="token punctuation">:</span> wal<span class="token punctuation">.</span><span class="token function">next_lsn</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            prev_lsn<span class="token punctuation">:</span> record<span class="token punctuation">.</span>lsn<span class="token punctuation">,</span>            transaction_id<span class="token punctuation">:</span> txn_id<span class="token punctuation">,</span>            record_type<span class="token punctuation">:</span> <span class="token class-name">WalRecordType</span><span class="token punctuation">::</span><span class="token class-name">Compensation</span> <span class="token punctuation">&#123;</span>                undo_next_lsn<span class="token punctuation">:</span> record<span class="token punctuation">.</span>prev_lsn<span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>            page_id<span class="token punctuation">:</span> record<span class="token punctuation">.</span>page_id<span class="token punctuation">,</span>            offset<span class="token punctuation">:</span> record<span class="token punctuation">.</span>offset<span class="token punctuation">,</span>            data<span class="token punctuation">:</span> <span class="token class-name">WalData</span><span class="token punctuation">::</span><span class="token class-name">RowData</span><span class="token punctuation">(</span>record<span class="token punctuation">.</span>before_image<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token 
punctuation">)</span><span class="token punctuation">.</span><span class="token function">unwrap_or_default</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            checksum<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>  <span class="token comment">// Placeholder: compute the real checksum before appending</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> clr_lsn <span class="token operator">=</span> wal<span class="token punctuation">.</span><span class="token function">append</span><span class="token punctuation">(</span>clr<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token comment">// Apply undo (restore before_image)</span>        <span class="token keyword">if</span> <span class="token keyword">let</span> <span class="token class-name">Some</span><span class="token punctuation">(</span>before_image<span class="token punctuation">)</span> <span class="token operator">=</span> <span class="token operator">&amp;</span>record<span class="token punctuation">.</span>before_image <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> page <span class="token operator">=</span> buffer_pool<span class="token punctuation">.</span><span class="token function">get_page</span><span class="token punctuation">(</span>record<span class="token punctuation">.</span>page_id<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>            page<span class="token punctuation">.</span><span class="token function">write</span><span class="token punctuation">(</span>record<span class="token punctuation">.</span>offset<span class="token 
punctuation">,</span> before_image<span class="token punctuation">)</span><span class="token punctuation">;</span>            page<span class="token punctuation">.</span><span class="token function">set_lsn</span><span class="token punctuation">(</span>clr_lsn<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">// Log transaction abort</span>    wal<span class="token punctuation">.</span><span class="token function">append</span><span class="token punctuation">(</span><span class="token class-name">WalRecord</span><span class="token punctuation">::</span><span class="token function">abort</span><span class="token punctuation">(</span>txn_id<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    wal<span class="token punctuation">.</span><span class="token function">flush_all</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="完整的恢復流程">The Complete Recovery Flow</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/recovery/mod.rs</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">recover</span><span class="token punctuation">(</span>    wal<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token 
class-name">WalManager</span><span class="token punctuation">,</span>    buffer_pool<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">BufferPool</span><span class="token punctuation">,</span>    data_dir<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">,</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">RecoveryStats</span><span class="token punctuation">,</span> <span class="token class-name">RecoveryError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token comment">// 1. Find last checkpoint</span>    <span class="token keyword">let</span> checkpoint <span class="token operator">=</span> <span class="token function">find_last_checkpoint</span><span class="token punctuation">(</span>wal<span class="token punctuation">,</span> data_dir<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token macro property">println!</span><span class="token punctuation">(</span><span class="token string">"Starting recovery from checkpoint LSN &#123;&#125;"</span><span class="token punctuation">,</span> checkpoint<span class="token punctuation">.</span>checkpoint_lsn<span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// 2. 
Phase 1: Analysis</span>    <span class="token macro property">println!</span><span class="token punctuation">(</span><span class="token string">"Phase 1: Analysis..."</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token keyword">let</span> analysis <span class="token operator">=</span> <span class="token function">analyze</span><span class="token punctuation">(</span>wal<span class="token punctuation">,</span> <span class="token operator">&amp;</span>checkpoint<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token macro property">println!</span><span class="token punctuation">(</span><span class="token string">"  Found &#123;&#125; active transactions at crash"</span><span class="token punctuation">,</span>             analysis<span class="token punctuation">.</span>transactions_at_crash<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token macro property">println!</span><span class="token punctuation">(</span><span class="token string">"  Redo will start from LSN &#123;&#125;"</span><span class="token punctuation">,</span> analysis<span class="token punctuation">.</span>redo_start_lsn<span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// 3. 
Phase 2: Redo</span>    <span class="token macro property">println!</span><span class="token punctuation">(</span><span class="token string">"Phase 2: Redo..."</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token function">redo</span><span class="token punctuation">(</span>wal<span class="token punctuation">,</span> buffer_pool<span class="token punctuation">,</span> <span class="token operator">&amp;</span>analysis<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token macro property">println!</span><span class="token punctuation">(</span><span class="token string">"  Redo complete"</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// 4. Phase 3: Undo</span>    <span class="token macro property">println!</span><span class="token punctuation">(</span><span class="token string">"Phase 3: Undo..."</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token function">undo</span><span class="token punctuation">(</span>wal<span class="token punctuation">,</span> buffer_pool<span class="token punctuation">,</span> <span class="token operator">&amp;</span>analysis<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token macro property">println!</span><span class="token punctuation">(</span><span class="token string">"  Undo complete"</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// 5. 
Truncate old WAL (optional)</span>    <span class="token function">truncate_wal_before</span><span class="token punctuation">(</span>wal<span class="token punctuation">,</span> checkpoint<span class="token punctuation">.</span>checkpoint_lsn<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">RecoveryStats</span> <span class="token punctuation">&#123;</span>        checkpoint_lsn<span class="token punctuation">:</span> checkpoint<span class="token punctuation">.</span>checkpoint_lsn<span class="token punctuation">,</span>        redo_start_lsn<span class="token punctuation">:</span> analysis<span class="token punctuation">.</span>redo_start_lsn<span class="token punctuation">,</span>        transactions_aborted<span class="token punctuation">:</span> analysis<span class="token punctuation">.</span>transactions_at_crash            <span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span>            <span class="token punctuation">.</span><span class="token function">filter</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span><span class="token punctuation">(</span>_<span class="token punctuation">,</span> s<span class="token punctuation">)</span><span class="token closure-punctuation punctuation">|</span></span> <span class="token operator">*</span><span class="token operator">*</span>s <span class="token operator">==</span> <span class="token class-name">TxnStatus</span><span class="token punctuation">::</span><span class="token class-name">Active</span><span class="token punctuation">)</span>            <span class="token punctuation">.</span><span class="token function">count</span><span class="token 
punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="6-恢復範例：逐步說明">6 Recovery Example: Step by Step</h2><h3 id="崩潰情境">Crash Scenario</h3><pre class="language-none"><code class="language-none">Time    LSN    Transaction    Action─────────────────────────────────────────────────────10:00   100    CKPT           Checkpoint created10:01   101    T1 (xid&#x3D;1)     BEGIN10:02   102    T1             INSERT row A (balance&#x3D;100)10:03   103    T2 (xid&#x3D;2)     BEGIN10:04   104    T2             INSERT row B (balance&#x3D;200)10:05   105    T1             COMMIT10:06   106    T2             UPDATE row B (balance&#x3D;250)10:07   ─────  ⚡ CRASH ⚡</code></pre><p><strong>State at the time of the crash:</strong></p><ul><li>T1: Committed (LSN 105)</li><li>T2: Active (not committed)</li><li>Dirty pages: A (LSN 102), B (LSN 106)</li></ul><hr /><h3 id="恢復執行">Recovery Execution</h3><pre class="language-none"><code class="language-none">┌─────────────────────────────────────────────────────────────┐│ Phase 1: ANALYSIS                                           │├─────────────────────────────────────────────────────────────┤│ Start from checkpoint LSN 100                               ││ Scan records 100-106                                        ││                                                             ││ Result:                                                     ││   - T1: Committed (LSN 105)                                 ││   - T2: Active (loser!)                                     
││   - Dirty pages: A→102, B→106                               ││   - Redo start: LSN 102                                     │└─────────────────────────────────────────────────────────────┘┌─────────────────────────────────────────────────────────────┐│ Phase 2: REDO                                               │├─────────────────────────────────────────────────────────────┤│ Replay from LSN 102:                                        ││                                                             ││ LSN 102: INSERT row A                                       ││   → Check page A LSN                                        ││   → If page LSN &lt; 102: apply insert                         ││   → Else: skip (already on disk)                            ││                                                             ││ LSN 104: INSERT row B                                       ││   → Apply if needed                                         ││                                                             ││ LSN 106: UPDATE row B                                       ││   → Apply if needed                                         ││                                                             ││ Result: Database at exact crash state                       │└─────────────────────────────────────────────────────────────┘┌─────────────────────────────────────────────────────────────┐│ Phase 3: UNDO                                               │├─────────────────────────────────────────────────────────────┤│ Loser transactions: T2                                      ││                                                             ││ Undo T2 (in reverse order):                                 ││   1. Undo LSN 106 (UPDATE B: 200→250)                       ││      → Write CLR: undo_next_lsn &#x3D; 104                       ││      → Restore B to balance&#x3D;200                             ││                                                             ││   2. 
Undo LSN 104 (INSERT B)                                ││      → Write CLR: undo_next_lsn &#x3D; 103                       ││      → Delete row B                                         ││                                                             ││   3. Log T2 ABORT                                           ││                                                             ││ Result: T2&#39;s changes rolled back, T1&#39;s changes preserved    │└─────────────────────────────────────────────────────────────┘</code></pre><hr /><h2 id="7-WAL-歸檔和時間點恢復">7 WAL Archiving and Point-in-Time Recovery</h2><h3 id="WAL-歸檔">WAL Archiving</h3><p><strong>Continuous archiving:</strong> copy each completed WAL segment to safe storage before it is reused.</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wal/archiver.rs</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">WalArchiver</span> <span class="token punctuation">&#123;</span>    wal_dir<span class="token punctuation">:</span> <span class="token class-name">PathBuf</span><span class="token punctuation">,</span>    archive_dir<span class="token punctuation">:</span> <span class="token class-name">PathBuf</span><span class="token punctuation">,</span>    archive_timeout<span class="token punctuation">:</span> <span class="token class-name">Duration</span><span class="token punctuation">,</span>    last_archived_segment<span class="token punctuation">:</span> <span class="token keyword">u32</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">WalArchiver</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">archive_ready_segments</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token 
keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">WalError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Find completed segments (can't be overwritten)</span>        <span class="token keyword">for</span> segment <span class="token keyword">in</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">get_ready_segments</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> src <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>wal_dir<span class="token punctuation">.</span><span class="token function">join</span><span class="token punctuation">(</span><span class="token macro property">format!</span><span class="token punctuation">(</span><span class="token string">"&#123;:024X&#125;"</span><span class="token punctuation">,</span> segment<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token keyword">let</span> dst <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>archive_dir<span class="token punctuation">.</span><span class="token function">join</span><span class="token punctuation">(</span><span class="token macro property">format!</span><span class="token punctuation">(</span><span class="token string">"&#123;:024X&#125;.backup"</span><span class="token 
punctuation">,</span> segment<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token comment">// Copy to archive (could be remote storage like S3)</span>            <span class="token namespace">std<span class="token punctuation">::</span>fs<span class="token punctuation">::</span></span><span class="token function">copy</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>src<span class="token punctuation">,</span> <span class="token operator">&amp;</span>dst<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>            <span class="token keyword">self</span><span class="token punctuation">.</span>last_archived_segment <span class="token operator">=</span> segment<span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="時間點恢復-PITR">Point-in-Time Recovery (PITR)</h3><pre class="language-none"><code class="language-none">Goal: Restore database to state at 2026-03-22 14:30:001. Restore base backup from 2026-03-22 00:00:002. Replay WAL segments from archive3. Stop replay at target time (14:30:00)4. 
Database restored to exact point in time</code></pre><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/recovery/pitr.rs</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">recover_to_point_in_time</span><span class="token punctuation">(</span>    base_backup<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">,</span>    archive_dir<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">,</span>    target_time<span class="token punctuation">:</span> <span class="token namespace">chrono<span class="token punctuation">::</span></span><span class="token class-name">DateTime</span><span class="token operator">&lt;</span><span class="token namespace">chrono<span class="token punctuation">::</span></span><span class="token class-name">Utc</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">RecoveryError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token comment">// 1. Restore base backup</span>    <span class="token function">restore_base_backup</span><span class="token punctuation">(</span>base_backup<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token comment">// 2. 
Find WAL segments to replay</span>    <span class="token keyword">let</span> segments <span class="token operator">=</span> <span class="token function">find_wal_segments_for_time_range</span><span class="token punctuation">(</span>archive_dir<span class="token punctuation">,</span> target_time<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token comment">// 3. Replay WAL up to target time</span>    <span class="token keyword">for</span> segment <span class="token keyword">in</span> segments <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> records <span class="token operator">=</span> <span class="token function">read_wal_segment</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>segment<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> record <span class="token keyword">in</span> records <span class="token punctuation">&#123;</span>            <span class="token comment">// Check if we've passed target time</span>            <span class="token keyword">if</span> record<span class="token punctuation">.</span>timestamp <span class="token operator">></span> target_time <span class="token punctuation">&#123;</span>                <span class="token macro property">println!</span><span class="token punctuation">(</span><span class="token string">"Reached target time, stopping recovery"</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">return</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token 
punctuation">&#125;</span>            <span class="token function">apply_redo_record</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>record<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="8-用-Rust-建構的挑戰">8 Challenges of Building in Rust</h2><h3 id="挑戰-1：fsync-和持久性">Challenge 1: fsync and Durability</h3><p><strong>Problem:</strong> Rust’s <code>File::sync_all()</code> does the right thing, but it is easy to forget to call it.</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ Missing fsync - data NOT durable!</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">commit</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> txn_id<span class="token punctuation">:</span> <span class="token class-name">TransactionId</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Error</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token keyword">let</span> record <span class="token operator">=</span> <span class="token class-name">WalRecord</span><span class="token punctuation">::</span><span class="token 
function">commit</span><span class="token punctuation">(</span>txn_id<span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token keyword">self</span><span class="token punctuation">.</span>wal<span class="token punctuation">.</span><span class="token function">append</span><span class="token punctuation">(</span>record<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token comment">// Forgot to flush!</span>    <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">&#125;</span><span class="token comment">// ✅ Correct</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">commit</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> txn_id<span class="token punctuation">:</span> <span class="token class-name">TransactionId</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Error</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token keyword">let</span> record <span class="token operator">=</span> <span class="token class-name">WalRecord</span><span class="token punctuation">::</span><span class="token function">commit</span><span class="token punctuation">(</span>txn_id<span class="token punctuation">)</span><span class="token punctuation">;</span>    <span 
class="token keyword">let</span> lsn <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>wal<span class="token punctuation">.</span><span class="token function">append</span><span class="token punctuation">(</span>record<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token keyword">self</span><span class="token punctuation">.</span>wal<span class="token punctuation">.</span><span class="token function">flush</span><span class="token punctuation">(</span>lsn<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>  <span class="token comment">// fsync!</span>    <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Lesson:</strong> Wrap WAL operations in a safe abstraction that forces the flush.</p><hr /><h3 id="挑戰-2：LSN-順序和併發">Challenge 2: LSN Ordering and Concurrency</h3><p><strong>Problem:</strong> When multiple threads append WAL records, each one must receive a monotonically increasing LSN.</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ Race condition - LSNs not ordered!</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">append</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> record<span class="token punctuation">:</span> <span class="token class-name">WalRecord</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Lsn</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">let</span> lsn <span class="token operator">=</span> <span 
class="token keyword">self</span><span class="token punctuation">.</span>current_lsn <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">;</span>  <span class="token comment">// Not atomic!</span>    <span class="token keyword">self</span><span class="token punctuation">.</span>current_lsn <span class="token operator">=</span> lsn<span class="token punctuation">;</span>    <span class="token comment">// ...</span><span class="token punctuation">&#125;</span><span class="token comment">// ✅ Correct - atomic LSN allocation</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">append</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> record<span class="token punctuation">:</span> <span class="token class-name">WalRecord</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Lsn</span><span class="token punctuation">,</span> <span class="token class-name">WalError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token keyword">let</span> lsn <span class="token operator">=</span> <span class="token class-name">Lsn</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>last_lsn<span class="token punctuation">.</span><span class="token function">fetch_add</span><span class="token punctuation">(</span><span class="token number">1</span><span class="token punctuation">,</span> <span class="token class-name">Ordering</span><span class="token punctuation">::</span><span class="token 
class-name">SeqCst</span><span class="token punctuation">)</span> <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// ...</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="挑戰-3：部分寫入和校驗和">Challenge 3: Partial Writes and Checksums</h3><p><strong>Problem:</strong> A crash during a WAL write leaves a partial record on disk.</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// Solution: Checksums + length prefix</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">serialize_record</span><span class="token punctuation">(</span>record<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">WalRecord</span><span class="token punctuation">,</span> buffer<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">></span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">let</span> data_start <span class="token operator">=</span> buffer<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// Write placeholder for length (fill in later)</span>    buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token number">0u8</span><span class="token punctuation">;</span> <span 
class="token number">4</span><span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// Write record fields</span>    buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>record<span class="token punctuation">.</span>lsn<span class="token punctuation">.</span><span class="token function">to_le_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// ... more fields ...</span>    buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>record<span class="token punctuation">.</span>data<span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// Calculate checksum over the record body (length prefix and checksum field excluded)</span>    <span class="token keyword">let</span> checksum <span class="token operator">=</span> <span class="token namespace">crc32<span class="token punctuation">::</span></span><span class="token function">calculate</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>buffer<span class="token punctuation">[</span>data_start <span class="token operator">+</span> <span class="token number">4</span><span class="token punctuation">..</span><span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>checksum<span class="token punctuation">.</span><span 
class="token function">to_le_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// Fill in length</span>    <span class="token keyword">let</span> total_len <span class="token operator">=</span> buffer<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">-</span> data_start<span class="token punctuation">;</span>    buffer<span class="token punctuation">[</span>data_start<span class="token punctuation">..</span>data_start <span class="token operator">+</span> <span class="token number">4</span><span class="token punctuation">]</span>        <span class="token punctuation">.</span><span class="token function">copy_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token punctuation">(</span>total_len <span class="token keyword">as</span> <span class="token keyword">u32</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_le_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span><span class="token punctuation">&#125;</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">deserialize_record</span><span class="token punctuation">(</span>buffer<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token keyword">u8</span><span class="token punctuation">]</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span 
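class="token operator">&lt;</span><span class="token class-name">WalRecord</span><span class="token punctuation">,</span> <span class="token class-name">WalError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token comment">// A torn write can end inside the 4-byte length prefix</span>    <span class="token keyword">if</span> buffer<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">&lt;</span> <span class="token number">4</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">return</span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">WalError</span><span class="token punctuation">::</span><span class="token class-name">PartialWrite</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">// Read length</span>    <span class="token keyword">let</span> length <span class="token operator">=</span> <span class="token keyword">u32</span><span class="token punctuation">::</span><span class="token function">from_le_bytes</span><span class="token punctuation">(</span>buffer<span class="token punctuation">[</span><span class="token number">0</span><span class="token punctuation">..</span><span class="token number">4</span><span class="token punctuation">]</span><span class="token punctuation">.</span><span class="token function">try_into</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">usize</span><span class="token punctuation">;</span>    <span class="token comment">// Verify the length covers at least prefix + checksum (8 bytes) and that we have enough data</span>    <span class="token keyword">if</span> length <span class="token operator">&lt;</span> <span class="token number">8</span> <span class="token operator">||</span> buffer<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">&lt;</span> length <span class="token punctuation">&#123;</span>        <span class="token keyword">return</span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">WalError</span><span class="token punctuation">::</span><span class="token class-name">PartialWrite</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">// Verify checksum</span>    <span class="token keyword">let</span> stored_checksum 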
<span class="token operator">=</span> <span class="token keyword">u32</span><span class="token punctuation">::</span><span class="token function">from_le_bytes</span><span class="token punctuation">(</span>        buffer<span class="token punctuation">[</span>length <span class="token operator">-</span> <span class="token number">4</span><span class="token punctuation">..</span>length<span class="token punctuation">]</span><span class="token punctuation">.</span><span class="token function">try_into</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span>    <span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token keyword">let</span> calculated <span class="token operator">=</span> <span class="token namespace">crc32<span class="token punctuation">::</span></span><span class="token function">calculate</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>buffer<span class="token punctuation">[</span><span class="token number">4</span><span class="token punctuation">..</span>length <span class="token operator">-</span> <span class="token number">4</span><span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token keyword">if</span> stored_checksum <span class="token operator">!=</span> calculated <span class="token punctuation">&#123;</span>        <span class="token keyword">return</span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">WalError</span><span class="token punctuation">::</span><span class="token class-name">ChecksumMismatch</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">// ... 
parse record ...</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="9-AI-如何加速這項工作">9 How AI Accelerated This Work</h2><h3 id="AI-做對了什麼">What AI Got Right</h3><table><thead><tr><th>Task</th><th>AI Contribution</th></tr></thead><tbody><tr><td><strong>ARIES phases</strong></td><td>Explained analysis/redo/undo clearly</td></tr><tr><td><strong>LSN structure</strong></td><td>Suggested the segment/offset encoding</td></tr><tr><td><strong>Checkpoint design</strong></td><td>Outlined the fuzzy vs. sharp trade-offs</td></tr><tr><td><strong>CLR records</strong></td><td>Explained the purpose of compensation log records</td></tr></tbody></table><hr /><h3 id="AI-做錯了什麼">What AI Got Wrong</h3><table><thead><tr><th>Issue</th><th>What Happened</th></tr></thead><tbody><tr><td><strong>Redo logic</strong></td><td>First draft only redid committed transactions (wrong: redo ALL, then undo)</td></tr><tr><td><strong>Undo order</strong></td><td>Suggested forward order instead of reverse (LIFO)</td></tr><tr><td><strong>Page LSN</strong></td><td>Overlooked that the page LSN is used to skip redundant redo</td></tr></tbody></table><p><strong>Pattern:</strong> ARIES is subtle. The “redo all, undo some” insight is counterintuitive.</p><hr /><h3 id="範例：理解-Redo-哲學">Example: Understanding the Redo Philosophy</h3><p><strong>My question to AI:</strong></p><blockquote><p>“Why does ARIES redo uncommitted transactions? Shouldn’t we redo only the committed ones?”</p></blockquote><p><strong>What I learned:</strong></p><ol><li><strong>Redo phase:</strong> Bring the database to its exact state at the moment of the crash (including uncommitted changes)</li><li><strong>Undo phase:</strong> Roll back uncommitted transactions</li><li><strong>Why?</strong> It is simpler than tracking dependencies during redo</li><li><strong>Key insight:</strong> Redo is idempotent; undo must be logged (CLRs)</li></ol><p><strong>Result:</strong> A correct redo implementation:</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// Redo ALL records, not just committed</span><span class="token keyword">if</span> page_lsn <span class="token operator">&lt;</span> record<span class="token punctuation">.</span>lsn <span class="token punctuation">&#123;</span>    <span class="token function">apply_redo</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>record<span class="token punctuation">,</span> <span class="token operator">&amp;</span>page<span class="token punctuation">)</span><span class="token operator">?</span><span 
class="token punctuation">;</span>  <span class="token comment">// Apply regardless of txn status</span><span class="token punctuation">&#125;</span><span class="token comment">// Undo phase will handle uncommitted transactions</span></code></pre><hr /><h2 id="總結：WAL-和-ARIES-一張圖">Summary: WAL and ARIES in One Diagram</h2><pre class="language-MERMAID_BASE64_615" data-language="MERMAID_BASE64_615"><code class="language-MERMAID_BASE64_615">Zmxvd2NoYXJ0IFRECiAgICBzdWJncmFwaCAiTm9ybWFsIE9wZXJhdGlvbiIKICAgICAgICBBW1RyYW5zYWN0aW9uXSAtLT4gQltXcml0ZSBXQUwgUmVjb3JkXQogICAgICAgIEIgLS0+IENbRmx1c2ggV0FMIGZzeW5jXQogICAgICAgIEMgLS0+IERbTW9kaWZ5IFBhZ2VdCiAgICAgICAgRCAtLT4gRVtNYXJrIERpcnR5XQogICAgICAgIEUgLS0+IEZbQUNLIHRvIENsaWVudF0KICAgICAgICBGIC0tPiBHW0NoZWNrcG9pbnQgTGF0ZXJdCiAgICBlbmQKCiAgICBzdWJncmFwaCAiQ3Jhc2ggUmVjb3ZlcnkiCiAgICAgICAgSFvimqEgQ1JBU0gg4pqhXSAtLT4gSVtSZXN0YXJ0IERhdGFiYXNlXQogICAgICAgIEkgLS0+IEpbUGhhc2UgMTogQW5hbHlzaXNdCiAgICAgICAgSiAtLT4gS1tGaW5kIEFjdGl2ZSBUcmFuc2FjdGlvbnNdCiAgICAgICAgSyAtLT4gTFtQaGFzZSAyOiBSZWRvXQogICAgICAgIEwgLS0+IE1bUmVwbGF5IEFsbCBXQUwgZnJvbSBDaGVja3BvaW50XQogICAgICAgIE0gLS0+IE5bUGhhc2UgMzogVW5kb10KICAgICAgICBOIC0tPiBPW1JvbGxiYWNrIExvc2VyIFRyYW5zYWN0aW9uc10KICAgICAgICBPIC0tPiBQW0RhdGFiYXNlIENvbnNpc3RlbnRdCiAgICBlbmQKCiAgICBzdWJncmFwaCAiV0FMIFN0cnVjdHVyZSIKICAgICAgICBRW1dBTCBTZWdtZW50IDFdIC0tPiBSW1dBTCBTZWdtZW50IDJdCiAgICAgICAgUiAtLT4gU1tXQUwgU2VnbWVudCAzXQogICAgICAgIFRbQ2hlY2twb2ludCBSZWNvcmRdIC0uLT4gUQogICAgZW5kCgogICAgc3R5bGUgQyBmaWxsOiNmZmYzZTAsc3Ryb2tlOiNmNTdjMDAKICAgIHN0eWxlIEogZmlsbDojZTNmMmZkLHN0cm9rZTojMTk3NmQyCiAgICBzdHlsZSBMIGZpbGw6I2UzZjJmZCxzdHJva2U6IzE5NzZkMgogICAgc3R5bGUgTiBmaWxsOiNlM2YyZmQsc3Ryb2tlOiMxOTc2ZDIKICAgIHN0eWxlIFAgZmlsbDojZThmNWU5LHN0cm9rZTojMzg4ZTNj</code></pre><p><strong>Key takeaways:</strong></p><table><thead><tr><th>Concept</th><th>Why It Matters</th></tr></thead><tbody><tr><td><strong>WAL</strong></td><td>Durability without sacrificing performance</td></tr><tr><td><strong>LSN</strong></td><td>A total order over all changes</td></tr><tr><td><strong>Checkpoints</strong></td><td>Bound recovery time</td></tr><tr><td><strong>ARIES 
Analysis</strong></td><td>Determines what needs to be recovered</td></tr><tr><td><strong>ARIES Redo</strong></td><td>Replays the log to the exact crash state</td></tr><tr><td><strong>ARIES Undo</strong></td><td>Rolls back uncommitted work</td></tr><tr><td><strong>CLRs</strong></td><td>Make undo idempotent and prevent repeated undo</td></tr></tbody></table><hr /><p><strong>Further reading:</strong></p><ul><li>“ARIES: A Transaction Recovery Method Supporting Fine-Granularity Locking and Partial Rollbacks Using Write-Ahead Logging” by Mohan et al. (1992)</li><li>PostgreSQL Source: <a href="https://github.com/postgres/postgres/blob/master/src/backend/access/transam/xlog.c"><code>src/backend/access/transam/xlog.c</code></a></li><li>PostgreSQL Source: <a href="https://github.com/postgres/postgres/blob/master/src/backend/access/transam/xlogfuncs.c"><code>src/backend/access/transam/xlogfuncs.c</code></a></li><li>“Database Management Systems” by Ramakrishnan and Gehrke (3rd ed., Ch. 18: Crash Recovery)</li><li>“Readings in Database Systems” (Red Book) - ARIES chapter</li><li>Vaultgres Repository: <a href="https://github.com/neoalienson/Vaultgres">github.com/neoalienson/Vaultgres</a></li></ul>]]></content>
    
    
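    <summary type="html">Vaultgres journey, Part 4: implementing write-ahead logging and the ARIES recovery algorithm. A deep dive into durability, checkpoints, and the three-phase recovery that lets a database come back from a crash.</summary>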
    
    
    
    <category term="Development" scheme="https://neo01.com/categories/Development/"/>
    
    
    <category term="PostgreSQL" scheme="https://neo01.com/tags/PostgreSQL/"/>
    
    <category term="Database Internals" scheme="https://neo01.com/tags/Database-Internals/"/>
    
    <category term="Rust" scheme="https://neo01.com/tags/Rust/"/>
    
  </entry>
  
  <entry>
    <title>Database in Rust: WAL and Crash Recovery with ARIES</title>
    <link href="https://neo01.com/2026/03/Database-Rust-WAL-Crash-Recovery-ARIES/"/>
    <id>https://neo01.com/2026/03/Database-Rust-WAL-Crash-Recovery-ARIES/</id>
    <published>2026-03-03T16:00:00.000Z</published>
    <updated>2026-03-14T03:06:00.502Z</updated>
    
    <content type="html"><![CDATA[<p>In <a href="/2026/03/Database-Rust-MVCC-Transaction-Manager/">Part 3</a>, we built MVCC for concurrent transactions. But there’s a terrifying question we haven’t answered.</p><p><strong>What happens when the power goes out?</strong></p><pre class="language-none"><code class="language-none">Transaction: UPDATE accounts SET balance &#x3D; 1000 WHERE id &#x3D; 1
1. Read page into buffer pool
2. Modify page in memory (balance &#x3D; 1000)
3. Mark page as dirty
4. ACK to client: &quot;Done!&quot;
5. ⚡ POWER FAILURE ⚡
6. Dirty page never written to disk
7. Client&#39;s money: GONE 💸</code></pre><p>This is why databases use <strong>WAL: Write-Ahead Logging</strong>.</p><p>Today: implementing WAL and the ARIES recovery algorithm in Rust. This is the code that ensures your data survives crashes, power failures, and kernel panics.</p><hr /><h2 id="1-The-WAL-Principle">1 The WAL Principle</h2><h3 id="The-Fundamental-Rule">The Fundamental Rule</h3><p><strong>Write-Ahead Logging:</strong> Before modifying any data page, you MUST write the change to the WAL.</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ WRONG - data modification before WAL</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">update</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> row_id<span class="token punctuation">:</span> <span class="token class-name">RowId</span><span class="token punctuation">,</span> new_data<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token keyword">u8</span><span class="token punctuation">]</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token 
class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Error</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token keyword">let</span> <span class="token keyword">mut</span> page <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>buffer_pool<span class="token punctuation">.</span><span class="token function">get_page</span><span class="token punctuation">(</span>row_id<span class="token punctuation">.</span>page_id<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    page<span class="token punctuation">.</span><span class="token function">write</span><span class="token punctuation">(</span>row_id<span class="token punctuation">.</span>offset<span class="token punctuation">,</span> new_data<span class="token punctuation">)</span><span class="token punctuation">;</span>  <span class="token comment">// Modified in memory!</span>    page<span class="token punctuation">.</span><span class="token function">mark_dirty</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// WAL comes too late</span>    <span class="token keyword">self</span><span class="token punctuation">.</span>wal<span class="token punctuation">.</span><span class="token function">log_update</span><span class="token punctuation">(</span>row_id<span class="token punctuation">,</span> new_data<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>  <span class="token comment">// Too late!</span>    <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token 
punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">&#125;</span><span class="token comment">// ✅ CORRECT - WAL first</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">update</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> row_id<span class="token punctuation">:</span> <span class="token class-name">RowId</span><span class="token punctuation">,</span> new_data<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token keyword">u8</span><span class="token punctuation">]</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Error</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token comment">// 1. Generate LSN (Log Sequence Number)</span>    <span class="token keyword">let</span> lsn <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>wal<span class="token punctuation">.</span><span class="token function">log_update</span><span class="token punctuation">(</span>row_id<span class="token punctuation">,</span> new_data<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token comment">// 2. 
Flush WAL to disk (fsync)</span>    <span class="token keyword">self</span><span class="token punctuation">.</span>wal<span class="token punctuation">.</span><span class="token function">flush</span><span class="token punctuation">(</span>lsn<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token comment">// 3. NOW safe to modify page</span>    <span class="token keyword">let</span> <span class="token keyword">mut</span> page <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>buffer_pool<span class="token punctuation">.</span><span class="token function">get_page</span><span class="token punctuation">(</span>row_id<span class="token punctuation">.</span>page_id<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    page<span class="token punctuation">.</span><span class="token function">write</span><span class="token punctuation">(</span>row_id<span class="token punctuation">.</span>offset<span class="token punctuation">,</span> new_data<span class="token punctuation">)</span><span class="token punctuation">;</span>    page<span class="token punctuation">.</span><span class="token function">set_lsn</span><span class="token punctuation">(</span>lsn<span class="token punctuation">)</span><span class="token punctuation">;</span>  <span class="token comment">// Track which LSN modified this page</span>    page<span class="token punctuation">.</span><span class="token function">mark_dirty</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token 
punctuation">&#125;</span></code></pre><p><strong>Why this works:</strong></p><pre class="language-none"><code class="language-none">Crash at different points:

After WAL write, before page modify:
→ Recovery replays WAL, data is restored ✓

After page modify, before flush:
→ WAL on disk, recovery ensures durability ✓

After flush to disk:
→ Data is durable ✓</code></pre><hr /><h3 id="Log-Sequence-Numbers-LSN">Log Sequence Numbers (LSN)</h3><p>Every WAL record gets a unique, monotonically increasing identifier:</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wal/lsn.rs</span>
<span class="token attribute attr-name">#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]</span>
<span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Lsn</span><span class="token punctuation">(</span><span class="token keyword">u64</span><span class="token punctuation">)</span><span class="token punctuation">;</span>

<span class="token keyword">impl</span> <span class="token class-name">Lsn</span> <span class="token punctuation">&#123;</span>
    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">INVALID</span><span class="token punctuation">:</span> <span class="token class-name">Lsn</span> <span class="token operator">=</span> <span class="token class-name">Lsn</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span><span class="token punctuation">;</span>

    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token keyword">fn</span> <span class="token function-definition function">new</span><span class="token punctuation">(</span>value<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>
        <span class="token keyword">Self</span><span class="token punctuation">(</span>value<span class="token punctuation">)</span>
    <span class="token punctuation">&#125;</span>

    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token keyword">fn</span> <span class="token function-definition function">invalid</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">bool</span> <span class="token punctuation">&#123;</span>
        <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token number">0</span> <span class="token operator">==</span> <span class="token number">0</span>
    <span class="token punctuation">&#125;</span>

    <span class="token comment">// LSN encoding: segment_id (high 32) + offset (low 32)</span>
    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token keyword">fn</span> <span class="token function-definition function">from_segment_offset</span><span class="token punctuation">(</span>segment<span class="token punctuation">:</span> <span class="token keyword">u32</span><span class="token punctuation">,</span> offset<span class="token punctuation">:</span> <span class="token keyword">u32</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>
        <span class="token keyword">Self</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">(</span>segment <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">)</span> <span class="token operator">&lt;&lt;</span> <span class="token number">32</span><span class="token punctuation">)</span> <span class="token operator">|</span> <span class="token punctuation">(</span>offset <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">)</span><span class="token punctuation">)</span>
    <span class="token punctuation">&#125;</span>

    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token keyword">fn</span> <span class="token function-definition function">segment</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">u32</span> <span class="token punctuation">&#123;</span>
        <span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span><span class="token number">0</span> <span class="token operator">>></span> <span class="token number">32</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u32</span>
    <span class="token punctuation">&#125;</span>

    <span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token keyword">fn</span> <span class="token function-definition function">offset</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">u32</span> <span class="token punctuation">&#123;</span>
        <span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span><span class="token number">0</span> <span class="token operator">&amp;</span> <span class="token number">0xFFFFFFFF</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u32</span>
    <span class="token punctuation">&#125;</span>
<span class="token punctuation">&#125;</span></code></pre><p><strong>LSN ordering guarantees:</strong></p><pre class="language-none"><code class="language-none">LSN 100: BEGIN txn 1
LSN 101: INSERT row A (txn 1)
LSN 102: INSERT row B (txn 1)
LSN 103: COMMIT txn 1
LSN 104: BEGIN txn 2
LSN 105: UPDATE row A (txn 2)
...

LSN 100 &lt; LSN 101 &lt; LSN 102 &lt; ...  (strictly increasing)</code></pre><hr /><h2 id="2-WAL-Record-Format">2 WAL Record Format</h2><h3 id="Record-Structure">Record Structure</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wal/record.rs</span>
<span class="token keyword">use</span> <span class="token keyword">crate</span><span class="token module-declaration namespace"><span class="token punctuation">::</span>storage<span class="token punctuation">::</span></span><span class="token class-name">PageId</span><span class="token punctuation">;</span>
<span class="token keyword">use</span> <span class="token keyword">crate</span><span class="token module-declaration namespace"><span class="token punctuation">::</span>transaction<span class="token punctuation">::</span></span><span class="token class-name">TransactionId</span><span class="token punctuation">;</span>

<span class="token attribute attr-name">#[derive(Debug, Clone)]</span>
<span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">WalRecord</span> <span class="token punctuation">&#123;</span>
    <span class="token keyword">pub</span> lsn<span class="token punctuation">:</span> <span class="token class-name">Lsn</span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> prev_lsn<span class="token punctuation">:</span> <span class="token 
class-name">Lsn</span><span class="token punctuation">,</span>  <span class="token comment">// Link to previous record from same transaction</span>
    <span class="token keyword">pub</span> transaction_id<span class="token punctuation">:</span> <span class="token class-name">TransactionId</span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> record_type<span class="token punctuation">:</span> <span class="token class-name">WalRecordType</span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> page_id<span class="token punctuation">:</span> <span class="token class-name">PageId</span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> offset<span class="token punctuation">:</span> <span class="token keyword">u16</span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> data<span class="token punctuation">:</span> <span class="token class-name">WalData</span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> checksum<span class="token punctuation">:</span> <span class="token keyword">u32</span><span class="token punctuation">,</span>
<span class="token punctuation">&#125;</span>

<span class="token attribute attr-name">#[derive(Debug, Clone)]</span>
<span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">WalRecordType</span> <span class="token punctuation">&#123;</span>
    <span class="token class-name">Begin</span><span class="token punctuation">,</span>
    <span class="token class-name">Insert</span><span class="token punctuation">,</span>
    <span class="token class-name">Update</span> <span class="token punctuation">&#123;</span> before_image<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">></span><span class="token punctuation">,</span> after_image<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">></span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>
    <span class="token class-name">Delete</span> <span class="token punctuation">&#123;</span> before_image<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">></span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>
    <span class="token class-name">Commit</span><span class="token punctuation">,</span>
    <span class="token class-name">Abort</span><span class="token punctuation">,</span>
    <span class="token class-name">CheckpointBegin</span><span class="token punctuation">,</span>
    <span class="token class-name">CheckpointEnd</span><span class="token punctuation">,</span>
    <span class="token class-name">Compensation</span> <span class="token punctuation">&#123;</span> undo_next_lsn<span class="token punctuation">:</span> <span class="token class-name">Lsn</span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>  <span class="token comment">// For aborts</span>
<span class="token punctuation">&#125;</span>

<span class="token attribute attr-name">#[derive(Debug, Clone)]</span>
<span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">WalData</span> <span class="token punctuation">&#123;</span>
    <span class="token class-name">PageImage</span><span class="token punctuation">(</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">></span><span class="token 
punctuation">)</span><span class="token punctuation">,</span>      <span class="token comment">// Full page image (for checkpoints)</span>
    <span class="token class-name">RowData</span><span class="token punctuation">(</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">></span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token comment">// Row-level change</span>
    <span class="token class-name">IndexEntry</span> <span class="token punctuation">&#123;</span> key<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">></span><span class="token punctuation">,</span> value<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">></span> <span class="token punctuation">&#125;</span><span class="token punctuation">,</span>
<span class="token punctuation">&#125;</span></code></pre><p><strong>Physical layout on disk:</strong></p><pre class="language-none"><code class="language-none">┌─────────────────────────────────────────────────────────────┐
│ WAL Segment File (e.g., 000000010000000000000001)           │
├─────────────────────────────────────────────────────────────┤
│ PageHeader (16 bytes)                                       │
├─────────────────────────────────────────────────────────────┤
│ Record 1:                                                   │
│   ├─ Length (4 bytes)                                       │
│   ├─ LSN (8 bytes)                                          │
│   ├─ Prev LSN (8 bytes)                                     │
│   ├─ Transaction ID (4 bytes)                               │
│   ├─ Record Type (1 byte)                                   │
│   ├─ Page ID (8 bytes)                                      │
│   ├─ Offset (2 bytes)                                       │
│   ├─ Data Length (4 bytes)                                  │
│   ├─ Data (variable)                                        │
│   └─ Checksum (4 bytes)                                     │
├─────────────────────────────────────────────────────────────┤
│ Record 2:                                                   │
│   ...                                                       │
└─────────────────────────────────────────────────────────────┘</code></pre><hr /><h3 id="Physical-vs-Logical-WAL">Physical vs. Logical WAL</h3><p><strong>PostgreSQL uses physical WAL</strong> (page-level changes):</p><table><thead><tr><th>Type</th><th>What’s Logged</th><th>Pros</th><th>Cons</th></tr></thead><tbody><tr><td><strong>Physical</strong></td><td>Byte ranges on pages</td><td>Simple replay, exact changes</td><td>Larger logs, page-format dependent</td></tr><tr><td><strong>Logical</strong></td><td>SQL operations (INSERT/UPDATE)</td><td>Smaller logs, format-independent</td><td>Complex replay, must re-execute</td></tr></tbody></table><p><strong>Vaultgres approach:</strong> Physical WAL for simplicity (matching PostgreSQL):</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">PhysicalWalRecord</span> <span class="token punctuation">&#123;</span>
    <span class="token keyword">pub</span> page_id<span class="token punctuation">:</span> <span class="token class-name">PageId</span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> offset<span class="token punctuation">:</span> <span class="token keyword">u16</span><span class="token punctuation">,</span>
    <span class="token keyword">pub</span> length<span class="token punctuation">:</span> <span class="token keyword">u16</span><span 
class="token punctuation">,</span>
    <span class="token keyword">pub</span> before_image<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">>></span><span class="token punctuation">,</span>  <span class="token comment">// For undo</span>
    <span class="token keyword">pub</span> after_image<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">></span><span class="token punctuation">,</span>            <span class="token comment">// For redo</span>
<span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="3-WAL-Manager-Implementation">3 WAL Manager Implementation</h2><h3 id="Writing-WAL-Records">Writing WAL Records</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wal/manager.rs</span>
<span class="token keyword">use</span> <span class="token namespace">std<span class="token punctuation">::</span>fs<span class="token punctuation">::</span></span><span class="token punctuation">&#123;</span><span class="token class-name">File</span><span class="token punctuation">,</span> <span class="token class-name">OpenOptions</span><span class="token punctuation">&#125;</span><span class="token punctuation">;</span>
<span class="token keyword">use</span> <span class="token namespace">std<span class="token punctuation">::</span>io<span class="token punctuation">::</span></span><span class="token punctuation">&#123;</span><span class="token class-name">Write</span><span class="token punctuation">,</span> <span class="token class-name">Seek</span><span class="token punctuation">,</span> <span class="token class-name">SeekFrom</span><span class="token punctuation">&#125;</span><span class="token punctuation">;</span>
<span class="token keyword">use</span> <span class="token namespace">std<span class="token punctuation">::</span>path<span class="token punctuation">::</span></span><span class="token class-name">PathBuf</span><span class="token punctuation">;</span>
<span class="token keyword">use</span> <span class="token namespace">std<span class="token punctuation">::</span>sync<span class="token punctuation">::</span>atomic<span class="token punctuation">::</span></span><span class="token punctuation">&#123;</span><span class="token class-name">AtomicU64</span><span class="token punctuation">,</span> <span class="token class-name">Ordering</span><span class="token punctuation">&#125;</span><span class="token punctuation">;</span>
<span class="token keyword">use</span> <span class="token namespace">parking_lot<span class="token punctuation">::</span></span><span class="token class-name">Mutex</span><span class="token punctuation">;</span>

<span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">WalManager</span> <span class="token punctuation">&#123;</span>
    wal_dir<span class="token punctuation">:</span> <span class="token class-name">PathBuf</span><span class="token punctuation">,</span>
    current_segment<span class="token punctuation">:</span> <span class="token keyword">u32</span><span class="token punctuation">,</span>
    current_file<span class="token punctuation">:</span> <span class="token class-name">Mutex</span><span class="token operator">&lt;</span><span class="token class-name">File</span><span class="token operator">></span><span class="token punctuation">,</span>
    current_offset<span class="token punctuation">:</span> <span class="token class-name">Mutex</span><span class="token operator">&lt;</span><span class="token keyword">u32</span><span class="token operator">></span><span class="token punctuation">,</span>
    flush_lsn<span class="token punctuation">:</span> <span class="token class-name">Mutex</span><span class="token operator">&lt;</span><span class="token class-name">Lsn</span><span class="token operator">></span><span class="token punctuation">,</span>
    last_lsn<span class="token punctuation">:</span> <span class="token class-name">AtomicU64</span><span class="token punctuation">,</span>
<span class="token punctuation">&#125;</span>

<span class="token keyword">impl</span> <span 
class="token class-name">WalManager</span> <span class="token punctuation">&#123;</span>
    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">new</span><span class="token punctuation">(</span>wal_dir<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token keyword">Self</span><span class="token punctuation">,</span> <span class="token class-name">WalError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>
        <span class="token namespace">std<span class="token punctuation">::</span>fs<span class="token punctuation">::</span></span><span class="token function">create_dir_all</span><span class="token punctuation">(</span>wal_dir<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>

        <span class="token comment">// Open segment 0 up front: File::create("") fails on an empty path</span>
        <span class="token keyword">let</span> first_segment <span class="token operator">=</span> <span class="token class-name">OpenOptions</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span>
            <span class="token punctuation">.</span><span class="token function">read</span><span class="token punctuation">(</span><span class="token boolean">true</span><span class="token punctuation">)</span>
            <span class="token punctuation">.</span><span class="token function">write</span><span class="token punctuation">(</span><span class="token boolean">true</span><span class="token punctuation">)</span>
            <span class="token punctuation">.</span><span class="token function">create</span><span class="token punctuation">(</span><span class="token boolean">true</span><span class="token punctuation">)</span>
            <span class="token punctuation">.</span><span class="token function">open</span><span class="token punctuation">(</span><span class="token class-name">PathBuf</span><span class="token punctuation">::</span><span class="token function">from</span><span class="token punctuation">(</span>wal_dir<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">join</span><span class="token punctuation">(</span><span class="token macro property">format!</span><span class="token punctuation">(</span><span class="token string">"&#123;:024X&#125;"</span><span class="token punctuation">,</span> <span class="token number">0u32</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>

        <span class="token keyword">let</span> <span class="token keyword">mut</span> manager <span class="token operator">=</span> <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>
            wal_dir<span class="token punctuation">:</span> <span class="token class-name">PathBuf</span><span class="token punctuation">::</span><span class="token function">from</span><span class="token punctuation">(</span>wal_dir<span class="token punctuation">)</span><span class="token punctuation">,</span>
            current_segment<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>
            current_file<span class="token punctuation">:</span> <span class="token class-name">Mutex</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>first_segment<span class="token punctuation">)</span><span class="token punctuation">,</span>
            current_offset<span class="token punctuation">:</span> <span class="token class-name">Mutex</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
            flush_lsn<span class="token punctuation">:</span> <span class="token class-name">Mutex</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token class-name">Lsn</span><span class="token punctuation">::</span><span class="token constant">INVALID</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
            last_lsn<span class="token punctuation">:</span> <span class="token class-name">AtomicU64</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>

        manager<span class="token punctuation">.</span><span class="token function">open_or_create_segment</span><span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>
        <span class="token class-name">Ok</span><span 
class="token punctuation">(</span>manager<span class="token punctuation">)</span>
    <span class="token punctuation">&#125;</span>

    <span class="token keyword">fn</span> <span class="token function-definition function">open_or_create_segment</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">,</span> segment<span class="token punctuation">:</span> <span class="token keyword">u32</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">WalError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>
        <span class="token keyword">let</span> path <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>wal_dir<span class="token punctuation">.</span><span class="token function">join</span><span class="token punctuation">(</span><span class="token macro property">format!</span><span class="token punctuation">(</span><span class="token string">"&#123;:024X&#125;"</span><span class="token punctuation">,</span> segment<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token keyword">let</span> file <span class="token operator">=</span> <span class="token class-name">OpenOptions</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span>
            <span class="token punctuation">.</span><span class="token function">read</span><span class="token punctuation">(</span><span class="token boolean">true</span><span class="token punctuation">)</span>
            <span class="token punctuation">.</span><span class="token function">write</span><span class="token punctuation">(</span><span class="token boolean">true</span><span class="token punctuation">)</span>
            <span class="token punctuation">.</span><span class="token function">create</span><span class="token punctuation">(</span><span class="token boolean">true</span><span class="token punctuation">)</span>
            <span class="token punctuation">.</span><span class="token function">open</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>path<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>

        <span class="token operator">*</span><span class="token keyword">self</span><span class="token punctuation">.</span>current_file<span class="token punctuation">.</span><span class="token function">lock</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">=</span> file<span class="token punctuation">;</span>
        <span class="token keyword">self</span><span class="token punctuation">.</span>current_segment <span class="token operator">=</span> segment<span class="token punctuation">;</span>
        <span class="token operator">*</span><span class="token keyword">self</span><span class="token punctuation">.</span>current_offset<span class="token punctuation">.</span><span class="token function">lock</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">=</span> <span class="token number">0</span><span class="token punctuation">;</span>

        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>
    <span 
class="token punctuation">&#125;</span>

    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">append</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> record<span class="token punctuation">:</span> <span class="token class-name">WalRecord</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Lsn</span><span class="token punctuation">,</span> <span class="token class-name">WalError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>
        <span class="token comment">// Assign new LSN</span>
        <span class="token keyword">let</span> lsn <span class="token operator">=</span> <span class="token class-name">Lsn</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>last_lsn<span class="token punctuation">.</span><span class="token function">fetch_add</span><span class="token punctuation">(</span><span class="token number">1</span><span class="token punctuation">,</span> <span class="token class-name">Ordering</span><span class="token punctuation">::</span><span class="token class-name">SeqCst</span><span class="token punctuation">)</span> <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">)</span><span class="token punctuation">;</span>

        <span class="token comment">// Serialize record</span>
        <span class="token keyword">let</span> <span class="token keyword">mut</span> buffer <span class="token operator">=</span> <span class="token class-name">Vec</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">serialize_record</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>record<span class="token punctuation">,</span> lsn<span class="token punctuation">,</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> buffer<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>

        <span class="token comment">// Check if we need a new segment</span>
        <span class="token keyword">let</span> <span class="token keyword">mut</span> offset <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>current_offset<span class="token punctuation">.</span><span class="token function">lock</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token keyword">if</span> <span class="token operator">*</span>offset <span class="token operator">+</span> buffer<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u32</span> <span class="token operator">></span> <span class="token constant">SEGMENT_SIZE</span> <span class="token punctuation">&#123;</span>
            <span class="token function">drop</span><span class="token punctuation">(</span>offset<span class="token punctuation">)</span><span class="token punctuation">;</span>
            <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">rotate_segment</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>
            offset <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>current_offset<span class="token punctuation">.</span><span class="token function">lock</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token punctuation">&#125;</span>

        <span class="token comment">// Write to file</span>
        <span class="token keyword">let</span> <span class="token keyword">mut</span> file <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>current_file<span class="token punctuation">.</span><span class="token function">lock</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
        file<span class="token punctuation">.</span><span class="token function">seek</span><span class="token punctuation">(</span><span class="token class-name">SeekFrom</span><span class="token punctuation">::</span><span class="token class-name">Start</span><span class="token punctuation">(</span><span class="token operator">*</span>offset <span class="token keyword">as</span> <span class="token keyword">u64</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>
        file<span class="token punctuation">.</span><span class="token function">write_all</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>buffer<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>
        <span class="token operator">*</span>offset <span class="token operator">+=</span> buffer<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">u32</span><span class="token punctuation">;</span>

        <span class="token class-name">Ok</span><span class="token punctuation">(</span>lsn<span class="token punctuation">)</span>
    <span class="token punctuation">&#125;</span>

    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">flush</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> target_lsn<span class="token punctuation">:</span> <span class="token class-name">Lsn</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">WalError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>
        <span class="token keyword">let</span> <span class="token keyword">mut</span> flush_lsn <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>flush_lsn<span class="token punctuation">.</span><span class="token function">lock</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>

        <span class="token comment">// Already flushed?</span>
        <span class="token keyword">if</span> <span class="token operator">*</span>flush_lsn <span class="token operator">>=</span> target_lsn <span class="token punctuation">&#123;</span>
            <span class="token keyword">return</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token punctuation">&#125;</span>

        <span class="token comment">// Sync to disk</span>
        <span class="token keyword">self</span><span class="token punctuation">.</span>current_file<span class="token punctuation">.</span><span class="token function">lock</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">sync_all</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>
        <span class="token operator">*</span>flush_lsn <span class="token operator">=</span> target_lsn<span class="token punctuation">;</span>

        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>
    <span class="token punctuation">&#125;</span>
<span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="Reading-WAL-Records">Reading WAL Records</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token keyword">impl</span> <span class="token class-name">WalManager</span> <span class="token punctuation">&#123;</span>
    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">read_from</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> start_lsn<span class="token 
punctuation">:</span> <span class="token class-name">Lsn</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">WalIterator</span><span class="token punctuation">,</span> <span class="token class-name">WalError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> segment <span class="token operator">=</span> start_lsn<span class="token punctuation">.</span><span class="token function">segment</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> offset <span class="token operator">=</span> start_lsn<span class="token punctuation">.</span><span class="token function">offset</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> path <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>wal_dir<span class="token punctuation">.</span><span class="token function">join</span><span class="token punctuation">(</span><span class="token macro property">format!</span><span class="token punctuation">(</span><span class="token string">"&#123;:024X&#125;"</span><span class="token punctuation">,</span> segment<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> file <span class="token operator">=</span> <span class="token class-name">File</span><span class="token punctuation">::</span><span class="token function">open</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>path<span class="token 
punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">WalIterator</span> <span class="token punctuation">&#123;</span>            current_file<span class="token punctuation">:</span> file<span class="token punctuation">,</span>            current_segment<span class="token punctuation">:</span> segment<span class="token punctuation">,</span>            current_offset<span class="token punctuation">:</span> offset<span class="token punctuation">,</span>            wal_dir<span class="token punctuation">:</span> <span class="token keyword">self</span><span class="token punctuation">.</span>wal_dir<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">WalIterator</span> <span class="token punctuation">&#123;</span>    current_file<span class="token punctuation">:</span> <span class="token class-name">File</span><span class="token punctuation">,</span>    current_segment<span class="token punctuation">:</span> <span class="token keyword">u32</span><span class="token punctuation">,</span>    current_offset<span class="token punctuation">:</span> <span class="token keyword">u32</span><span class="token punctuation">,</span>    wal_dir<span class="token punctuation">:</span> <span class="token class-name">PathBuf</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token 
class-name">Iterator</span> <span class="token keyword">for</span> <span class="token class-name">WalIterator</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">type</span> <span class="token type-definition class-name">Item</span> <span class="token operator">=</span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">WalRecord</span><span class="token punctuation">,</span> <span class="token class-name">WalError</span><span class="token operator">></span><span class="token punctuation">;</span>    <span class="token keyword">fn</span> <span class="token function-definition function">next</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token keyword">Self</span><span class="token punctuation">::</span><span class="token class-name">Item</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Try to read record at current position</span>        <span class="token keyword">match</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">read_record</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Some</span><span class="token punctuation">(</span>record<span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token class-name">Some</span><span class="token 
punctuation">(</span><span class="token class-name">Ok</span><span class="token punctuation">(</span>record<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">None</span><span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token comment">// End of segment, try next segment</span>                <span class="token keyword">self</span><span class="token punctuation">.</span>current_segment <span class="token operator">+=</span> <span class="token number">1</span><span class="token punctuation">;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span>current_offset <span class="token operator">=</span> <span class="token number">0</span><span class="token punctuation">;</span>                <span class="token keyword">let</span> path <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>wal_dir<span class="token punctuation">.</span><span class="token function">join</span><span class="token punctuation">(</span><span class="token macro property">format!</span><span class="token punctuation">(</span><span class="token string">"&#123;:024X&#125;"</span><span class="token punctuation">,</span> <span class="token keyword">self</span><span class="token punctuation">.</span>current_segment<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span>current_file <span class="token operator">=</span> <span class="token class-name">File</span><span class="token punctuation">::</span><span class="token function">open</span><span 
class="token punctuation">(</span><span class="token operator">&amp;</span>path<span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">ok</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token comment">// Ok(None) ends iteration; Ok(Some(r)) and Err(e) are yielded as items</span>                <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">read_record</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">transpose</span><span class="token punctuation">(</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">Err</span><span class="token punctuation">(</span>e<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Err</span><span class="token punctuation">(</span>e<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="4-Checkpoints-Limiting-Recovery-Time">4 Checkpoints: Limiting Recovery Time</h2><h3 
id="The-Problem-Unbounded-Replay">The Problem: Unbounded Replay</h3><p><strong>Without checkpoints:</strong></p><pre class="language-none"><code class="language-none">Day 1: Database created
Day 30: Crash!
Recovery: Replay 30 days of WAL records 😱</code></pre><p><strong>Solution:</strong> Periodic checkpoints, so recovery only has to replay the WAL written since the most recent one.</p><hr /><h3 id="Checkpoint-Process">Checkpoint Process</h3><pre class="language-MERMAID_BASE64_616" data-language="MERMAID_BASE64_616"><code class="language-MERMAID_BASE64_616">c2VxdWVuY2VEaWFncmFtCiAgICBwYXJ0aWNpcGFudCBEQiBhcyBEYXRhYmFzZQogICAgcGFydGljaXBhbnQgV0FMIGFzIFdBTCBNYW5hZ2VyCiAgICBwYXJ0aWNpcGFudCBCUCBhcyBCdWZmZXIgUG9vbAogICAgcGFydGljaXBhbnQgQ0tQVCBhcyBDaGVja3BvaW50IEZpbGUKCiAgICBEQi0+PkRCOiAxLiBXcml0ZSBDSEVDS1BPSU5UX0JFR0lOIHJlY29yZAogICAgREItPj5CUDogMi4gRmx1c2ggYWxsIGRpcnR5IHBhZ2VzCiAgICBCUC0tPj5EQjogUGFnZXMgd3JpdHRlbiB0byBkaXNrCiAgICBEQi0+PkRCOiAzLiBXcml0ZSBDSEVDS1BPSU5UX0VORCByZWNvcmQKICAgIERCLT4+V0FMOiA0LiBGbHVzaCBXQUwgdG8gZGlzawogICAgREItPj5DS1BUOiA1LiBTYXZlIGNoZWNrcG9pbnQgTFNOCiAgICBDS1BULS0+PkRCOiBDaGVja3BvaW50IGNvbXBsZXRl</code></pre><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wal/checkpoint.rs</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Checkpoint</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> checkpoint_lsn<span class="token punctuation">:</span> <span class="token class-name">Lsn</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> active_transactions<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">TransactionId</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> dirty_pages<span class="token punctuation">:</span> <span 
class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token class-name">PageId</span><span class="token punctuation">,</span> <span class="token class-name">Lsn</span><span class="token punctuation">)</span><span class="token operator">></span><span class="token punctuation">,</span>  <span class="token comment">// page_id → page_lsn</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">WalManager</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">create_checkpoint</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> buffer_pool<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">BufferPool</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Checkpoint</span><span class="token punctuation">,</span> <span class="token class-name">WalError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// 1. 
Log checkpoint begin</span>        <span class="token keyword">let</span> begin_lsn <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">append</span><span class="token punctuation">(</span><span class="token class-name">WalRecord</span><span class="token punctuation">::</span><span class="token function">checkpoint_begin</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token comment">// 2. Flush all dirty pages (this is the expensive part!)</span>        buffer_pool<span class="token punctuation">.</span><span class="token function">flush_all</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token comment">// 3. 
Get current state</span>        <span class="token keyword">let</span> checkpoint <span class="token operator">=</span> <span class="token class-name">Checkpoint</span> <span class="token punctuation">&#123;</span>            checkpoint_lsn<span class="token punctuation">:</span> begin_lsn<span class="token punctuation">,</span>            active_transactions<span class="token punctuation">:</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">get_active_transactions</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            dirty_pages<span class="token punctuation">:</span> buffer_pool<span class="token punctuation">.</span><span class="token function">get_dirty_pages</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token comment">// 4. Log checkpoint end</span>        <span class="token keyword">let</span> end_lsn <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">append</span><span class="token punctuation">(</span><span class="token class-name">WalRecord</span><span class="token punctuation">::</span><span class="token function">checkpoint_end</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>checkpoint<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token comment">// 5. 
Flush WAL</span>        <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">flush</span><span class="token punctuation">(</span>end_lsn<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token comment">// 6. Save checkpoint to known location</span>        <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">save_checkpoint_record</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>checkpoint<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span>checkpoint<span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="Fuzzy-Checkpoints-PostgreSQL-Style">Fuzzy Checkpoints (PostgreSQL Style)</h3><p><strong>Sharp checkpoint:</strong> Blocks all writes during checkpoint. Simple but causes pauses.</p><p><strong>Fuzzy checkpoint:</strong> Allows writes during checkpoint. 
Complex but no pauses.</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// Fuzzy checkpoint approach</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">create_fuzzy_checkpoint</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> buffer_pool<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">BufferPool</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Checkpoint</span><span class="token punctuation">,</span> <span class="token class-name">WalError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token comment">// 1. Record checkpoint start LSN</span>    <span class="token keyword">let</span> start_lsn <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">current_lsn</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// 2. 
Write CHECKPOINT_BEGIN (don't block)</span>    <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">append</span><span class="token punctuation">(</span><span class="token class-name">WalRecord</span><span class="token punctuation">::</span><span class="token function">checkpoint_begin</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token comment">// 3. Get list of dirty pages (snapshot)</span>    <span class="token keyword">let</span> dirty_pages <span class="token operator">=</span> buffer_pool<span class="token punctuation">.</span><span class="token function">get_dirty_pages_snapshot</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// 4. Flush dirty pages in background (don't block new writes)</span>    <span class="token keyword">let</span> checkpoint_lsn <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">current_lsn</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token keyword">for</span> <span class="token punctuation">(</span>page_id<span class="token punctuation">,</span> page_lsn<span class="token punctuation">)</span> <span class="token keyword">in</span> dirty_pages <span class="token punctuation">&#123;</span>        <span class="token keyword">if</span> page_lsn <span class="token operator">&lt;</span> checkpoint_lsn <span class="token punctuation">&#123;</span>            buffer_pool<span class="token punctuation">.</span><span class="token function">flush_page</span><span class="token punctuation">(</span>page_id<span class="token 
punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">// 5. Write CHECKPOINT_END with final LSN</span>    <span class="token keyword">let</span> end_lsn <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">current_lsn</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">append</span><span class="token punctuation">(</span><span class="token class-name">WalRecord</span><span class="token punctuation">::</span><span class="token function">checkpoint_end_at</span><span class="token punctuation">(</span>end_lsn<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">flush</span><span class="token punctuation">(</span>end_lsn<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">Checkpoint</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Recovery must replay from the checkpoint's start: pages dirtied while it ran may not be on disk</span>        checkpoint_lsn<span class="token punctuation">:</span> start_lsn<span class="token punctuation">,</span>        <span class="token comment">// ...</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="5-ARIES-The-Recovery-Algorithm">5 ARIES: The Recovery Algorithm</h2><h3 
id="The-Three-Phases">The Three Phases</h3><p>ARIES = <strong>A</strong>lgorithms for <strong>R</strong>ecovery and <strong>I</strong>solation <strong>E</strong>xploiting <strong>S</strong>emantics</p><pre class="language-none"><code class="language-none">┌─────────────────────────────────────────────────────────────┐
│                    ARIES Recovery                           │
├─────────────────────────────────────────────────────────────┤
│ Phase 1: ANALYSIS                                           │
│ - Scan WAL from last checkpoint                             │
│ - Determine which transactions were active at crash         │
│ - Build dirty page table                                    │
├─────────────────────────────────────────────────────────────┤
│ Phase 2: REDO                                               │
│ - Replay ALL logged changes from oldest dirty-page LSN      │
│ - Bring database to exact crash state                       │
│ - Skip pages already on disk (using page LSN)               │
├─────────────────────────────────────────────────────────────┤
│ Phase 3: UNDO                                               │
│ - Roll back all uncommitted transactions                    │
│ - Write Compensation Log Records (CLRs)                     │
│ - Database is now consistent                                │
└─────────────────────────────────────────────────────────────┘</code></pre><hr /><h3 id="Phase-1-Analysis">Phase 1: Analysis</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/recovery/analysis.rs</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">AnalysisResult</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> transactions_at_crash<span class="token punctuation">:</span> <span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token 
class-name">TransactionId</span><span class="token punctuation">,</span> <span class="token class-name">TxnStatus</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> dirty_page_table<span class="token punctuation">:</span> <span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token class-name">PageId</span><span class="token punctuation">,</span> <span class="token class-name">Lsn</span><span class="token operator">></span><span class="token punctuation">,</span>  <span class="token comment">// page → first LSN that dirtied it</span>    <span class="token keyword">pub</span> redo_start_lsn<span class="token punctuation">:</span> <span class="token class-name">Lsn</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">analyze</span><span class="token punctuation">(</span>wal<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">WalManager</span><span class="token punctuation">,</span> checkpoint<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Checkpoint</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">AnalysisResult</span><span class="token punctuation">,</span> <span class="token class-name">RecoveryError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token keyword">let</span> <span class="token keyword">mut</span> txn_status<span class="token punctuation">:</span> <span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span 
class="token class-name">TransactionId</span><span class="token punctuation">,</span> <span class="token class-name">TxnStatus</span><span class="token operator">></span> <span class="token operator">=</span> <span class="token class-name">HashMap</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token keyword">let</span> <span class="token keyword">mut</span> dirty_page_table<span class="token punctuation">:</span> <span class="token class-name">HashMap</span><span class="token operator">&lt;</span><span class="token class-name">PageId</span><span class="token punctuation">,</span> <span class="token class-name">Lsn</span><span class="token operator">></span> <span class="token operator">=</span> <span class="token class-name">HashMap</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// Initialize from checkpoint</span>    <span class="token keyword">for</span> txn <span class="token keyword">in</span> <span class="token operator">&amp;</span>checkpoint<span class="token punctuation">.</span>active_transactions <span class="token punctuation">&#123;</span>        txn_status<span class="token punctuation">.</span><span class="token function">insert</span><span class="token punctuation">(</span><span class="token operator">*</span>txn<span class="token punctuation">,</span> <span class="token class-name">TxnStatus</span><span class="token punctuation">::</span><span class="token class-name">Active</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">for</span> <span class="token 
punctuation">(</span>page_id<span class="token punctuation">,</span> page_lsn<span class="token punctuation">)</span> <span class="token keyword">in</span> <span class="token operator">&amp;</span>checkpoint<span class="token punctuation">.</span>dirty_pages <span class="token punctuation">&#123;</span>        dirty_page_table<span class="token punctuation">.</span><span class="token function">insert</span><span class="token punctuation">(</span><span class="token operator">*</span>page_id<span class="token punctuation">,</span> <span class="token operator">*</span>page_lsn<span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">// Scan WAL from checkpoint</span>    <span class="token keyword">let</span> <span class="token keyword">mut</span> iterator <span class="token operator">=</span> wal<span class="token punctuation">.</span><span class="token function">read_from</span><span class="token punctuation">(</span>checkpoint<span class="token punctuation">.</span>checkpoint_lsn<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token keyword">for</span> record_result <span class="token keyword">in</span> iterator <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> record <span class="token operator">=</span> record_result<span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token keyword">match</span> record<span class="token punctuation">.</span>record_type <span class="token punctuation">&#123;</span>            <span class="token class-name">WalRecordType</span><span class="token punctuation">::</span><span class="token class-name">Begin</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                txn_status<span class="token punctuation">.</span><span 
class="token function">insert</span><span class="token punctuation">(</span>record<span class="token punctuation">.</span>transaction_id<span class="token punctuation">,</span> <span class="token class-name">TxnStatus</span><span class="token punctuation">::</span><span class="token class-name">Active</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">WalRecordType</span><span class="token punctuation">::</span><span class="token class-name">Commit</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                txn_status<span class="token punctuation">.</span><span class="token function">insert</span><span class="token punctuation">(</span>record<span class="token punctuation">.</span>transaction_id<span class="token punctuation">,</span> <span class="token class-name">TxnStatus</span><span class="token punctuation">::</span><span class="token class-name">Committed</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">WalRecordType</span><span class="token punctuation">::</span><span class="token class-name">Abort</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                txn_status<span class="token punctuation">.</span><span class="token function">insert</span><span class="token punctuation">(</span>record<span class="token punctuation">.</span>transaction_id<span class="token punctuation">,</span> <span class="token class-name">TxnStatus</span><span class="token punctuation">::</span><span class="token class-name">Aborted</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>            <span class="token 
class-name">WalRecordType</span><span class="token punctuation">::</span><span class="token class-name">Insert</span> <span class="token operator">|</span> <span class="token class-name">WalRecordType</span><span class="token punctuation">::</span><span class="token class-name">Update</span> <span class="token operator">|</span> <span class="token class-name">WalRecordType</span><span class="token punctuation">::</span><span class="token class-name">Delete</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token comment">// Track first LSN that dirtied each page</span>                dirty_page_table<span class="token punctuation">.</span><span class="token function">entry</span><span class="token punctuation">(</span>record<span class="token punctuation">.</span>page_id<span class="token punctuation">)</span>                    <span class="token punctuation">.</span><span class="token function">or_insert</span><span class="token punctuation">(</span>record<span class="token punctuation">.</span>lsn<span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>            _ <span class="token operator">=></span> <span class="token punctuation">&#123;</span><span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">// Find minimum redo LSN</span>    <span class="token keyword">let</span> redo_start_lsn <span class="token operator">=</span> dirty_page_table<span class="token punctuation">.</span><span class="token function">values</span><span class="token punctuation">(</span><span class="token punctuation">)</span>        <span class="token punctuation">.</span><span class="token function">min</span><span class="token punctuation">(</span><span class="token punctuation">)</span>        <span class="token 
punctuation">.</span><span class="token function">copied</span><span class="token punctuation">(</span><span class="token punctuation">)</span>        <span class="token punctuation">.</span><span class="token function">unwrap_or</span><span class="token punctuation">(</span>checkpoint<span class="token punctuation">.</span>checkpoint_lsn<span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">AnalysisResult</span> <span class="token punctuation">&#123;</span>        transactions_at_crash<span class="token punctuation">:</span> txn_status<span class="token punctuation">,</span>        dirty_page_table<span class="token punctuation">,</span>        redo_start_lsn<span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="Phase-2-Redo">Phase 2: Redo</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/recovery/redo.rs</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">redo</span><span class="token punctuation">(</span>    wal<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">WalManager</span><span class="token punctuation">,</span>    buffer_pool<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">BufferPool</span><span class="token punctuation">,</span>    analysis<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">AnalysisResult</span><span class="token punctuation">,</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span 
class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">RecoveryError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token keyword">let</span> <span class="token keyword">mut</span> iterator <span class="token operator">=</span> wal<span class="token punctuation">.</span><span class="token function">read_from</span><span class="token punctuation">(</span>analysis<span class="token punctuation">.</span>redo_start_lsn<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token keyword">for</span> record_result <span class="token keyword">in</span> iterator <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> record <span class="token operator">=</span> record_result<span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token comment">// Repeat history: redo ALL logged changes, committed or not</span>        <span class="token comment">// (uncommitted changes are rolled back later, in the undo phase)</span>        <span class="token keyword">match</span> <span class="token operator">&amp;</span>record<span class="token punctuation">.</span>record_type <span class="token punctuation">&#123;</span>            <span class="token class-name">WalRecordType</span><span class="token punctuation">::</span><span class="token class-name">Insert</span> <span class="token operator">|</span> <span class="token class-name">WalRecordType</span><span class="token punctuation">::</span><span class="token class-name">Update</span> <span class="token operator">|</span> <span class="token class-name">WalRecordType</span><span class="token punctuation">::</span><span class="token class-name">Delete</span> <span class="token 
operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token comment">// Check if page needs redo</span>                <span class="token keyword">let</span> page <span class="token operator">=</span> buffer_pool<span class="token punctuation">.</span><span class="token function">get_page</span><span class="token punctuation">(</span>record<span class="token punctuation">.</span>page_id<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                <span class="token keyword">let</span> page_lsn <span class="token operator">=</span> page<span class="token punctuation">.</span><span class="token function">get_lsn</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// Only apply if page is older than this LSN</span>                <span class="token keyword">if</span> page_lsn <span class="token operator">&lt;</span> record<span class="token punctuation">.</span>lsn <span class="token punctuation">&#123;</span>                    <span class="token function">apply_redo</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>record<span class="token punctuation">,</span> <span class="token operator">&amp;</span>page<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>                    page<span class="token punctuation">.</span><span class="token function">set_lsn</span><span class="token punctuation">(</span>record<span class="token punctuation">.</span>lsn<span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token punctuation">&#125;</span>                <span class="token comment">// Else: page already has this change (written before crash)</span>            <span class="token 
punctuation">&#125;</span>            _ <span class="token operator">=></span> <span class="token punctuation">&#123;</span><span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">&#125;</span><span class="token keyword">fn</span> <span class="token function-definition function">apply_redo</span><span class="token punctuation">(</span>record<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">WalRecord</span><span class="token punctuation">,</span> page<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Page</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">RecoveryError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token keyword">match</span> <span class="token operator">&amp;</span>record<span class="token punctuation">.</span>data <span class="token punctuation">&#123;</span>        <span class="token class-name">WalData</span><span class="token punctuation">::</span><span class="token class-name">PageImage</span><span class="token punctuation">(</span>image<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>            <span class="token comment">// Full page image (from checkpoint)</span>            page<span 
class="token punctuation">.</span><span class="token function">copy_from_slice</span><span class="token punctuation">(</span>image<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token class-name">WalData</span><span class="token punctuation">::</span><span class="token class-name">RowData</span><span class="token punctuation">(</span>data<span class="token punctuation">)</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>            <span class="token comment">// Partial page update</span>            page<span class="token punctuation">.</span><span class="token function">write</span><span class="token punctuation">(</span>record<span class="token punctuation">.</span>offset<span class="token punctuation">,</span> data<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        _ <span class="token operator">=></span> <span class="token punctuation">&#123;</span><span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Key insight:</strong> Redo is <strong>idempotent</strong>. 
Each record is applied only when the page LSN is older than the record LSN, so running it multiple times (for example, after a second crash during recovery) produces the same result.</p><hr /><h3 id="Phase-3-Undo">Phase 3: Undo</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/recovery/undo.rs</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">undo</span><span class="token punctuation">(</span>    wal<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">WalManager</span><span class="token punctuation">,</span>    buffer_pool<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">BufferPool</span><span class="token punctuation">,</span>    analysis<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">AnalysisResult</span><span class="token punctuation">,</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">RecoveryError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token comment">// Find transactions to undo (active at crash, not committed)</span>    <span class="token keyword">let</span> losers<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">TransactionId</span><span class="token operator">></span> <span class="token operator">=</span> analysis<span class="token punctuation">.</span>transactions_at_crash        <span class="token punctuation">.</span><span class="token function">iter</span><span class="token 
punctuation">(</span><span class="token punctuation">)</span>        <span class="token punctuation">.</span><span class="token function">filter</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span><span class="token punctuation">(</span>_<span class="token punctuation">,</span> status<span class="token punctuation">)</span><span class="token closure-punctuation punctuation">|</span></span> <span class="token operator">*</span><span class="token operator">*</span>status <span class="token operator">==</span> <span class="token class-name">TxnStatus</span><span class="token punctuation">::</span><span class="token class-name">Active</span><span class="token punctuation">)</span>        <span class="token punctuation">.</span><span class="token function">map</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span><span class="token punctuation">(</span>txn<span class="token punctuation">,</span> _<span class="token punctuation">)</span><span class="token closure-punctuation punctuation">|</span></span> <span class="token operator">*</span>txn<span class="token punctuation">)</span>        <span class="token punctuation">.</span><span class="token function">collect</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// Roll back each loser transaction. Losers never committed; their order here is arbitrary (HashMap iteration), and each one's records are undone newest-first.</span>    <span class="token keyword">for</span> txn_id <span class="token keyword">in</span> losers<span class="token punctuation">.</span><span class="token function">into_iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">rev</span><span class="token punctuation">(</span><span class="token punctuation">)</span> 
<span class="token punctuation">&#123;</span>        <span class="token function">undo_transaction</span><span class="token punctuation">(</span>wal<span class="token punctuation">,</span> buffer_pool<span class="token punctuation">,</span> txn_id<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token punctuation">&#125;</span>    <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">&#125;</span><span class="token keyword">fn</span> <span class="token function-definition function">undo_transaction</span><span class="token punctuation">(</span>    wal<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">WalManager</span><span class="token punctuation">,</span>    buffer_pool<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">BufferPool</span><span class="token punctuation">,</span>    txn_id<span class="token punctuation">:</span> <span class="token class-name">TransactionId</span><span class="token punctuation">,</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">RecoveryError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token comment">// Find all records for this transaction (in reverse)</span>    <span class="token keyword">let</span> records <span class="token operator">=</span> wal<span class="token 
punctuation">.</span><span class="token function">get_transaction_records</span><span class="token punctuation">(</span>txn_id<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token keyword">for</span> record <span class="token keyword">in</span> records<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">rev</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Write Compensation Log Record (CLR)</span>        <span class="token keyword">let</span> clr <span class="token operator">=</span> <span class="token class-name">WalRecord</span> <span class="token punctuation">&#123;</span>            lsn<span class="token punctuation">:</span> wal<span class="token punctuation">.</span><span class="token function">next_lsn</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            prev_lsn<span class="token punctuation">:</span> record<span class="token punctuation">.</span>lsn<span class="token punctuation">,</span>            transaction_id<span class="token punctuation">:</span> txn_id<span class="token punctuation">,</span>            record_type<span class="token punctuation">:</span> <span class="token class-name">WalRecordType</span><span class="token punctuation">::</span><span class="token class-name">Compensation</span> <span class="token punctuation">&#123;</span>                undo_next_lsn<span class="token punctuation">:</span> record<span class="token punctuation">.</span>prev_lsn<span class="token punctuation">,</span>            <span class="token punctuation">&#125;</span><span class="token 
punctuation">,</span>            page_id<span class="token punctuation">:</span> record<span class="token punctuation">.</span>page_id<span class="token punctuation">,</span>            offset<span class="token punctuation">:</span> record<span class="token punctuation">.</span>offset<span class="token punctuation">,</span>            data<span class="token punctuation">:</span> <span class="token class-name">WalData</span><span class="token punctuation">::</span><span class="token class-name">RowData</span><span class="token punctuation">(</span>record<span class="token punctuation">.</span>before_image<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">unwrap_or_default</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            checksum<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>  <span class="token comment">// Placeholder: compute the real checksum before appending</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> clr_lsn <span class="token operator">=</span> wal<span class="token punctuation">.</span><span class="token function">append</span><span class="token punctuation">(</span>clr<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token comment">// Apply undo (restore before_image)</span>        <span class="token keyword">if</span> <span class="token keyword">let</span> <span class="token class-name">Some</span><span class="token punctuation">(</span>before_image<span class="token punctuation">)</span> <span class="token operator">=</span> <span 
class="token operator">&amp;</span>record<span class="token punctuation">.</span>before_image <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> page <span class="token operator">=</span> buffer_pool<span class="token punctuation">.</span><span class="token function">get_page</span><span class="token punctuation">(</span>record<span class="token punctuation">.</span>page_id<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>            page<span class="token punctuation">.</span><span class="token function">write</span><span class="token punctuation">(</span>record<span class="token punctuation">.</span>offset<span class="token punctuation">,</span> before_image<span class="token punctuation">)</span><span class="token punctuation">;</span>            page<span class="token punctuation">.</span><span class="token function">set_lsn</span><span class="token punctuation">(</span>clr_lsn<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">// Log transaction abort</span>    wal<span class="token punctuation">.</span><span class="token function">append</span><span class="token punctuation">(</span><span class="token class-name">WalRecord</span><span class="token punctuation">::</span><span class="token function">abort</span><span class="token punctuation">(</span>txn_id<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    wal<span class="token punctuation">.</span><span class="token function">flush_all</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token 
class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="Complete-Recovery-Process">Complete Recovery Process</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/recovery/mod.rs</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">recover</span><span class="token punctuation">(</span>    wal<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">WalManager</span><span class="token punctuation">,</span>    buffer_pool<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">BufferPool</span><span class="token punctuation">,</span>    data_dir<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">,</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">RecoveryStats</span><span class="token punctuation">,</span> <span class="token class-name">RecoveryError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token comment">// 1. 
Find last checkpoint</span>    <span class="token keyword">let</span> checkpoint <span class="token operator">=</span> <span class="token function">find_last_checkpoint</span><span class="token punctuation">(</span>wal<span class="token punctuation">,</span> data_dir<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token macro property">println!</span><span class="token punctuation">(</span><span class="token string">"Starting recovery from checkpoint LSN &#123;&#125;"</span><span class="token punctuation">,</span> checkpoint<span class="token punctuation">.</span>checkpoint_lsn<span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// 2. Phase 1: Analysis</span>    <span class="token macro property">println!</span><span class="token punctuation">(</span><span class="token string">"Phase 1: Analysis..."</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token keyword">let</span> analysis <span class="token operator">=</span> <span class="token function">analyze</span><span class="token punctuation">(</span>wal<span class="token punctuation">,</span> <span class="token operator">&amp;</span>checkpoint<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token macro property">println!</span><span class="token punctuation">(</span><span class="token string">"  Tracked &#123;&#125; transactions (active, committed, or aborted)"</span><span class="token punctuation">,</span>             analysis<span class="token punctuation">.</span>transactions_at_crash<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token macro 
property">println!</span><span class="token punctuation">(</span><span class="token string">"  Redo will start from LSN &#123;&#125;"</span><span class="token punctuation">,</span> analysis<span class="token punctuation">.</span>redo_start_lsn<span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// 3. Phase 2: Redo</span>    <span class="token macro property">println!</span><span class="token punctuation">(</span><span class="token string">"Phase 2: Redo..."</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token function">redo</span><span class="token punctuation">(</span>wal<span class="token punctuation">,</span> buffer_pool<span class="token punctuation">,</span> <span class="token operator">&amp;</span>analysis<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token macro property">println!</span><span class="token punctuation">(</span><span class="token string">"  Redo complete"</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// 4. 
Phase 3: Undo</span>    <span class="token macro property">println!</span><span class="token punctuation">(</span><span class="token string">"Phase 3: Undo..."</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token function">undo</span><span class="token punctuation">(</span>wal<span class="token punctuation">,</span> buffer_pool<span class="token punctuation">,</span> <span class="token operator">&amp;</span>analysis<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token macro property">println!</span><span class="token punctuation">(</span><span class="token string">"  Undo complete"</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// 5. Truncate old WAL (optional; safe only if no record before this LSN is still needed for redo or undo)</span>    <span class="token function">truncate_wal_before</span><span class="token punctuation">(</span>wal<span class="token punctuation">,</span> checkpoint<span class="token punctuation">.</span>checkpoint_lsn<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token class-name">RecoveryStats</span> <span class="token punctuation">&#123;</span>        checkpoint_lsn<span class="token punctuation">:</span> checkpoint<span class="token punctuation">.</span>checkpoint_lsn<span class="token punctuation">,</span>        redo_start_lsn<span class="token punctuation">:</span> analysis<span class="token punctuation">.</span>redo_start_lsn<span class="token punctuation">,</span>        transactions_aborted<span class="token punctuation">:</span> analysis<span class="token punctuation">.</span>transactions_at_crash            <span class="token punctuation">.</span><span class="token function">iter</span><span class="token 
punctuation">(</span><span class="token punctuation">)</span>            <span class="token punctuation">.</span><span class="token function">filter</span><span class="token punctuation">(</span><span class="token closure-params"><span class="token closure-punctuation punctuation">|</span><span class="token punctuation">(</span>_<span class="token punctuation">,</span> s<span class="token punctuation">)</span><span class="token closure-punctuation punctuation">|</span></span> <span class="token operator">*</span><span class="token operator">*</span>s <span class="token operator">==</span> <span class="token class-name">TxnStatus</span><span class="token punctuation">::</span><span class="token class-name">Active</span><span class="token punctuation">)</span>            <span class="token punctuation">.</span><span class="token function">count</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">)</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="6-Recovery-Example-Step-by-Step">6 Recovery Example: Step by Step</h2><h3 id="Crash-Scenario">Crash Scenario</h3><pre class="language-none"><code class="language-none">Time    LSN    Transaction    Action
─────────────────────────────────────────────────────
10:00   100    CKPT           Checkpoint created
10:01   101    T1 (xid&#x3D;1)     BEGIN
10:02   102    T1             INSERT row A (balance&#x3D;100)
10:03   103    T2 (xid&#x3D;2)     BEGIN
10:04   104    T2             INSERT row B (balance&#x3D;200)
10:05   105    T1             COMMIT
10:06   106    T2             UPDATE row B (balance&#x3D;250)
10:07   ─────  ⚡ CRASH ⚡</code></pre><p><strong>State at crash:</strong></p><ul><li>T1: Committed (LSN 105)</li><li>T2: Active (not committed)</li><li>Dirty pages: A (first dirtied at LSN 102), B (first dirtied at LSN 104, last updated at LSN 106)</li></ul><hr /><h3 id="Recovery-Execution">Recovery 
Execution</h3><pre class="language-none"><code class="language-none">┌─────────────────────────────────────────────────────────────┐
│ Phase 1: ANALYSIS                                           │
├─────────────────────────────────────────────────────────────┤
│ Start from checkpoint LSN 100                               │
│ Scan records 100-106                                        │
│                                                             │
│ Result:                                                     │
│   - T1: Committed (LSN 105)                                 │
│   - T2: Active (loser!)                                     │
│   - Dirty pages (first dirtied): A→102, B→104               │
│   - Redo start: LSN 102                                     │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ Phase 2: REDO                                               │
├─────────────────────────────────────────────────────────────┤
│ Replay from LSN 102:                                        │
│                                                             │
│ LSN 102: INSERT row A                                       │
│   → Check page A LSN                                        │
│   → If page LSN &lt; 102: apply insert                         │
│   → Else: skip (already on disk)                            │
│                                                             │
│ LSN 104: INSERT row B                                       │
│   → Apply if needed                                         │
│                                                             │
│ LSN 106: UPDATE row B                                       │
│   → Apply if needed                                         │
│                                                             │
│ Result: Database at exact crash state                       │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ Phase 3: UNDO                                               │
├─────────────────────────────────────────────────────────────┤
│ Loser transactions: T2                                      │
│                                                             │
│ Undo T2 (in reverse order):                                 │
│   1. Undo LSN 106 (UPDATE B: 200→250)                       │
│      → Write CLR: undo_next_lsn &#x3D; 104                       │
│      → Restore B to balance&#x3D;200                             │
│                                                             │
│   2. Undo LSN 104 (INSERT B)                                │
│      → Write CLR: undo_next_lsn &#x3D; 103                       │
│      → Delete row B                                         │
│                                                             │
│   3. Log T2 ABORT                                           │
│                                                             │
│ Result: T2&#39;s changes rolled back, T1&#39;s changes preserved    │
└─────────────────────────────────────────────────────────────┘</code></pre><hr /><h2 id="7-WAL-Archiving-and-Point-in-Time-Recovery">7 WAL Archiving and Point-in-Time Recovery</h2><h3 id="WAL-Archiving">WAL Archiving</h3><p><strong>Continuous archiving:</strong> Copy each completed WAL segment to safe storage before the segment file is reused.</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/wal/archiver.rs</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">WalArchiver</span> <span class="token punctuation">&#123;</span>    wal_dir<span class="token punctuation">:</span> <span class="token class-name">PathBuf</span><span class="token punctuation">,</span>    archive_dir<span class="token punctuation">:</span> <span 
class="token class-name">PathBuf</span><span class="token punctuation">,</span>    archive_timeout<span class="token punctuation">:</span> <span class="token class-name">Duration</span><span class="token punctuation">,</span>    last_archived_segment<span class="token punctuation">:</span> <span class="token keyword">u32</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">WalArchiver</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">archive_ready_segments</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">WalError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Find completed segments (can't be overwritten)</span>        <span class="token keyword">for</span> segment <span class="token keyword">in</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">get_ready_segments</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> src <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>wal_dir<span class="token punctuation">.</span><span class="token function">join</span><span 
class="token punctuation">(</span><span class="token macro property">format!</span><span class="token punctuation">(</span><span class="token string">"&#123;:024X&#125;"</span><span class="token punctuation">,</span> segment<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token keyword">let</span> dst <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>archive_dir<span class="token punctuation">.</span><span class="token function">join</span><span class="token punctuation">(</span><span class="token macro property">format!</span><span class="token punctuation">(</span><span class="token string">"&#123;:024X&#125;.backup"</span><span class="token punctuation">,</span> segment<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token comment">// Copy to archive (could be remote storage like S3)</span>            <span class="token namespace">std<span class="token punctuation">::</span>fs<span class="token punctuation">::</span></span><span class="token function">copy</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>src<span class="token punctuation">,</span> <span class="token operator">&amp;</span>dst<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>            <span class="token keyword">self</span><span class="token punctuation">.</span>last_archived_segment <span class="token operator">=</span> segment<span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>    <span 
class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="Point-in-Time-Recovery-PITR">Point-in-Time Recovery (PITR)</h3><pre class="language-none"><code class="language-none">Goal: Restore database to state at 2026-03-22 14:30:00

1. Restore base backup from 2026-03-22 00:00:00
2. Replay WAL segments from archive
3. Stop replay at target time (14:30:00)
4. Database restored to exact point in time</code></pre><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/recovery/pitr.rs</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">recover_to_point_in_time</span><span class="token punctuation">(</span>    base_backup<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">,</span>    archive_dir<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">str</span><span class="token punctuation">,</span>    target_time<span class="token punctuation">:</span> <span class="token namespace">chrono<span class="token punctuation">::</span></span><span class="token class-name">DateTime</span><span class="token operator">&lt;</span><span class="token namespace">chrono<span class="token punctuation">::</span></span><span class="token class-name">Utc</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">RecoveryError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    
<span class="token comment">// 1. Restore base backup</span>    <span class="token function">restore_base_backup</span><span class="token punctuation">(</span>base_backup<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token comment">// 2. Find WAL segments to replay</span>    <span class="token keyword">let</span> segments <span class="token operator">=</span> <span class="token function">find_wal_segments_for_time_range</span><span class="token punctuation">(</span>archive_dir<span class="token punctuation">,</span> target_time<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token comment">// 3. Replay WAL up to target time</span>    <span class="token keyword">for</span> segment <span class="token keyword">in</span> segments <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> records <span class="token operator">=</span> <span class="token function">read_wal_segment</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>segment<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> record <span class="token keyword">in</span> records <span class="token punctuation">&#123;</span>            <span class="token comment">// Check if we've passed target time</span>            <span class="token keyword">if</span> record<span class="token punctuation">.</span>timestamp <span class="token operator">></span> target_time <span class="token punctuation">&#123;</span>                <span class="token macro property">println!</span><span class="token punctuation">(</span><span class="token string">"Reached target time, stopping recovery"</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        
        <span class="token keyword">return</span> <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>            <span class="token function">apply_redo_record</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>record<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="8-Challenges-Building-in-Rust">8 Challenges Building in Rust</h2><h3 id="Challenge-1-fsync-and-Durability">Challenge 1: fsync and Durability</h3><p><strong>Problem:</strong> Rust’s <code>File::sync_all()</code> is correct, but easy to forget.</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ Missing fsync - data NOT durable!</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">commit</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> txn_id<span class="token punctuation">:</span> <span class="token class-name">TransactionId</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span 
class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Error</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token keyword">let</span> record <span class="token operator">=</span> <span class="token class-name">WalRecord</span><span class="token punctuation">::</span><span class="token function">commit</span><span class="token punctuation">(</span>txn_id<span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token keyword">self</span><span class="token punctuation">.</span>wal<span class="token punctuation">.</span><span class="token function">append</span><span class="token punctuation">(</span>record<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token comment">// Forgot to flush!</span>    <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">&#125;</span><span class="token comment">// ✅ Correct</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">commit</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> txn_id<span class="token punctuation">:</span> <span class="token class-name">TransactionId</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">Error</span><span class="token operator">></span> 
<span class="token punctuation">&#123;</span>    <span class="token keyword">let</span> record <span class="token operator">=</span> <span class="token class-name">WalRecord</span><span class="token punctuation">::</span><span class="token function">commit</span><span class="token punctuation">(</span>txn_id<span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token keyword">let</span> lsn <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>wal<span class="token punctuation">.</span><span class="token function">append</span><span class="token punctuation">(</span>record<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>    <span class="token keyword">self</span><span class="token punctuation">.</span>wal<span class="token punctuation">.</span><span class="token function">flush</span><span class="token punctuation">(</span>lsn<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>  <span class="token comment">// fsync!</span>    <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Lesson:</strong> Wrap WAL operations in safe abstractions that enforce flushing.</p><hr /><h3 id="Challenge-2-LSN-Ordering-and-Concurrency">Challenge 2: LSN Ordering and Concurrency</h3><p><strong>Problem:</strong> Multiple threads appending WAL records must get monotonically increasing LSNs.</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ Race condition - LSNs not ordered!</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token 
function-definition function">append</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> record<span class="token punctuation">:</span> <span class="token class-name">WalRecord</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Lsn</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">let</span> lsn <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>current_lsn <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">;</span>  <span class="token comment">// Not atomic!</span>    <span class="token keyword">self</span><span class="token punctuation">.</span>current_lsn <span class="token operator">=</span> lsn<span class="token punctuation">;</span>    <span class="token comment">// ...</span><span class="token punctuation">&#125;</span><span class="token comment">// ✅ Correct - atomic LSN allocation</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">append</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> record<span class="token punctuation">:</span> <span class="token class-name">WalRecord</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">Lsn</span><span class="token punctuation">,</span> <span class="token class-name">WalError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token keyword">let</span> lsn <span class="token 
operator">=</span> <span class="token class-name">Lsn</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>last_lsn<span class="token punctuation">.</span><span class="token function">fetch_add</span><span class="token punctuation">(</span><span class="token number">1</span><span class="token punctuation">,</span> <span class="token class-name">Ordering</span><span class="token punctuation">::</span><span class="token class-name">SeqCst</span><span class="token punctuation">)</span> <span class="token operator">+</span> <span class="token number">1</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// ...</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="Challenge-3-Partial-Writes-and-Checksums">Challenge 3: Partial Writes and Checksums</h3><p><strong>Problem:</strong> Crash during WAL write = partial record on disk.</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// Solution: Checksums + length prefix</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">serialize_record</span><span class="token punctuation">(</span>record<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">WalRecord</span><span class="token punctuation">,</span> buffer<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">></span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>    <span class="token 
keyword">let</span> data_start <span class="token operator">=</span> buffer<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// Write placeholder for length (fill in later)</span>    buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token number">0u8</span><span class="token punctuation">;</span> <span class="token number">4</span><span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// Write record fields</span>    buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>record<span class="token punctuation">.</span>lsn<span class="token punctuation">.</span><span class="token function">to_le_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// ... 
more fields ...</span>    buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>record<span class="token punctuation">.</span>data<span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// Calculate checksum over everything except checksum field</span>    <span class="token keyword">let</span> checksum <span class="token operator">=</span> <span class="token namespace">crc32<span class="token punctuation">::</span></span><span class="token function">calculate</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>buffer<span class="token punctuation">[</span>data_start <span class="token operator">+</span> <span class="token number">4</span><span class="token punctuation">..</span><span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    buffer<span class="token punctuation">.</span><span class="token function">extend_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>checksum<span class="token punctuation">.</span><span class="token function">to_le_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// Fill in length</span>    <span class="token keyword">let</span> total_len <span class="token operator">=</span> buffer<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">-</span> data_start<span class="token punctuation">;</span>    buffer<span class="token punctuation">[</span>data_start<span class="token punctuation">..</span>data_start <span class="token 
operator">+</span> <span class="token number">4</span><span class="token punctuation">]</span>        <span class="token punctuation">.</span><span class="token function">copy_from_slice</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token punctuation">(</span>total_len <span class="token keyword">as</span> <span class="token keyword">u32</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">to_le_bytes</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span><span class="token punctuation">&#125;</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">deserialize_record</span><span class="token punctuation">(</span>buffer<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token keyword">u8</span><span class="token punctuation">]</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">WalRecord</span><span class="token punctuation">,</span> <span class="token class-name">WalError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token comment">// Read length</span>    <span class="token keyword">let</span> length <span class="token operator">=</span> <span class="token keyword">u32</span><span class="token punctuation">::</span><span class="token function">from_le_bytes</span><span class="token punctuation">(</span>buffer<span class="token punctuation">[</span><span class="token number">0</span><span class="token punctuation">..</span><span class="token number">4</span><span class="token 
punctuation">]</span><span class="token punctuation">.</span><span class="token function">try_into</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">usize</span><span class="token punctuation">;</span>    <span class="token comment">// Verify we have enough data (length must cover the 4-byte prefix and 4-byte checksum)</span>    <span class="token keyword">if</span> length <span class="token operator">&lt;</span> <span class="token number">8</span> <span class="token operator">||</span> buffer<span class="token punctuation">.</span><span class="token function">len</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token operator">&lt;</span> length <span class="token punctuation">&#123;</span>        <span class="token keyword">return</span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">WalError</span><span class="token punctuation">::</span><span class="token class-name">PartialWrite</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">// Verify checksum</span>    <span class="token keyword">let</span> stored_checksum <span class="token operator">=</span> <span class="token keyword">u32</span><span class="token punctuation">::</span><span class="token function">from_le_bytes</span><span class="token punctuation">(</span>        buffer<span class="token punctuation">[</span>length <span class="token operator">-</span> <span class="token number">4</span><span class="token punctuation">..</span>length<span class="token punctuation">]</span><span class="token punctuation">.</span><span class="token function">try_into</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span>    <span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token 
keyword">let</span> calculated <span class="token operator">=</span> <span class="token namespace">crc32<span class="token punctuation">::</span></span><span class="token function">calculate</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>buffer<span class="token punctuation">[</span><span class="token number">4</span><span class="token punctuation">..</span>length <span class="token operator">-</span> <span class="token number">4</span><span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token keyword">if</span> stored_checksum <span class="token operator">!=</span> calculated <span class="token punctuation">&#123;</span>        <span class="token keyword">return</span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">WalError</span><span class="token punctuation">::</span><span class="token class-name">ChecksumMismatch</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">// ... parse record ...</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="9-How-AI-Accelerated-This">9 How AI Accelerated This</h2><h3 id="What-AI-Got-Right">What AI Got Right</h3><table><thead><tr><th>Task</th><th>AI Contribution</th></tr></thead><tbody><tr><td><strong>ARIES phases</strong></td><td>Explained analysis/redo/undo clearly</td></tr><tr><td><strong>LSN structure</strong></td><td>Suggested segment/offset encoding</td></tr><tr><td><strong>Checkpoint design</strong></td><td>Outlined fuzzy vs. 
sharp trade-offs</td></tr><tr><td><strong>CLR records</strong></td><td>Explained compensation log record purpose</td></tr></tbody></table><hr /><h3 id="What-AI-Got-Wrong">What AI Got Wrong</h3><table><thead><tr><th>Issue</th><th>What Happened</th></tr></thead><tbody><tr><td><strong>Redo logic</strong></td><td>First draft redid only committed txns (wrong! Redo ALL, then undo)</td></tr><tr><td><strong>Undo order</strong></td><td>Suggested forward order instead of reverse (LIFO)</td></tr><tr><td><strong>Page LSN</strong></td><td>Missed that page LSN is used to skip redundant redos</td></tr></tbody></table><p><strong>Pattern:</strong> ARIES is subtle. The “redo all, undo some” insight is counterintuitive.</p><hr /><h3 id="Example-Understanding-Redo-Philosophy">Example: Understanding Redo Philosophy</h3><p><strong>My question to AI:</strong></p><blockquote><p>“Why does ARIES redo uncommitted transactions? Shouldn’t we only redo committed ones?”</p></blockquote><p><strong>What I learned:</strong></p><ol><li><strong>Redo phase:</strong> Bring database to exact crash state (including uncommitted changes)</li><li><strong>Undo phase:</strong> Roll back uncommitted transactions</li><li><strong>Why?</strong> Simpler than tracking dependencies during redo</li><li><strong>Key insight:</strong> Redo is idempotent, undo must be logged (CLRs)</li></ol><p><strong>Result:</strong> Correct redo implementation:</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// Redo ALL records, not just committed</span><span class="token keyword">if</span> page_lsn <span class="token operator">&lt;</span> record<span class="token punctuation">.</span>lsn <span class="token punctuation">&#123;</span>    <span class="token function">apply_redo</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>record<span class="token punctuation">,</span> <span class="token operator">&amp;</span>page<span class="token 
punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>  <span class="token comment">// Apply regardless of txn status</span><span class="token punctuation">&#125;</span><span class="token comment">// Undo phase will handle uncommitted transactions</span></code></pre><hr /><h2 id="Summary-WAL-and-ARIES-in-One-Diagram">Summary: WAL and ARIES in One Diagram</h2><pre class="language-MERMAID_BASE64_617" data-language="MERMAID_BASE64_617"><code class="language-MERMAID_BASE64_617">Zmxvd2NoYXJ0IFRECiAgICBzdWJncmFwaCAiTm9ybWFsIE9wZXJhdGlvbiIKICAgICAgICBBW1RyYW5zYWN0aW9uXSAtLT4gQltXcml0ZSBXQUwgUmVjb3JkXQogICAgICAgIEIgLS0+IENbRmx1c2ggV0FMIGZzeW5jXQogICAgICAgIEMgLS0+IERbTW9kaWZ5IFBhZ2VdCiAgICAgICAgRCAtLT4gRVtNYXJrIERpcnR5XQogICAgICAgIEUgLS0+IEZbQUNLIHRvIENsaWVudF0KICAgICAgICBGIC0tPiBHW0NoZWNrcG9pbnQgTGF0ZXJdCiAgICBlbmQKCiAgICBzdWJncmFwaCAiQ3Jhc2ggUmVjb3ZlcnkiCiAgICAgICAgSFvimqEgQ1JBU0gg4pqhXSAtLT4gSVtSZXN0YXJ0IERhdGFiYXNlXQogICAgICAgIEkgLS0+IEpbUGhhc2UgMTogQW5hbHlzaXNdCiAgICAgICAgSiAtLT4gS1tGaW5kIEFjdGl2ZSBUcmFuc2FjdGlvbnNdCiAgICAgICAgSyAtLT4gTFtQaGFzZSAyOiBSZWRvXQogICAgICAgIEwgLS0+IE1bUmVwbGF5IEFsbCBXQUwgZnJvbSBDaGVja3BvaW50XQogICAgICAgIE0gLS0+IE5bUGhhc2UgMzogVW5kb10KICAgICAgICBOIC0tPiBPW1JvbGxiYWNrIExvc2VyIFRyYW5zYWN0aW9uc10KICAgICAgICBPIC0tPiBQW0RhdGFiYXNlIENvbnNpc3RlbnRdCiAgICBlbmQKCiAgICBzdWJncmFwaCAiV0FMIFN0cnVjdHVyZSIKICAgICAgICBRW1dBTCBTZWdtZW50IDFdIC0tPiBSW1dBTCBTZWdtZW50IDJdCiAgICAgICAgUiAtLT4gU1tXQUwgU2VnbWVudCAzXQogICAgICAgIFRbQ2hlY2twb2ludCBSZWNvcmRdIC0uLT4gUQogICAgZW5kCgogICAgc3R5bGUgQyBmaWxsOiNmZmYzZTAsc3Ryb2tlOiNmNTdjMDAKICAgIHN0eWxlIEogZmlsbDojZTNmMmZkLHN0cm9rZTojMTk3NmQyCiAgICBzdHlsZSBMIGZpbGw6I2UzZjJmZCxzdHJva2U6IzE5NzZkMgogICAgc3R5bGUgTiBmaWxsOiNlM2YyZmQsc3Ryb2tlOiMxOTc2ZDIKICAgIHN0eWxlIFAgZmlsbDojZThmNWU5LHN0cm9rZTojMzg4ZTNj</code></pre><p><strong>Key Takeaways:</strong></p><table><thead><tr><th>Concept</th><th>Why It Matters</th></tr></thead><tbody><tr><td><strong>WAL</strong></td><td>Durability without 
sacrificing performance</td></tr><tr><td><strong>LSN</strong></td><td>Total ordering of all changes</td></tr><tr><td><strong>Checkpoints</strong></td><td>Bound recovery time</td></tr><tr><td><strong>ARIES Analysis</strong></td><td>Determine what needs recovery</td></tr><tr><td><strong>ARIES Redo</strong></td><td>Replay to exact crash state</td></tr><tr><td><strong>ARIES Undo</strong></td><td>Roll back uncommitted work</td></tr><tr><td><strong>CLRs</strong></td><td>Idempotent undo, prevents re-undo</td></tr></tbody></table><hr /><p><strong>Further Reading:</strong></p><ul><li>“ARIES: A Transaction Recovery Method Supporting Fine Granularity Locking” by Mohan et al. (1992)</li><li>PostgreSQL Source: <a href="https://github.com/postgres/postgres/blob/master/src/backend/access/transam/xlog.c"><code>src/backend/access/transam/xlog.c</code></a></li><li>PostgreSQL Source: <a href="https://github.com/postgres/postgres/blob/master/src/backend/access/transam/xlogfuncs.c"><code>src/backend/access/transam/xlogfuncs.c</code></a></li><li>“Database Management Systems” by Ramakrishnan (Ch. 17: Recovery)</li><li>“Readings in Database Systems” (Red Book) - ARIES chapter</li><li>Vaultgres Repository: <a href="https://github.com/neoalienson/Vaultgres">github.com/neoalienson/Vaultgres</a></li></ul>]]></content>
    
    
    <summary type="html">Part 4 of the Vaultgres journey: implementing Write-Ahead Logging and the ARIES recovery algorithm. Deep dive into durability, checkpoints, and the three-phase recovery that brings your database back from a crash.</summary>
    
    
    
    <category term="Development" scheme="https://neo01.com/categories/Development/"/>
    
    
    <category term="PostgreSQL" scheme="https://neo01.com/tags/PostgreSQL/"/>
    
    <category term="Database Internals" scheme="https://neo01.com/tags/Database-Internals/"/>
    
    <category term="Rust" scheme="https://neo01.com/tags/Rust/"/>
    
  </entry>
  
  <entry>
    <title>Building a PostgreSQL-Compatible Database in Rust: MVCC and Transaction Management</title>
    <link href="https://neo01.com/zh-CN/2026/03/Database-Rust-MVCC-Transaction-Manager/"/>
    <id>https://neo01.com/zh-CN/2026/03/Database-Rust-MVCC-Transaction-Manager/</id>
    <published>2026-03-02T16:00:00.000Z</published>
    <updated>2026-03-14T03:03:24.905Z</updated>
    
    <content type="html"><![CDATA[<p>In <a href="/zh-CN/2026/03/Database-Rust-BPlusTree-Index-Concurrent-Access/">Part 2</a>, we built a concurrent B+Tree index. But our approach has a fundamental problem.</p><p><strong>Readers block writers. Writers block readers.</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// Current implementation</span><span class="token keyword">let</span> lock <span class="token operator">=</span> page<span class="token punctuation">.</span><span class="token function">read</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>  <span class="token comment">// Reader acquires lock</span><span class="token comment">// Writer waits... and waits... and waits...</span><span class="token comment">// Reader still holding lock (maybe computing something expensive)</span><span class="token comment">// Writer: 😭</span></code></pre><p>That is unacceptable for a real database. PostgreSQL handles <strong>thousands of concurrent transactions</strong>, and readers never block writers. How?</p><p><strong>MVCC: Multi-Version Concurrency Control.</strong></p><p>Today: implementing MVCC with snapshot isolation and transaction management in Rust, and facing the nightmare of transaction ID wraparound.</p><hr /><h2 id="1-MVCC-的洞察">1 The MVCC Insight</h2><h3 id="问题：锁定太严格">The Problem: Locking Is Too Strict</h3><p><strong>Traditional locking (2PL):</strong></p><pre class="language-none"><code class="language-none">Transaction A: SELECT * FROM users WHERE id &#x3D; 1  -- Reads row XTransaction B: UPDATE users SET balance &#x3D; 100 WHERE id &#x3D; 1  -- Blocked!Transaction A: (still reading, maybe for 10 seconds)Transaction B: 😡 Still blocked!</code></pre><p><strong>The MVCC approach:</strong></p><pre class="language-none"><code class="language-none">Transaction A: SELECT * FROM users WHERE id &#x3D; 1               -- Sees version 1 of row X (old but consistent)Transaction B: UPDATE users SET balance &#x3D; 100 WHERE id &#x3D; 1               -- Creates version 2 of row X               -- Not blocked! Neither transaction blocks the other.</code></pre><hr /><h3 id="MVCC-如何运作">How MVCC Works</h3><p><strong>Each row has multiple versions:</strong></p><pre class="language-none"><code class="language-none">Row: users.id &#x3D; 1Version 1: &#123;id: 1, balance: 50,  
xmin: 100, xmax: NULL&#125;           ↑                    ↑       ↑           Data                Created  Still visible                             by txn 100  (not deleted)Version 2: &#123;id: 1, balance: 100, xmin: 200, xmax: NULL&#125;           ↑                     ↑           New data          Created by txn 200Version 3: &#123;id: 1, balance: 150, xmin: 300, xmax: 400&#125;           ↑                     ↑        ↑           Old data          Created   Deleted by                             by txn 300  txn 400</code></pre><p><strong>Transaction metadata:</strong></p><table><thead><tr><th>Field</th><th>Meaning</th></tr></thead><tbody><tr><td><code>xmin</code></td><td>ID of the transaction that created this version</td></tr><tr><td><code>xmax</code></td><td>ID of the transaction that deleted this version (NULL = still live)</td></tr><tr><td><code>cmin</code></td><td>Command ID within the creating transaction (used for statement-level visibility)</td></tr><tr><td><code>cmax</code></td><td>Command ID of the command that deleted this version</td></tr></tbody></table><hr /><h2 id="2-事务-ID-和快照">2 Transaction IDs and Snapshots</h2><h3 id="事务-ID-分配">Transaction ID Allocation</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/transaction/txn_id.rs</span><span class="token keyword">pub</span> <span class="token keyword">type</span> <span class="token type-definition class-name">TransactionId</span> <span class="token operator">=</span> <span class="token keyword">u32</span><span class="token punctuation">;</span><span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">INVALID_XID</span><span class="token punctuation">:</span> <span class="token class-name">TransactionId</span> <span class="token operator">=</span> <span class="token number">0</span><span class="token punctuation">;</span><span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">FIRST_NORMAL_XID</span><span class="token punctuation">:</span> <span class="token class-name">TransactionId</span> <span class="token operator">=</span> <span class="token number">3</span><span 
class="token punctuation">;</span>  <span class="token comment">// 0, 1, 2 are bootstrap</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">TransactionIdGenerator</span> <span class="token punctuation">&#123;</span>    next_xid<span class="token punctuation">:</span> <span class="token class-name">AtomicU32</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">TransactionIdGenerator</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>            next_xid<span class="token punctuation">:</span> <span class="token class-name">AtomicU32</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token constant">FIRST_NORMAL_XID</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">next</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">TransactionId</span> <span class="token punctuation">&#123;</span>        <span class="token 
keyword">self</span><span class="token punctuation">.</span>next_xid<span class="token punctuation">.</span><span class="token function">fetch_add</span><span class="token punctuation">(</span><span class="token number">1</span><span class="token punctuation">,</span> <span class="token class-name">Ordering</span><span class="token punctuation">::</span><span class="token class-name">SeqCst</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">current</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">TransactionId</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>next_xid<span class="token punctuation">.</span><span class="token function">load</span><span class="token punctuation">(</span><span class="token class-name">Ordering</span><span class="token punctuation">::</span><span class="token class-name">SeqCst</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Problem:</strong> a <code>u32</code> wraps around at about 4 billion. Then what?</p><p><strong>Answer:</strong> Disaster. We will deal with it later.</p><hr /><h3 id="快照：MVCC-的核心">Snapshots: The Heart of MVCC</h3><p>A <strong>snapshot</strong> captures which transactions are visible:</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/transaction/snapshot.rs</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Snapshot</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> xmin<span 
class="token punctuation">:</span> <span class="token class-name">TransactionId</span><span class="token punctuation">,</span>      <span class="token comment">// Oldest active transaction</span>    <span class="token keyword">pub</span> xmax<span class="token punctuation">:</span> <span class="token class-name">TransactionId</span><span class="token punctuation">,</span>      <span class="token comment">// Next transaction ID (anything >= xmax is invisible)</span>    <span class="token keyword">pub</span> active_transactions<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">TransactionId</span><span class="token operator">></span><span class="token punctuation">,</span>  <span class="token comment">// In-progress transactions</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">Snapshot</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">is_visible</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> row_xmin<span class="token punctuation">:</span> <span class="token class-name">TransactionId</span><span class="token punctuation">,</span> row_xmax<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">TransactionId</span><span class="token operator">></span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">bool</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Row created by a transaction older than xmin? Always visible</span>        <span class="token keyword">if</span> row_xmin <span class="token operator">&lt;</span> <span class="token 
keyword">self</span><span class="token punctuation">.</span>xmin <span class="token punctuation">&#123;</span>            <span class="token comment">// Unless its deleter had already committed when the snapshot was taken</span>            <span class="token keyword">return</span> row_xmax<span class="token punctuation">.</span><span class="token function">map_or</span><span class="token punctuation">(</span><span class="token boolean">true</span><span class="token punctuation">,</span> <span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>xmax<span class="token closure-punctuation punctuation">|</span></span> xmax <span class="token operator">>=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>xmax <span class="token operator">||</span> <span class="token keyword">self</span><span class="token punctuation">.</span>active_transactions<span class="token punctuation">.</span><span class="token function">contains</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>xmax<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token comment">// Row created by a transaction >= xmax? Never visible</span>        <span class="token keyword">if</span> row_xmin <span class="token operator">>=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>xmax <span class="token punctuation">&#123;</span>            <span class="token keyword">return</span> <span class="token boolean">false</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token comment">// Row created by an active transaction? Not visible (unless it is our own)</span>        <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span>active_transactions<span class="token punctuation">.</span><span class="token function">contains</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>row_xmin<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">return</span> <span class="token boolean">false</span><span class="token punctuation">;</span>        <span class="token 
punctuation">&#125;</span>        <span class="token comment">// Row deleted by an active transaction? Still visible</span>        <span class="token keyword">if</span> <span class="token keyword">let</span> <span class="token class-name">Some</span><span class="token punctuation">(</span>xmax<span class="token punctuation">)</span> <span class="token operator">=</span> row_xmax <span class="token punctuation">&#123;</span>            <span class="token keyword">if</span> xmax <span class="token operator">&lt;</span> <span class="token keyword">self</span><span class="token punctuation">.</span>xmax <span class="token operator">&amp;&amp;</span> <span class="token operator">!</span><span class="token keyword">self</span><span class="token punctuation">.</span>active_transactions<span class="token punctuation">.</span><span class="token function">contains</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>xmax<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">return</span> <span class="token boolean">false</span><span class="token punctuation">;</span>  <span class="token comment">// Deleted by a committed transaction</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>        <span class="token boolean">true</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Visual example:</strong></p><pre class="language-none"><code class="language-none">Time:     100    150    200    250    300    350          │      │      │      │      │      │Txn 100:  [&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D; committed &#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;]Txn 150:         [&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D; active &#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;]Txn 200:                [&#x3D;&#x3D; committed &#x3D;&#x3D;]                          ↑                 Snapshot taken here                   
 xmin&#x3D;150, xmax&#x3D;250, active&#x3D;[150, 200]Visibility rules:- Row created by txn 100: VISIBLE (committed before the snapshot)- Row created by txn 150: NOT VISIBLE (still active)- Row created by txn 200: NOT VISIBLE (still active at snapshot time; committed afterwards)- Row created by txn 250: NOT VISIBLE (&gt;&#x3D; xmax)</code></pre><hr /><h2 id="3-带-MVCC-的行布局">3 Row Layout with MVCC</h2><h3 id="扩展页面标头">Extending the Page Header</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/storage/mvcc_page.rs</span><span class="token keyword">use</span> <span class="token keyword">crate</span><span class="token module-declaration namespace"><span class="token punctuation">::</span>transaction<span class="token punctuation">::</span></span><span class="token punctuation">&#123;</span><span class="token class-name">TransactionId</span><span class="token punctuation">,</span> <span class="token class-name">CommandId</span><span class="token punctuation">&#125;</span><span class="token punctuation">;</span><span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">MVCC_ROW_HEADER_SIZE</span><span class="token punctuation">:</span> <span class="token keyword">usize</span> <span class="token operator">=</span> <span class="token number">16</span><span class="token punctuation">;</span><span class="token attribute attr-name">#[repr(C)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">MvccRowHeader</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> xmin<span class="token punctuation">:</span> <span class="token class-name">TransactionId</span><span class="token punctuation">,</span>      <span class="token comment">// 4 bytes</span>    <span class="token keyword">pub</span> xmax<span class="token punctuation">:</span> <span class="token class-name">TransactionId</span><span class="token punctuation">,</span>      <span class="token 
comment">// 4 bytes (0 = not deleted)</span>    <span class="token keyword">pub</span> cmin<span class="token punctuation">:</span> <span class="token class-name">CommandId</span><span class="token punctuation">,</span>          <span class="token comment">// 4 bytes</span>    <span class="token keyword">pub</span> cmax<span class="token punctuation">:</span> <span class="token class-name">CommandId</span><span class="token punctuation">,</span>          <span class="token comment">// 4 bytes</span><span class="token punctuation">&#125;</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">MvccRow</span> <span class="token punctuation">&#123;</span>    header<span class="token punctuation">:</span> <span class="token class-name">MvccRowHeader</span><span class="token punctuation">,</span>    data<span class="token punctuation">:</span> <span class="token punctuation">[</span><span class="token keyword">u8</span><span class="token punctuation">]</span><span class="token punctuation">,</span>  <span class="token comment">// Variable length</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Page layout with MVCC:</strong></p><pre class="language-none"><code class="language-none">┌─────────────────────────────────────────────────────────────┐│ PageHeader (24 bytes)                                       │├─────────────────────────────────────────────────────────────┤│ ItemId array (4 bytes each)                                 │├─────────────────────────────────────────────────────────────┤│ Free space                                                  │├─────────────────────────────────────────────────────────────┤│ Row 0: &#123;xmin, xmax, cmin, cmax&#125; + data                      ││ Row 1: &#123;xmin, xmax, cmin, cmax&#125; + data                      ││ Row 2: &#123;xmin, xmax, cmin, cmax&#125; + data                      
│└─────────────────────────────────────────────────────────────┘</code></pre><hr /><h3 id="插入：创建新版本">Insert: Creating a New Version</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/transaction/mvcc_operations.rs</span><span class="token keyword">impl</span> <span class="token class-name">MvccTable</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">insert</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> txn<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Transaction</span><span class="token punctuation">,</span> data<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token keyword">u8</span><span class="token punctuation">]</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">TableError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Get a page with free space</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> page <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">get_page_for_insert</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token 
comment">// Create the row with transaction metadata</span>        <span class="token keyword">let</span> header <span class="token operator">=</span> <span class="token class-name">MvccRowHeader</span> <span class="token punctuation">&#123;</span>            xmin<span class="token punctuation">:</span> txn<span class="token punctuation">.</span>xid<span class="token punctuation">,</span>            xmax<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>  <span class="token comment">// Not deleted</span>            cmin<span class="token punctuation">:</span> txn<span class="token punctuation">.</span><span class="token function">current_command_id</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            cmax<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token comment">// Insert into the page</span>        <span class="token keyword">let</span> row_id <span class="token operator">=</span> page<span class="token punctuation">.</span><span class="token function">insert_mvcc_row</span><span class="token punctuation">(</span>header<span class="token punctuation">,</span> data<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token comment">// Mark the page dirty (needs WAL)</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer_pool<span class="token punctuation">.</span><span class="token function">mark_dirty</span><span class="token punctuation">(</span>page<span class="token punctuation">.</span><span class="token function">id</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token 
punctuation">;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><p><strong>WAL record for an insert:</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token attribute attr-name">#[derive(Debug, Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">WalRecordInsert</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> page_id<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> offset<span class="token punctuation">:</span> <span class="token keyword">u16</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> xmin<span class="token punctuation">:</span> <span class="token class-name">TransactionId</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> data<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="更新：创建新版本并标记旧版本">Update: Create a New Version and Mark the Old One</h3><p><strong>An update is really a delete plus an insert:</strong></p><pre class="language-MERMAID_BASE64_597" data-language="MERMAID_BASE64_597"><code 
class="language-MERMAID_BASE64_597">Zmxvd2NoYXJ0IFRECiAgICBBW1VQREFURSB1c2VycyBTRVQgYmFsYW5jZSA9IDEwMCBXSEVSRSBpZCA9IDFdIC0tPiBCW+Wvu+aJvuaXp+eJiOacrF0KICAgIEIgLS0+IENb5bCG5pen54mI5pys5qCH6K6w5Li65bey5Yig6ZmkXQogICAgQyAtLT4gRFvorr7nva4geG1heCA9IOW9k+WJjSB0eG5dCiAgICBEIC0tPiBFW+iuvue9riBjbWF4ID0g5b2T5YmN5ZG95LukXQogICAgRSAtLT4gRlvmj5LlhaXmlrDniYjmnKxdCiAgICBGIC0tPiBHW+aWsOeJiOacrOacieaWsOeahCB4bWluXQogICAgRyAtLT4gSFvmj5DkuqTvvJrkuKTkuKrniYjmnKzpg73lj6&#x2F;op4Hnm7TliLAgVkFDVVVNXQ&#x3D;&#x3D;</code></pre><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token keyword">impl</span> <span class="token class-name">MvccTable</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">update</span><span class="token operator">&lt;</span><span class="token class-name">F</span><span class="token operator">></span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> txn<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Transaction</span><span class="token punctuation">,</span> row_id<span class="token punctuation">:</span> <span class="token class-name">RowId</span><span class="token punctuation">,</span> modify<span class="token punctuation">:</span> <span class="token class-name">F</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">TableError</span><span class="token operator">></span>    <span class="token keyword">where</span>        <span class="token class-name">F</span><span 
class="token punctuation">:</span> <span class="token class-name">FnOnce</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token keyword">u8</span><span class="token punctuation">]</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">></span><span class="token punctuation">,</span>
    <span class="token punctuation">&#123;</span>
        <span class="token comment">// 1. Find the old version</span>
        <span class="token keyword">let</span> old_row <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">get_row</span><span class="token punctuation">(</span>row_id<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>
        <span class="token comment">// 2. 
Check visibility (only visible rows can be updated)</span>
        <span class="token keyword">if</span> <span class="token operator">!</span>txn<span class="token punctuation">.</span>snapshot<span class="token punctuation">.</span><span class="token function">is_visible</span><span class="token punctuation">(</span>old_row<span class="token punctuation">.</span>xmin<span class="token punctuation">,</span> old_row<span class="token punctuation">.</span>xmax<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>
            <span class="token keyword">return</span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">TableError</span><span class="token punctuation">::</span><span class="token class-name">RowNotVisible</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token punctuation">&#125;</span>
        <span class="token comment">// 3. Mark the old version as deleted</span>
        <span class="token keyword">let</span> <span class="token keyword">mut</span> old_page <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">get_page</span><span class="token punctuation">(</span>row_id<span class="token punctuation">.</span>page_id<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>
        old_page<span class="token punctuation">.</span><span class="token function">set_xmax</span><span class="token punctuation">(</span>row_id<span class="token punctuation">.</span>offset<span class="token punctuation">,</span> txn<span class="token punctuation">.</span>xid<span class="token punctuation">)</span><span class="token punctuation">;</span>
        old_page<span class="token punctuation">.</span><span class="token function">set_cmax</span><span class="token punctuation">(</span>row_id<span class="token 
punctuation">.</span>offset<span class="token punctuation">,</span> txn<span class="token punctuation">.</span><span class="token function">current_command_id</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token comment">// 4. Create the new version from the updated data</span>
        <span class="token keyword">let</span> new_data <span class="token operator">=</span> <span class="token function">modify</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>old_row<span class="token punctuation">.</span>data<span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token keyword">let</span> new_row_id <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">insert</span><span class="token punctuation">(</span>txn<span class="token punctuation">,</span> <span class="token operator">&amp;</span>new_data<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>
        <span class="token comment">// 5. 
Log to the WAL</span>
        <span class="token keyword">self</span><span class="token punctuation">.</span>wal<span class="token punctuation">.</span><span class="token function">log_update</span><span class="token punctuation">(</span>row_id<span class="token punctuation">,</span> new_row_id<span class="token punctuation">,</span> txn<span class="token punctuation">.</span>xid<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>
        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>
    <span class="token punctuation">&#125;</span>
<span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="查询：可见性检查">Query: Visibility Check</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token keyword">impl</span> <span class="token class-name">MvccTable</span> <span class="token punctuation">&#123;</span>
    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">scan</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> txn<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Transaction</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">impl</span> <span class="token class-name">Iterator</span><span class="token operator">&lt;</span><span class="token class-name">Item</span> <span class="token operator">=</span> <span class="token class-name">Row</span><span class="token operator">></span> <span class="token operator">+</span> <span class="token lifetime-annotation symbol">'_</span> <span class="token 
punctuation">&#123;</span>
        <span class="token keyword">self</span><span class="token punctuation">.</span>pages<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">flat_map</span><span class="token punctuation">(</span><span class="token keyword">move</span> <span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>page<span class="token closure-punctuation punctuation">|</span></span> <span class="token punctuation">&#123;</span>
            page<span class="token punctuation">.</span><span class="token function">rows</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">filter_map</span><span class="token punctuation">(</span><span class="token keyword">move</span> <span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>row<span class="token closure-punctuation punctuation">|</span></span> <span class="token punctuation">&#123;</span>
                <span class="token keyword">if</span> txn<span class="token punctuation">.</span>snapshot<span class="token punctuation">.</span><span class="token function">is_visible</span><span class="token punctuation">(</span>row<span class="token punctuation">.</span>xmin<span class="token punctuation">,</span> row<span class="token punctuation">.</span>xmax<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>
                    <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Row</span> <span class="token punctuation">&#123;</span> data<span class="token punctuation">:</span> row<span class="token punctuation">.</span>data <span class="token punctuation">&#125;</span><span 
class="token punctuation">)</span>
                <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>
                    <span class="token class-name">None</span>  <span class="token comment">// This version is not visible</span>
                <span class="token punctuation">&#125;</span>
            <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>
        <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>
    <span class="token punctuation">&#125;</span>
<span class="token punctuation">&#125;</span></code></pre><p><strong>Example scenario:</strong></p><pre class="language-none"><code class="language-none">Transaction A (xid&#x3D;100):          Transaction B (xid&#x3D;200):
1. BEGIN;
2. INSERT INTO users VALUES (1, 50);
3. COMMIT;
                                   4. BEGIN; (snapshot: xmin&#x3D;201, xmax&#x3D;201)
                                   5. SELECT * FROM users;
                                      → Sees version from txn 100 ✓
6. BEGIN;
7. UPDATE users SET balance &#x3D; 100 WHERE id &#x3D; 1;
   (creates version 2, marks version 1 as deleted)
8. (not committed yet)
                                   9. SELECT * FROM users;
                                      → Still sees version 1! (the updating transaction has not committed)
10. COMMIT;
                                   11. 
SELECT * FROM users;
                                       → Now sees version 2 ✓ (Read Committed: fresh snapshot per statement)</code></pre><hr /><h2 id="4-事务状态和可见性">4 Transaction States and Visibility</h2><h3 id="事务生命周期">Transaction Lifecycle</h3><pre class="language-MERMAID_BASE64_598" data-language="MERMAID_BASE64_598"><code class="language-MERMAID_BASE64_598">c3RhdGVEaWFncmFtLXYyCiAgICBbKl0gLS0+IEluUHJvZ3Jlc3M6IEJFR0lOCiAgICBJblByb2dyZXNzIC0tPiBDb21taXR0ZWQ6IENPTU1JVAogICAgSW5Qcm9ncmVzcyAtLT4gQWJvcnRlZDogUk9MTEJBQ0sKICAgIENvbW1pdHRlZCAtLT4gWypdCiAgICBBYm9ydGVkIC0tPiBbKl0KCiAgICBub3RlIHJpZ2h0IG9mIEluUHJvZ3Jlc3MKICAgICAgICB4bWluL3htYXggc2V0IHRvCiAgICAgICAgdHJhbnNhY3Rpb24gSUQKICAgIGVuZCBub3RlCgogICAgbm90ZSByaWdodCBvZiBDb21taXR0ZWQKICAgICAgICDniYjmnKzlr7nmlrDlv6vnhaflj6&#x2F;op4EKICAgIGVuZCBub3RlCgogICAgbm90ZSByaWdodCBvZiBBYm9ydGVkCiAgICAgICAg5omA5pyJ54mI5pys5qCH6K6w5Li6CiAgICAgICAg5LuO5pyq5a2Y5ZyoCiAgICBlbmQgbm90ZQ&#x3D;&#x3D;</code></pre><h3 id="可见性矩阵">Visibility Matrix</h3><table><thead><tr><th>Row state</th><th>Transaction state</th><th>Visible?</th></tr></thead><tbody><tr><td><code>xmin &lt; snapshot.xmin</code>, <code>xmax = 0</code></td><td>Committed</td><td>✅ Yes</td></tr><tr><td><code>xmin &lt; snapshot.xmin</code>, <code>xmax &lt; snapshot.xmin</code></td><td>Committed</td><td>❌ No (deleted)</td></tr><tr><td><code>snapshot.xmin &lt;= xmin &lt; snapshot.xmax</code>, <code>xmin in active</code></td><td>In Progress</td><td>❌ No</td></tr><tr><td><code>xmin &gt;= snapshot.xmax</code></td><td>Any</td><td>❌ No (future)</td></tr><tr><td><code>xmin = current_txn</code></td><td>Current</td><td>✅ Yes (own changes)</td></tr></tbody></table><hr /><h2 id="5-VACUUM：清理死版本">5 VACUUM: Cleaning Up Dead Versions</h2><h3 id="问题：死元组累积">The Problem: Dead Tuple Accumulation</h3><pre class="language-none"><code class="language-none">After many updates:
Page:
┌─────────────────────────────────────────────────────────────┐
│ Row 0: &#123;xmin: 100, xmax: 200&#125; ← dead (both committed)       │
│ Row 1: &#123;xmin: 300, xmax: 0&#125;   ← live                        │
│ Row 2: &#123;xmin: 150, xmax: 250&#125; ← dead                        │
│ Row 3: &#123;xmin: 400, xmax: 0&#125;   ← live                        │
│ ...                        
                                 │
│ 50% dead space!                                             │
└─────────────────────────────────────────────────────────────┘</code></pre><p><strong>Without VACUUM:</strong> the table grows without bound and performance degrades.</p><hr /><h3 id="VACUUM-流程">The VACUUM Process</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/transaction/vacuum.rs</span>
<span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">VacuumWorker</span> <span class="token punctuation">&#123;</span>
    buffer_pool<span class="token punctuation">:</span> <span class="token class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">BufferPool</span><span class="token operator">></span><span class="token punctuation">,</span>
    transaction_manager<span class="token punctuation">:</span> <span class="token class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">TransactionManager</span><span class="token operator">></span><span class="token punctuation">,</span>
<span class="token punctuation">&#125;</span>

<span class="token keyword">impl</span> <span class="token class-name">VacuumWorker</span> <span class="token punctuation">&#123;</span>
    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">vacuum_table</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> table_id<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">VacuumStats</span><span class="token punctuation">,</span> <span class="token 
class-name">VacuumError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>
        <span class="token keyword">let</span> <span class="token keyword">mut</span> stats <span class="token operator">=</span> <span class="token class-name">VacuumStats</span><span class="token punctuation">::</span><span class="token function">default</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token keyword">for</span> page_id <span class="token keyword">in</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">get_table_pages</span><span class="token punctuation">(</span>table_id<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>
            <span class="token keyword">let</span> <span class="token keyword">mut</span> page <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>buffer_pool<span class="token punctuation">.</span><span class="token function">get_page</span><span class="token punctuation">(</span>page_id<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>
            <span class="token comment">// Get the global xmin (the oldest active transaction)</span>
            <span class="token keyword">let</span> global_xmin <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>transaction_manager<span class="token punctuation">.</span><span class="token function">get_global_xmin</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
            <span class="token comment">// Scan all rows</span>
            <span class="token keyword">for</span> row_id <span class="token keyword">in</span> page<span class="token 
punctuation">.</span><span class="token function">rows</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>
                <span class="token keyword">let</span> row <span class="token operator">=</span> page<span class="token punctuation">.</span><span class="token function">get_row</span><span class="token punctuation">(</span>row_id<span class="token punctuation">)</span><span class="token punctuation">;</span>
                <span class="token comment">// The row is dead if xmax is set and older than every active transaction</span>
                <span class="token keyword">if</span> row<span class="token punctuation">.</span>xmax <span class="token operator">!=</span> <span class="token number">0</span> <span class="token operator">&amp;&amp;</span> row<span class="token punctuation">.</span>xmax <span class="token operator">&lt;</span> global_xmin <span class="token punctuation">&#123;</span>
                    <span class="token comment">// Mark the slot as reusable</span>
                    page<span class="token punctuation">.</span><span class="token function">mark_row_free</span><span class="token punctuation">(</span>row_id<span class="token punctuation">)</span><span class="token punctuation">;</span>
                    stats<span class="token punctuation">.</span>dead_tuples_removed <span class="token operator">+=</span> <span class="token number">1</span><span class="token punctuation">;</span>
                <span class="token punctuation">&#125;</span>
            <span class="token punctuation">&#125;</span>
            <span class="token keyword">self</span><span class="token punctuation">.</span>buffer_pool<span class="token punctuation">.</span><span class="token function">mark_dirty</span><span class="token punctuation">(</span>page_id<span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token punctuation">&#125;</span>
        <span class="token class-name">Ok</span><span class="token 
punctuation">(</span>stats<span class="token punctuation">)</span>
    <span class="token punctuation">&#125;</span>
<span class="token punctuation">&#125;</span></code></pre><p><strong>Regular VACUUM does not block reads or writes:</strong></p><table><thead><tr><th>Operation</th><th>Lock level</th></tr></thead><tbody><tr><td>Regular VACUUM</td><td><code>ShareUpdateExclusiveLock</code> (allows reads/writes)</td></tr><tr><td>VACUUM FULL</td><td><code>AccessExclusiveLock</code> (blocks everything)</td></tr></tbody></table><hr /><h3 id="VACUUM-FULL-vs-Regular-VACUUM">VACUUM FULL vs. Regular VACUUM</h3><pre class="language-none"><code class="language-none">Regular VACUUM:
┌─────────────────────────────────────────────────────────────┐
│ Before: [dead][live][dead][live][dead][live]                │
│ After:  [free][live][free][live][free][live]                │
│         (space can be reused for new rows in the same page) │
└─────────────────────────────────────────────────────────────┘
VACUUM FULL:
┌─────────────────────────────────────────────────────────────┐
│ Before: [dead][live][dead][live][dead][live]                │
│ After:  [live][live][live][free][free][free]                │
│         (compacted, dead tuples physically removed)         │
└─────────────────────────────────────────────────────────────┘</code></pre><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">vacuum_full</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> table_id<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">VacuumError</span><span class="token operator">></span> <span class="token 
punctuation">&#123;</span>
    <span class="token comment">// 1. Acquire an exclusive lock</span>
    <span class="token keyword">let</span> _lock <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">lock_table</span><span class="token punctuation">(</span>table_id<span class="token punctuation">,</span> <span class="token class-name">LockMode</span><span class="token punctuation">::</span><span class="token class-name">AccessExclusive</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token comment">// 2. Create a new table file</span>
    <span class="token keyword">let</span> new_table_id <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">create_temp_table</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token comment">// 3. Copy only the live tuples</span>
    <span class="token keyword">for</span> row <span class="token keyword">in</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">scan_live_rows</span><span class="token punctuation">(</span>table_id<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>
        <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">insert_into_table</span><span class="token punctuation">(</span>new_table_id<span class="token punctuation">,</span> row<span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token punctuation">&#125;</span>
    <span class="token comment">// 4. 
Swap the tables (atomic rename)</span>
    <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">swap_tables</span><span class="token punctuation">(</span>table_id<span class="token punctuation">,</span> new_table_id<span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token comment">// 5. The lock is released when _lock goes out of scope</span>
    <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>
<span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="6-事务-ID-回卷：40-亿行问题">6 Transaction ID Wraparound: The 4-Billion-Transaction Problem</h2><h3 id="数学计算">The Math</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token class-name">TransactionId</span> <span class="token operator">=</span> <span class="token keyword">u32</span>  <span class="token comment">// 0 to 4,294,967,295</span>
<span class="token class-name">At</span> <span class="token number">1000</span> transactions<span class="token operator">/</span>second<span class="token punctuation">:</span>
<span class="token number">4</span><span class="token punctuation">,</span><span class="token number">294</span><span class="token punctuation">,</span><span class="token number">967</span><span class="token punctuation">,</span><span class="token number">295</span> <span class="token operator">/</span> <span class="token number">1000</span> <span class="token operator">=</span> <span class="token number">4</span><span class="token punctuation">,</span><span class="token number">294</span><span class="token punctuation">,</span><span class="token number">967</span> seconds <span class="token operator">=</span> ~<span class="token number">50</span> days
After <span class="token number">50</span> days<span class="token punctuation">:</span> <span class="token macro property">OVERFLOW!</span> 😱</code></pre><hr /><h3 
id="回卷时发生什么">What Happens at Wraparound</h3><pre class="language-none"><code class="language-none">Around the wraparound point:
Transaction 4,294,967,294: INSERT INTO users VALUES (1, 100);
Transaction 4,294,967,295: INSERT INTO users VALUES (2, 200);
Transaction 0 (wrapped):   SELECT * FROM users;
                           → A naive comparison treats xid 4B as "newer" than 0!
                           → Wrong visibility! Corruption!</code></pre><p><strong>PostgreSQL's solution: circular transaction ID comparison (modulo 2^32)</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// Wraparound-aware transaction ID comparison</span>
<span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">transaction_id_precedes</span><span class="token punctuation">(</span>id1<span class="token punctuation">:</span> <span class="token class-name">TransactionId</span><span class="token punctuation">,</span> id2<span class="token punctuation">:</span> <span class="token class-name">TransactionId</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">bool</span> <span class="token punctuation">&#123;</span>
    <span class="token comment">// Subtract modulo 2^32, then read the difference as a signed 32-bit integer</span>
    <span class="token comment">// (wrapping_sub avoids the overflow panic a plain i32 subtraction can hit)</span>
    <span class="token punctuation">(</span>id1<span class="token punctuation">.</span><span class="token function">wrapping_sub</span><span class="token punctuation">(</span>id2<span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">i32</span><span class="token punctuation">)</span> <span class="token operator">&lt;</span> <span class="token number">0</span>
<span class="token punctuation">&#125;</span>
<span class="token comment">// Example:</span>
<span class="token comment">// 4,294,967,294 - 0 (mod 2^32) = 4,294,967,294</span>
<span class="token comment">// 4,294,967,294 as i32 = -2</span>
<span class="token comment">// -2 &lt; 0 → true (4B precedes 0) ✓</span></code></pre><hr /><h3 id="冻结旧事务">Freezing Old Transactions</h3><p><strong>Vacuum freeze:</strong> marks rows from very old transactions as "frozen"</p><pre class="language-rust" 
data-language="rust"><code class="language-rust"><span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">FROZEN_XID</span><span class="token punctuation">:</span> <span class="token class-name">TransactionId</span> <span class="token operator">=</span> <span class="token number">2</span><span class="token punctuation">;</span>  <span class="token comment">// Special value</span>
<span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">vacuum_freeze</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> table_id<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span> freeze_limit<span class="token punctuation">:</span> <span class="token class-name">TransactionId</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">VacuumError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>
    <span class="token keyword">for</span> page_id <span class="token keyword">in</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">get_table_pages</span><span class="token punctuation">(</span>table_id<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>
        <span class="token keyword">let</span> <span class="token keyword">mut</span> page <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">get_page</span><span 
class="token punctuation">(</span>page_id<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>
        <span class="token keyword">for</span> row_id <span class="token keyword">in</span> page<span class="token punctuation">.</span><span class="token function">rows</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>
            <span class="token keyword">let</span> row <span class="token operator">=</span> page<span class="token punctuation">.</span><span class="token function">get_row</span><span class="token punctuation">(</span>row_id<span class="token punctuation">)</span><span class="token punctuation">;</span>
            <span class="token comment">// Freeze xmin if it is old enough</span>
            <span class="token keyword">if</span> row<span class="token punctuation">.</span>xmin <span class="token operator">&lt;</span> freeze_limit <span class="token operator">&amp;&amp;</span> row<span class="token punctuation">.</span>xmin <span class="token operator">!=</span> <span class="token constant">FROZEN_XID</span> <span class="token punctuation">&#123;</span>
                page<span class="token punctuation">.</span><span class="token function">set_xmin</span><span class="token punctuation">(</span>row_id<span class="token punctuation">,</span> <span class="token constant">FROZEN_XID</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
            <span class="token punctuation">&#125;</span>
            <span class="token comment">// Freeze xmax if it is old enough</span>
            <span class="token keyword">if</span> row<span class="token punctuation">.</span>xmax <span class="token operator">!=</span> <span class="token number">0</span> <span class="token operator">&amp;&amp;</span> row<span class="token punctuation">.</span>xmax <span class="token operator">&lt;</span> freeze_limit <span class="token 
operator">&amp;&amp;</span> row<span class="token punctuation">.</span>xmax <span class="token operator">!=</span> <span class="token constant">FROZEN_XID</span> <span class="token punctuation">&#123;</span>
                page<span class="token punctuation">.</span><span class="token function">set_xmax</span><span class="token punctuation">(</span>row_id<span class="token punctuation">,</span> <span class="token constant">FROZEN_XID</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
            <span class="token punctuation">&#125;</span>
        <span class="token punctuation">&#125;</span>
    <span class="token punctuation">&#125;</span>
    <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>
<span class="token punctuation">&#125;</span></code></pre><p><strong>Frozen rows are always visible:</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token keyword">impl</span> <span class="token class-name">Snapshot</span> <span class="token punctuation">&#123;</span>
    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">is_visible</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> row_xmin<span class="token punctuation">:</span> <span class="token class-name">TransactionId</span><span class="token punctuation">,</span> row_xmax<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">TransactionId</span><span class="token operator">></span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">bool</span> <span class="token 
punctuation">&#123;</span>
        <span class="token comment">// Frozen rows are always visible</span>
        <span class="token keyword">if</span> row_xmin <span class="token operator">==</span> <span class="token constant">FROZEN_XID</span> <span class="token punctuation">&#123;</span>
            <span class="token keyword">return</span> row_xmax<span class="token punctuation">.</span><span class="token function">map_or</span><span class="token punctuation">(</span><span class="token boolean">true</span><span class="token punctuation">,</span> <span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>xmax<span class="token closure-punctuation punctuation">|</span></span> xmax <span class="token operator">==</span> <span class="token constant">FROZEN_XID</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
        <span class="token punctuation">&#125;</span>
        <span class="token comment">// ... normal visibility logic ...</span>
    <span class="token punctuation">&#125;</span>
<span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="Autovacuum：自动防止回卷">Autovacuum: Automatic Wraparound Prevention</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/transaction/autovacuum.rs</span>
<span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">AutovacuumLauncher</span> <span class="token punctuation">&#123;</span>
    transaction_manager<span class="token punctuation">:</span> <span class="token class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">TransactionManager</span><span class="token operator">></span><span class="token punctuation">,</span>
    vacuum_worker<span class="token punctuation">:</span> <span class="token class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">VacuumWorker</span><span class="token operator">></span><span 
class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">AutovacuumLauncher</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">run</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">loop</span> <span class="token punctuation">&#123;</span>            <span class="token comment">// 检查最旧的事务有多旧</span>            <span class="token keyword">let</span> oldest_xmin <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>transaction_manager<span class="token punctuation">.</span><span class="token function">get_oldest_xmin</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token keyword">let</span> current_xid <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>transaction_manager<span class="token punctuation">.</span><span class="token function">current_xid</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token comment">// 到回卷的距离</span>            <span class="token keyword">let</span> distance_to_wraparound <span class="token operator">=</span> <span class="token class-name">TransactionId</span><span class="token punctuation">::</span><span class="token constant">MAX</span> <span class="token operator">-</span> current_xid <span class="token operator">+</span> oldest_xmin<span class="token punctuation">;</span>            <span 
class="token comment">// 如果接近，触发 vacuum</span>            <span class="token keyword">if</span> distance_to_wraparound <span class="token operator">&lt;</span> <span class="token constant">WRAPAROUND_EMERGENCY_THRESHOLD</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span>vacuum_worker<span class="token punctuation">.</span><span class="token function">vacuum_freeze_all_tables</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>            <span class="token function">sleep</span><span class="token punctuation">(</span><span class="token class-name">Duration</span><span class="token punctuation">::</span><span class="token function">from_secs</span><span class="token punctuation">(</span><span class="token number">60</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><p><strong>PostgreSQL 的默认阈值：</strong></p><table><thead><tr><th>参数</th><th>默认值</th><th>意义</th></tr></thead><tbody><tr><td><code>autovacuum_vacuum_threshold</code></td><td>50</td><td>vacuum 前的最小死元组</td></tr><tr><td><code>autovacuum_vacuum_scale_factor</code></td><td>0.2</td><td>+表大小的 20%</td></tr><tr><td><code>autovacuum_freeze_max_age</code></td><td>200M</td><td>强制冻结前的最大事务数</td></tr></tbody></table><hr /><h2 id="7-隔离级别">7 隔离级别</h2><h3 id="ANSI-SQL-隔离级别">ANSI SQL 隔离级别</h3><table><thead><tr><th>隔离级别</th><th>脏读</th><th>不可重复读</th><th>幻读</th></tr></thead><tbody><tr><td>Read Uncommitted</td><td>可能</td><td>可能</td><td>可能</td></tr><tr><td>Read Committed</td><td>❌ 防止</td><td>可能</td><td>可能</td></tr><tr><td>Repeatable Read</td><td>❌ 防止</td><td>❌ 
防止</td><td>可能</td></tr><tr><td>Serializable</td><td>❌ 防止</td><td>❌ 防止</td><td>❌ 防止</td></tr></tbody></table><hr /><h3 id="PostgreSQL-的实现">PostgreSQL 的实现</h3><p><strong>PostgreSQL 对所有隔离级别使用 MVCC：</strong></p><table><thead><tr><th>隔离级别</th><th>实现</th></tr></thead><tbody><tr><td>Read Uncommitted</td><td>同 Read Committed</td></tr><tr><td>Read Committed</td><td>每个语句新快照</td></tr><tr><td>Repeatable Read</td><td>每个事务单一快照</td></tr><tr><td>Serializable</td><td>单一快照 + 谓词锁定</td></tr></tbody></table><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token attribute attr-name">#[derive(Debug, Clone, Copy)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">IsolationLevel</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">ReadCommitted</span><span class="token punctuation">,</span>    <span class="token class-name">RepeatableRead</span><span class="token punctuation">,</span>    <span class="token class-name">Serializable</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">Transaction</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">get_snapshot</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Snapshot</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">match</span> <span class="token keyword">self</span><span class="token punctuation">.</span>isolation_level <span class="token punctuation">&#123;</span>            <span class="token 
class-name">IsolationLevel</span><span class="token punctuation">::</span><span class="token class-name">ReadCommitted</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token comment">// 每个语句的新快照</span>                <span class="token keyword">self</span><span class="token punctuation">.</span>transaction_manager<span class="token punctuation">.</span><span class="token function">create_snapshot</span><span class="token punctuation">(</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">IsolationLevel</span><span class="token punctuation">::</span><span class="token class-name">RepeatableRead</span> <span class="token operator">|</span> <span class="token class-name">IsolationLevel</span><span class="token punctuation">::</span><span class="token class-name">Serializable</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token comment">// 为整个事务重用相同的快照</span>                <span class="token keyword">self</span><span class="token punctuation">.</span>cached_snapshot<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="可序列化隔离：谓词锁定">可序列化隔离：谓词锁定</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// 简化的谓词锁定</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">SerializableTransaction</span> <span class="token punctuation">&#123;</span>    txn<span class="token 
punctuation">:</span> <span class="token class-name">Transaction</span><span class="token punctuation">,</span>    read_predicates<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">Predicate</span><span class="token operator">></span><span class="token punctuation">,</span>  <span class="token comment">// 读取的范围/条件</span>    write_set<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">RowId</span><span class="token operator">></span><span class="token punctuation">,</span>            <span class="token comment">// 写入的行</span><span class="token punctuation">&#125;</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Predicate</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> page_id<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> key_range<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token class-name">BTreeKey</span><span class="token punctuation">,</span> <span class="token class-name">BTreeKey</span><span class="token punctuation">)</span><span class="token operator">></span><span class="token punctuation">,</span>  <span class="token comment">// None = 完整扫描</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">TransactionManager</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition 
function">check_serializable_conflict</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> txn<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">SerializableTransaction</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">SerializationError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// 检查是否有任何已提交的写入与我们的读取冲突</span>        <span class="token keyword">for</span> predicate <span class="token keyword">in</span> <span class="token operator">&amp;</span>txn<span class="token punctuation">.</span>read_predicates <span class="token punctuation">&#123;</span>            <span class="token keyword">for</span> write <span class="token keyword">in</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">recent_writes</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">if</span> predicate<span class="token punctuation">.</span><span class="token function">matches</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>write<span class="token punctuation">)</span> <span class="token operator">&amp;&amp;</span> write<span class="token punctuation">.</span><span class="token function">committed_after</span><span class="token punctuation">(</span>txn<span class="token punctuation">.</span>txn<span class="token punctuation">.</span>snapshot<span class="token punctuation">.</span>xmax<span 
class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                    <span class="token keyword">return</span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">SerializationError</span><span class="token punctuation">::</span><span class="token class-name">ReadWriteConflict</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token punctuation">&#125;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><p><strong>冲突时：</strong> 中止一个事务，带 <code>serialization_failure</code> 错误。</p><hr /><h2 id="8-用-Rust-构建的挑战">8 用 Rust 构建的挑战</h2><h3 id="挑战-1：快照生命周期">挑战 1：快照生命周期</h3><p><strong>问题：</strong> 快照需要比创建它的事务更长寿。</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ Doesn't work</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">begin_transaction</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Transaction</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">let</span> snapshot <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">create_snapshot</span><span class="token punctuation">(</span><span 
class="token punctuation">)</span><span class="token punctuation">;</span>  <span class="token comment">// 从 self 借用</span>    <span class="token class-name">Transaction</span> <span class="token punctuation">&#123;</span> snapshot<span class="token punctuation">,</span> <span class="token punctuation">...</span> <span class="token punctuation">&#125;</span>  <span class="token comment">// 快照的生命周期不够长</span><span class="token punctuation">&#125;</span></code></pre><p><strong>解决方案：owned 快照</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ✅ Works</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">begin_transaction</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Transaction</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">let</span> snapshot <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">create_snapshot</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>  <span class="token comment">// 返回 owned Snapshot</span>    <span class="token class-name">Transaction</span> <span class="token punctuation">&#123;</span>        snapshot<span class="token punctuation">:</span> <span class="token class-name">Arc</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>snapshot<span class="token punctuation">)</span><span class="token punctuation">,</span>  <span class="token comment">// 可在线程间共享</span>        <span class="token punctuation">...</span>    <span class="token 
punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="挑战-2：原子事务状态">挑战 2：原子事务状态</h3><p><strong>问题：</strong> 多个线程需要看到一致的事务状态。</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ 竞争条件</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">commit</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> txn<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">Transaction</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>    txn<span class="token punctuation">.</span>state <span class="token operator">=</span> <span class="token class-name">TransactionState</span><span class="token punctuation">::</span><span class="token class-name">Committed</span><span class="token punctuation">;</span>  <span class="token comment">// 非原子!</span>    <span class="token comment">// 其他线程可能看到部分状态</span><span class="token punctuation">&#125;</span></code></pre><p><strong>解决方案：带适当顺序的原子状态</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ✅ Works</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Transaction</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> xid<span class="token punctuation">:</span> <span class="token class-name">TransactionId</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> state<span class="token punctuation">:</span> <span class="token class-name">AtomicU8</span><span class="token punctuation">,</span>  <span 
class="token comment">// 为状态使用原子</span>    <span class="token keyword">pub</span> snapshot<span class="token punctuation">:</span> <span class="token class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">Snapshot</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">Transaction</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">commit</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">WalError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// 1. 先写 WAL（持久）；WalError 为假定的 WAL 错误类型</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>wal<span class="token punctuation">.</span><span class="token function">log_commit</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>xid<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token comment">// 2. 
然后标记为已提交（可见）</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>state<span class="token punctuation">.</span><span class="token function">store</span><span class="token punctuation">(</span><span class="token class-name">TransactionState</span><span class="token punctuation">::</span><span class="token class-name">Committed</span> <span class="token keyword">as</span> <span class="token keyword">u8</span><span class="token punctuation">,</span> <span class="token class-name">Ordering</span><span class="token punctuation">::</span><span class="token class-name">Release</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token comment">// 3. 通知等待的事务</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>transaction_manager<span class="token punctuation">.</span><span class="token function">notify_committed</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>xid<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="挑战-3：不阻塞的-VACUUM">挑战 3：不阻塞的 VACUUM</h3><p><strong>问题：</strong> 如何在事务读取时进行 VACUUM？</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ 阻塞读者</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">vacuum</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">let</span> _lock <span class="token operator">=</span> <span class="token keyword">self</span><span class="token 
punctuation">.</span>table_lock<span class="token punctuation">.</span><span class="token function">write</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>  <span class="token comment">// 排他锁</span>    <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">remove_dead_tuples</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span><span class="token punctuation">&#125;</span></code></pre><p><strong>解决方案：两阶段 VACUUM</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ✅ 不阻塞</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">vacuum</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>    <span class="token comment">// 阶段 1：将元组标记为可修剪（无需锁）</span>    <span class="token keyword">let</span> global_xmin <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">get_global_xmin</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">mark_pruneable</span><span class="token punctuation">(</span>global_xmin<span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// 阶段 2：回收空间（使用页级锁，非表锁）</span>    <span class="token keyword">for</span> page <span class="token keyword">in</span> <span class="token keyword">self</span><span class="token punctuation">.</span>pages<span 
class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> _page_lock <span class="token operator">=</span> page<span class="token punctuation">.</span>lock<span class="token punctuation">.</span><span class="token function">write</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">reclaim_space_on_page</span><span class="token punctuation">(</span>page<span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="9-AI-如何加速这项工作">9 AI 如何加速这项工作</h2><h3 id="AI-做对了什么">AI 做对了什么</h3><table><thead><tr><th>任务</th><th>AI 贡献</th></tr></thead><tbody><tr><td><strong>可见性规则</strong></td><td>生成正确的 xmin/xmax 逻辑</td></tr><tr><td><strong>回卷处理</strong></td><td>解释二补数技巧</td></tr><tr><td><strong>快照结构</strong></td><td>建议 xmin/xmax/active 模式</td></tr><tr><td><strong>VACUUM 设计</strong></td><td>概述两阶段方法</td></tr></tbody></table><hr /><h3 id="AI-做错了什么">AI 做错了什么</h3><table><thead><tr><th>问题</th><th>发生了什么</th></tr></thead><tbody><tr><td><strong>初始可见性</strong></td><td>初稿没有处理自己未提交的写入</td></tr><tr><td><strong>冻结逻辑</strong></td><td>忽略了冻结行需要在可见性中特殊处理</td></tr><tr><td><strong>可序列化隔离</strong></td><td>建议没有谓词锁定的完全可序列化（错误！）</td></tr></tbody></table><p><strong>模式：</strong> MVCC 很微妙。AI 掌握了 80% 的情况。边界情况需要深入理解。</p><hr /><h3 id="示例：调试可见性错误">示例：调试可见性错误</h3><p><strong>我问 AI 的问题：</strong></p><blockquote><p>“事务 A 插入一行，然后查询它。但查询看不到该行。为什么？”</p></blockquote><p><strong>我学到的：</strong></p><ol><li>事务必须看到<strong>自己的</strong>未提交写入</li><li>需要在快照中跟踪 <code>current_transaction_id</code></li><li>可见性检查需要为 <code>row_xmin 
== my_xid</code> 特殊处理</li></ol><p><strong>结果：</strong> 修复 <code>is_visible()</code>：</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">is_visible</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> row_xmin<span class="token punctuation">:</span> <span class="token class-name">TransactionId</span><span class="token punctuation">,</span> row_xmax<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">TransactionId</span><span class="token operator">></span><span class="token punctuation">,</span> my_xid<span class="token punctuation">:</span> <span class="token class-name">TransactionId</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">bool</span> <span class="token punctuation">&#123;</span>    <span class="token comment">// 特殊情况：看到你自己的写入</span>    <span class="token keyword">if</span> row_xmin <span class="token operator">==</span> my_xid <span class="token punctuation">&#123;</span>        <span class="token keyword">return</span> row_xmax<span class="token punctuation">.</span><span class="token function">map_or</span><span class="token punctuation">(</span><span class="token boolean">true</span><span class="token punctuation">,</span> <span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>xmax<span class="token closure-punctuation punctuation">|</span></span> xmax <span class="token operator">!=</span> my_xid<span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">// ... 
可见性逻辑的其余部分 ...</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="总结：MVCC-一张图">总结：MVCC 一张图</h2><pre class="language-MERMAID_BASE64_599" data-language="MERMAID_BASE64_599"><code class="language-MERMAID_BASE64_599">Zmxvd2NoYXJ0IEJUCiAgICBzdWJncmFwaCAi5LqL5Yqh55Sf5ZG95ZGo5pyfIgogICAgICAgIEFbQkVHSU5dIC0tPiBCW+iOt+WPluW&#x2F;q+eFp10KICAgICAgICBCIC0tPiBDW+S9v+eUqCBNVkNDIOivuy&#x2F;lhpldCiAgICAgICAgQyAtLT4gRHtDT01NSVQgb3IgUk9MTEJBQ0s&#x2F;fQogICAgICAgIEQgLS0+fENPTU1JVHwgRVvlhpnlhaUgV0FMXQogICAgICAgIEQgLS0+fFJPTExCQUNLfCBGW+S4ouW8g+WPmOabtF0KICAgICAgICBFIC0tPiBHW+agh+iusOS4uuW3suaPkOS6pF0KICAgIGVuZAoKICAgIHN1YmdyYXBoICJNVkNDIOihjOeKtuaAgSIKICAgICAgICBIW+a0uzogeG1pbiDlt7Lmj5DkuqQsIHhtYXg9MF0KICAgICAgICBJW+atuzogeG1pbiAmIHhtYXgg5bey5o+Q5LqkXQogICAgICAgIEpb5Ya757uTOiB4bWluPUZST1pFTl9YSURdCiAgICBlbmQKCiAgICBzdWJncmFwaCAiVkFDVVVNIgogICAgICAgIEtbUmVndWxhciBWQUNVVU1dIC0tPiBMW+agh+iusOepuumXtOWPr+mHjeeUqF0KICAgICAgICBNW1ZBQ1VVTSBGVUxMXSAtLT4gTlvljovnvKnooahdCiAgICAgICAgT1tBdXRvVmFjdXVtXSAtLT4gUFvpmLLmraLlm57ljbddCiAgICBlbmQKCiAgICBzdWJncmFwaCAi6ZqU56a757qn5YirIgogICAgICAgIFFbUmVhZCBDb21taXR0ZWRdIC0tPiBSW+aWsOW&#x2F;q+eFp+avj+S4quivreWPpV0KICAgICAgICBTW1JlcGVhdGFibGUgUmVhZF0gLS0+IFRb5Y2V5LiA5b+r54WnXQogICAgICAgIFVbU2VyaWFsaXphYmxlXSAtLT4gVlvosJPor43plIHlrppdCiAgICBlbmQKCiAgICBzdHlsZSBCIGZpbGw6I2UzZjJmZCxzdHJva2U6IzE5NzZkMgogICAgc3R5bGUgSCBmaWxsOiNlOGY1ZTksc3Ryb2tlOiMzODhlM2MKICAgIHN0eWxlIEkgZmlsbDojZmZlYmVlLHN0cm9rZTojYzYyODI4CiAgICBzdHlsZSBPIGZpbGw6I2ZmZjNlMCxzdHJva2U6I2Y1N2MwMA&#x3D;&#x3D;</code></pre><p><strong>关键要点：</strong></p><table><thead><tr><th>概念</th><th>为什么重要</th></tr></thead><tbody><tr><td><strong>MVCC</strong></td><td>读者不阻塞写入者，写入者不阻塞读者</td></tr><tr><td><strong>快照</strong></td><td>在时间点一致的数据视图</td></tr><tr><td><strong>事务 ID</strong></td><td>跟踪版本创建/删除</td></tr><tr><td><strong>VACUUM</strong></td><td>从死版本回收空间</td></tr><tr><td><strong>回卷</strong></td><td>40 
亿事务限制需要冻结</td></tr><tr><td><strong>隔离级别</strong></td><td>一致性与并发性之间的权衡</td></tr></tbody></table><hr /><p><strong>进一步阅读：</strong></p><ul><li>PostgreSQL Source: <a href="https://github.com/postgres/postgres/blob/master/src/backend/access/heap/heapam_visibility.c"><code>src/backend/access/heap/heapam_visibility.c</code></a></li><li>PostgreSQL Source: <a href="https://github.com/postgres/postgres/tree/master/src/backend/access/transam"><code>src/backend/access/transam/</code></a></li><li>“A Critique of ANSI SQL Isolation Levels” by Berenson et al. (1995)</li><li>“Database Management Systems” by Ramakrishnan (Ch. 16: Concurrency Control)</li><li>“The PostgreSQL Book” by Worsley &amp; Morin (Ch. 13: MVCC)</li><li>Vaultgres Repository: <a href="https://github.com/neoalienson/Vaultgres">github.com/neoalienson/Vaultgres</a></li></ul><!-- commentbox plugin begins -->    <div class="commentbox"></div>    <script src="https://unpkg.com/commentbox.io/dist/commentBox.min.js"></script>    <script>commentBox('5765834504929280-proj')</script>    <!-- commentbox plugin ends -->    ]]></content>
    
    
    <summary type="html">Vaultgres 旅程第三部分：通过 MVCC 实现非阻塞读取和快照隔离。深入探讨事务 ID、可见性规则、VACUUM，以及事务 ID 回卷的噩梦。</summary>
    
    
    
    <category term="Development" scheme="https://neo01.com/categories/Development/"/>
    
    
    <category term="PostgreSQL" scheme="https://neo01.com/tags/PostgreSQL/"/>
    
    <category term="Database Internals" scheme="https://neo01.com/tags/Database-Internals/"/>
    
    <category term="Rust" scheme="https://neo01.com/tags/Rust/"/>
    
  </entry>
  
  <entry>
    <title>Database in Rust: MVCC and Transaction Management</title>
    <link href="https://neo01.com/2026/03/Database-Rust-MVCC-Transaction-Manager/"/>
    <id>https://neo01.com/2026/03/Database-Rust-MVCC-Transaction-Manager/</id>
    <published>2026-03-02T16:00:00.000Z</published>
    <updated>2026-03-14T03:03:33.716Z</updated>
    
    <content type="html"><![CDATA[<p>In <a href="/2026/03/Database-Rust-BPlusTree-Index-Concurrent-Access/">Part 2</a>, we built a concurrent B+Tree index. But there’s a fundamental problem with our approach.</p><p><strong>Readers block writers. Writers block readers.</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// Current implementation</span>
<span class="token keyword">let</span> lock <span class="token operator">=</span> page<span class="token punctuation">.</span><span class="token function">read</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>  <span class="token comment">// Reader acquires lock</span>
<span class="token comment">// Writer waits... and waits... and waits...</span>
<span class="token comment">// Reader still holding lock (maybe computing something expensive)</span>
<span class="token comment">// Writer: 😭</span></code></pre><p>This is unacceptable for a real database. PostgreSQL handles <strong>thousands of concurrent transactions</strong> with readers never blocking writers. 
How?</p><p><strong>MVCC: Multi-Version Concurrency Control.</strong></p><p>Today: implementing MVCC in Rust with snapshot isolation, transaction management, and confronting the nightmare of transaction ID wraparound.</p><hr /><h2 id="1-The-MVCC-Insight">1 The MVCC Insight</h2><h3 id="The-Problem-Locking-Is-Too-Restrictive">The Problem: Locking Is Too Restrictive</h3><p><strong>Traditional locking (2PL):</strong></p><pre class="language-none"><code class="language-none">Transaction A: SELECT * FROM users WHERE id &#x3D; 1  -- Reads row XTransaction B: UPDATE users SET balance &#x3D; 100 WHERE id &#x3D; 1  -- Blocked!Transaction A: (still reading, maybe for 10 seconds)Transaction B: 😡 Still blocked!</code></pre><p><strong>MVCC approach:</strong></p><pre class="language-none"><code class="language-none">Transaction A: SELECT * FROM users WHERE id &#x3D; 1               -- Sees version 1 of row X (old but consistent)Transaction B: UPDATE users SET balance &#x3D; 100 WHERE id &#x3D; 1               -- Creates version 2 of row X               -- Not blocked!Both transactions proceed without blocking each other.</code></pre><hr /><h3 id="How-MVCC-Works">How MVCC Works</h3><p><strong>Every row has multiple versions:</strong></p><pre class="language-none"><code class="language-none">Row: users.id &#x3D; 1Version 1: &#123;id: 1, balance: 50,  xmin: 100, xmax: NULL&#125;           ↑                    ↑       ↑           Data                Created  Still visible                             by txn 100  (not deleted)Version 2: &#123;id: 1, balance: 100, xmin: 200, xmax: NULL&#125;           ↑                     ↑           New data          Created by txn 200Version 3: &#123;id: 1, balance: 150, xmin: 300, xmax: 400&#125;           ↑                     ↑        ↑           Old data          Created   Deleted by                             by txn 300  txn 400</code></pre><p><strong>Transaction 
metadata:</strong></p><table><thead><tr><th>Field</th><th>Meaning</th></tr></thead><tbody><tr><td><code>xmin</code></td><td>Transaction ID that created this version</td></tr><tr><td><code>xmax</code></td><td>Transaction ID that deleted this version (NULL = still alive)</td></tr><tr><td><code>cmin</code></td><td>Command ID within transaction (for statement-level visibility)</td></tr><tr><td><code>cmax</code></td><td>Command ID that deleted this version</td></tr></tbody></table><hr /><h2 id="2-Transaction-IDs-and-Snapshots">2 Transaction IDs and Snapshots</h2><h3 id="Transaction-ID-Allocation">Transaction ID Allocation</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/transaction/txn_id.rs</span><span class="token keyword">pub</span> <span class="token keyword">type</span> <span class="token type-definition class-name">TransactionId</span> <span class="token operator">=</span> <span class="token keyword">u32</span><span class="token punctuation">;</span><span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">INVALID_XID</span><span class="token punctuation">:</span> <span class="token class-name">TransactionId</span> <span class="token operator">=</span> <span class="token number">0</span><span class="token punctuation">;</span><span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">FIRST_NORMAL_XID</span><span class="token punctuation">:</span> <span class="token class-name">TransactionId</span> <span class="token operator">=</span> <span class="token number">3</span><span class="token punctuation">;</span>  <span class="token comment">// 0, 1, 2 are bootstrap</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">TransactionIdGenerator</span> <span class="token punctuation">&#123;</span>    next_xid<span class="token 
punctuation">:</span> <span class="token class-name">AtomicU32</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">TransactionIdGenerator</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">new</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">Self</span> <span class="token punctuation">&#123;</span>            next_xid<span class="token punctuation">:</span> <span class="token class-name">AtomicU32</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span><span class="token constant">FIRST_NORMAL_XID</span><span class="token punctuation">)</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">next</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">TransactionId</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>next_xid<span class="token punctuation">.</span><span class="token function">fetch_add</span><span class="token punctuation">(</span><span class="token number">1</span><span class="token punctuation">,</span> <span class="token class-name">Ordering</span><span 
class="token punctuation">::</span><span class="token class-name">SeqCst</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">current</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">TransactionId</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>next_xid<span class="token punctuation">.</span><span class="token function">load</span><span class="token punctuation">(</span><span class="token class-name">Ordering</span><span class="token punctuation">::</span><span class="token class-name">SeqCst</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Problem:</strong> <code>u32</code> wraps around after 2^32 (about 4.29 billion) transaction IDs. What happens then?</p><p><strong>Answer:</strong> Catastrophe. 
We’ll handle this later.</p><hr /><h3 id="Snapshots-The-Heart-of-MVCC">Snapshots: The Heart of MVCC</h3><p>A <strong>snapshot</strong> captures which transactions are visible:</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/transaction/snapshot.rs</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Snapshot</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> xmin<span class="token punctuation">:</span> <span class="token class-name">TransactionId</span><span class="token punctuation">,</span>      <span class="token comment">// Oldest active transaction</span>    <span class="token keyword">pub</span> xmax<span class="token punctuation">:</span> <span class="token class-name">TransactionId</span><span class="token punctuation">,</span>      <span class="token comment">// Next transaction ID (nothing >= xmax is visible)</span>    <span class="token keyword">pub</span> active_transactions<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">TransactionId</span><span class="token operator">></span><span class="token punctuation">,</span>  <span class="token comment">// Transactions in progress</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">Snapshot</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">is_visible</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> row_xmin<span class="token punctuation">:</span> <span class="token class-name">TransactionId</span><span 
class="token punctuation">,</span> row_xmax<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">TransactionId</span><span class="token operator">></span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">bool</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Row created by transaction &lt; xmin? Always visible</span>        <span class="token keyword">if</span> row_xmin <span class="token operator">&lt;</span> <span class="token keyword">self</span><span class="token punctuation">.</span>xmin <span class="token punctuation">&#123;</span>            <span class="token comment">// Unless deleted by a transaction this snapshot sees as committed</span>            <span class="token keyword">return</span> row_xmax<span class="token punctuation">.</span><span class="token function">map_or</span><span class="token punctuation">(</span><span class="token boolean">true</span><span class="token punctuation">,</span> <span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>xmax<span class="token closure-punctuation punctuation">|</span></span> xmax <span class="token operator">>=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>xmax <span class="token operator">||</span> <span class="token keyword">self</span><span class="token punctuation">.</span>active_transactions<span class="token punctuation">.</span><span class="token function">contains</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>xmax<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token comment">// Row created by transaction >= xmax? 
Never visible</span>        <span class="token keyword">if</span> row_xmin <span class="token operator">>=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>xmax <span class="token punctuation">&#123;</span>            <span class="token keyword">return</span> <span class="token boolean">false</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token comment">// Row created by active transaction? Not visible (unless it's ours)</span>        <span class="token keyword">if</span> <span class="token keyword">self</span><span class="token punctuation">.</span>active_transactions<span class="token punctuation">.</span><span class="token function">contains</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>row_xmin<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">return</span> <span class="token boolean">false</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token comment">// Row deleted by active transaction? 
Still visible</span>        <span class="token keyword">if</span> <span class="token keyword">let</span> <span class="token class-name">Some</span><span class="token punctuation">(</span>xmax<span class="token punctuation">)</span> <span class="token operator">=</span> row_xmax <span class="token punctuation">&#123;</span>            <span class="token keyword">if</span> xmax <span class="token operator">&lt;</span> <span class="token keyword">self</span><span class="token punctuation">.</span>xmax <span class="token operator">&amp;&amp;</span> <span class="token operator">!</span><span class="token keyword">self</span><span class="token punctuation">.</span>active_transactions<span class="token punctuation">.</span><span class="token function">contains</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>xmax<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">return</span> <span class="token boolean">false</span><span class="token punctuation">;</span>  <span class="token comment">// Deleted by committed transaction</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>        <span class="token boolean">true</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Visual example:</strong></p><pre class="language-none"><code class="language-none">Time:     100    150    200    250    300    350          │      │      │      │      │      │Txn 100:  [&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D; committed &#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;]Txn 150:         [&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D; active &#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;&#x3D;]Txn 200:                [&#x3D;&#x3D; committed &#x3D;&#x3D;]                          ↑                    Snapshot taken here                    xmin&#x3D;150, 
xmax&#x3D;250, active&#x3D;[150, 200]Visibility rules:- Row created by txn 100: VISIBLE (committed before snapshot)- Row created by txn 150: NOT VISIBLE (still active)- Row created by txn 200: NOT VISIBLE (committed after the snapshot was taken)- Row created by txn 250: NOT VISIBLE (&gt;&#x3D; xmax)</code></pre><hr /><h2 id="3-Row-Layout-with-MVCC">3 Row Layout with MVCC</h2><h3 id="Extended-Page-Header">Extended Page Header</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/storage/mvcc_page.rs</span><span class="token keyword">use</span> <span class="token keyword">crate</span><span class="token module-declaration namespace"><span class="token punctuation">::</span>transaction<span class="token punctuation">::</span></span><span class="token punctuation">&#123;</span><span class="token class-name">TransactionId</span><span class="token punctuation">,</span> <span class="token class-name">CommandId</span><span class="token punctuation">&#125;</span><span class="token punctuation">;</span><span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">MVCC_ROW_HEADER_SIZE</span><span class="token punctuation">:</span> <span class="token keyword">usize</span> <span class="token operator">=</span> <span class="token number">16</span><span class="token punctuation">;</span><span class="token attribute attr-name">#[repr(C)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">MvccRowHeader</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> xmin<span class="token punctuation">:</span> <span class="token class-name">TransactionId</span><span class="token punctuation">,</span>      <span class="token comment">// 4 bytes</span>    <span class="token keyword">pub</span> xmax<span class="token punctuation">:</span> <span class="token 
class-name">TransactionId</span><span class="token punctuation">,</span>      <span class="token comment">// 4 bytes (0 = not deleted)</span>    <span class="token keyword">pub</span> cmin<span class="token punctuation">:</span> <span class="token class-name">CommandId</span><span class="token punctuation">,</span>          <span class="token comment">// 4 bytes</span>    <span class="token keyword">pub</span> cmax<span class="token punctuation">:</span> <span class="token class-name">CommandId</span><span class="token punctuation">,</span>          <span class="token comment">// 4 bytes</span><span class="token punctuation">&#125;</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">MvccRow</span> <span class="token punctuation">&#123;</span>    header<span class="token punctuation">:</span> <span class="token class-name">MvccRowHeader</span><span class="token punctuation">,</span>    data<span class="token punctuation">:</span> <span class="token punctuation">[</span><span class="token keyword">u8</span><span class="token punctuation">]</span><span class="token punctuation">,</span>  <span class="token comment">// Variable length</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Page layout with MVCC:</strong></p><pre class="language-none"><code class="language-none">┌─────────────────────────────────────────────────────────────┐│ PageHeader (24 bytes)                                       │├─────────────────────────────────────────────────────────────┤│ ItemId array (4 bytes each)                                 │├─────────────────────────────────────────────────────────────┤│ Free space                                                  │├─────────────────────────────────────────────────────────────┤│ Row 0: &#123;xmin, xmax, cmin, cmax&#125; + data                      ││ Row 1: &#123;xmin, xmax, cmin, cmax&#125; + data                      ││ Row 2: 
&#123;xmin, xmax, cmin, cmax&#125; + data                      │└─────────────────────────────────────────────────────────────┘</code></pre><hr /><h3 id="Insert-Creating-a-New-Version">Insert: Creating a New Version</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/transaction/mvcc_operations.rs</span><span class="token keyword">impl</span> <span class="token class-name">MvccTable</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">insert</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> txn<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Transaction</span><span class="token punctuation">,</span> data<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token keyword">u8</span><span class="token punctuation">]</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">RowId</span><span class="token punctuation">,</span> <span class="token class-name">TableError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Get a page with space</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> page <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">get_page_for_insert</span><span class="token punctuation">(</span><span class="token 
punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token comment">// Create row with transaction metadata</span>        <span class="token keyword">let</span> header <span class="token operator">=</span> <span class="token class-name">MvccRowHeader</span> <span class="token punctuation">&#123;</span>            xmin<span class="token punctuation">:</span> txn<span class="token punctuation">.</span>xid<span class="token punctuation">,</span>            xmax<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>  <span class="token comment">// Not deleted</span>            cmin<span class="token punctuation">:</span> txn<span class="token punctuation">.</span><span class="token function">current_command_id</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span>            cmax<span class="token punctuation">:</span> <span class="token number">0</span><span class="token punctuation">,</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">;</span>        <span class="token comment">// Insert into page</span>        <span class="token keyword">let</span> row_id <span class="token operator">=</span> page<span class="token punctuation">.</span><span class="token function">insert_mvcc_row</span><span class="token punctuation">(</span>header<span class="token punctuation">,</span> data<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token comment">// Mark page as dirty (needs WAL)</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>buffer_pool<span class="token punctuation">.</span><span class="token function">mark_dirty</span><span class="token punctuation">(</span>page<span class="token 
punctuation">.</span><span class="token function">id</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token comment">// Return the new row's location so callers such as update() can log it</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span>row_id<span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><p><strong>WAL record for insert:</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token attribute attr-name">#[derive(Debug, Clone)]</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">WalRecordInsert</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> page_id<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> offset<span class="token punctuation">:</span> <span class="token keyword">u16</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> xmin<span class="token punctuation">:</span> <span class="token class-name">TransactionId</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> data<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="Update-Creating-a-New-Version-and-Marking-Old">Update: Creating a New Version and Marking Old</h3><p><strong>Update is actually delete + insert:</strong></p><pre class="language-MERMAID_BASE64_600" 
data-language="MERMAID_BASE64_600"><code class="language-MERMAID_BASE64_600">Zmxvd2NoYXJ0IFRECiAgICBBW1VQREFURSB1c2VycyBTRVQgYmFsYW5jZSA9IDEwMCBXSEVSRSBpZCA9IDFdIC0tPiBCW0ZpbmQgb2xkIHZlcnNpb25dCiAgICBCIC0tPiBDW01hcmsgb2xkIHZlcnNpb24gYXMgZGVsZXRlZF0KICAgIEMgLS0+IERbU2V0IHhtYXggPSBjdXJyZW50IHR4bl0KICAgIEQgLS0+IEVbU2V0IGNtYXggPSBjdXJyZW50IGNvbW1hbmRdCiAgICBFIC0tPiBGW0luc2VydCBuZXcgdmVyc2lvbl0KICAgIEYgLS0+IEdbTmV3IHZlcnNpb24gaGFzIG5ldyB4bWluXQogICAgRyAtLT4gSFtDb21taXQ6IGJvdGggdmVyc2lvbnMgdmlzaWJsZSB1bnRpbCBWQUNVVU1d</code></pre><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token keyword">impl</span> <span class="token class-name">MvccTable</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">update</span><span class="token operator">&lt;</span><span class="token class-name">F</span><span class="token operator">></span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> txn<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Transaction</span><span class="token punctuation">,</span> row_id<span class="token punctuation">:</span> <span class="token class-name">RowId</span><span class="token punctuation">,</span> modify<span class="token punctuation">:</span> <span class="token class-name">F</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">TableError</span><span class="token operator">></span>    <span class="token keyword">where</span>        <span class="token 
class-name">F</span><span class="token punctuation">:</span> <span class="token class-name">FnOnce</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token punctuation">[</span><span class="token keyword">u8</span><span class="token punctuation">]</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token keyword">u8</span><span class="token operator">></span><span class="token punctuation">,</span>    <span class="token punctuation">&#123;</span>        <span class="token comment">// 1. Find old version</span>        <span class="token keyword">let</span> old_row <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">get_row</span><span class="token punctuation">(</span>row_id<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token comment">// 2. 
Check visibility (can only update visible rows)</span>        <span class="token keyword">if</span> <span class="token operator">!</span>txn<span class="token punctuation">.</span>snapshot<span class="token punctuation">.</span><span class="token function">is_visible</span><span class="token punctuation">(</span>old_row<span class="token punctuation">.</span>xmin<span class="token punctuation">,</span> old_row<span class="token punctuation">.</span>xmax<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">return</span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token class-name">TableError</span><span class="token punctuation">::</span><span class="token class-name">RowNotVisible</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token comment">// 3. Mark old version as deleted</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> old_page <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">get_page</span><span class="token punctuation">(</span>row_id<span class="token punctuation">.</span>page_id<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        old_page<span class="token punctuation">.</span><span class="token function">set_xmax</span><span class="token punctuation">(</span>row_id<span class="token punctuation">.</span>offset<span class="token punctuation">,</span> txn<span class="token punctuation">.</span>xid<span class="token punctuation">)</span><span class="token punctuation">;</span>        old_page<span class="token punctuation">.</span><span class="token function">set_cmax</span><span class="token 
punctuation">(</span>row_id<span class="token punctuation">.</span>offset<span class="token punctuation">,</span> txn<span class="token punctuation">.</span><span class="token function">current_command_id</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token comment">// 4. Create new version with updated data</span>        <span class="token keyword">let</span> new_data <span class="token operator">=</span> <span class="token function">modify</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>old_row<span class="token punctuation">.</span>data<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">let</span> new_row_id <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">insert</span><span class="token punctuation">(</span>txn<span class="token punctuation">,</span> <span class="token operator">&amp;</span>new_data<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token comment">// 5. 
Log to WAL</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>wal<span class="token punctuation">.</span><span class="token function">log_update</span><span class="token punctuation">(</span>row_id<span class="token punctuation">,</span> new_row_id<span class="token punctuation">,</span> txn<span class="token punctuation">.</span>xid<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="Select-Visibility-Check">Select: Visibility Check</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token keyword">impl</span> <span class="token class-name">MvccTable</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">scan</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> txn<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token class-name">Transaction</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">impl</span> <span class="token class-name">Iterator</span><span class="token operator">&lt;</span><span class="token class-name">Item</span> <span class="token operator">=</span> <span class="token class-name">Row</span><span class="token operator">></span> <span class="token operator">+</span> <span class="token lifetime-annotation symbol">'_</span> <span 
class="token punctuation">&#123;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>pages<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">flat_map</span><span class="token punctuation">(</span><span class="token keyword">move</span> <span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>page<span class="token closure-punctuation punctuation">|</span></span> <span class="token punctuation">&#123;</span>            page<span class="token punctuation">.</span><span class="token function">rows</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">filter_map</span><span class="token punctuation">(</span><span class="token keyword">move</span> <span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>row<span class="token closure-punctuation punctuation">|</span></span> <span class="token punctuation">&#123;</span>                <span class="token keyword">if</span> txn<span class="token punctuation">.</span>snapshot<span class="token punctuation">.</span><span class="token function">is_visible</span><span class="token punctuation">(</span>row<span class="token punctuation">.</span>xmin<span class="token punctuation">,</span> row<span class="token punctuation">.</span>xmax<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                    <span class="token class-name">Some</span><span class="token punctuation">(</span><span class="token class-name">Row</span> <span class="token punctuation">&#123;</span> data<span class="token punctuation">:</span> row<span class="token punctuation">.</span>data <span class="token 
punctuation">&#125;</span><span class="token punctuation">)</span>                <span class="token punctuation">&#125;</span> <span class="token keyword">else</span> <span class="token punctuation">&#123;</span>                    <span class="token class-name">None</span>  <span class="token comment">// Invisible version</span>                <span class="token punctuation">&#125;</span>            <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>        <span class="token punctuation">&#125;</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Example scenario:</strong></p><pre class="language-none"><code class="language-none">Transaction A (xid&#x3D;100):          Transaction B (xid&#x3D;200):1. BEGIN;2. INSERT INTO users VALUES (1, 50);3. COMMIT;                                   4. BEGIN; (snapshot: xmin&#x3D;201, xmax&#x3D;201)                                   5. SELECT * FROM users;                                      → Sees version from txn 100 ✓6. BEGIN;7. UPDATE users SET balance &#x3D; 100 WHERE id &#x3D; 1;   (creates version 2, marks version 1 as deleted)8. (not committed yet)                                   9. SELECT * FROM users;                                      → Still sees version 1! (the updating transaction has not committed)10. COMMIT;                                   11. 
SELECT * FROM users;                                       → Now sees version 2 ✓ (Read Committed: each statement takes a fresh snapshot)</code></pre><hr /><h2 id="4-Transaction-States-and-Visibility">4 Transaction States and Visibility</h2><h3 id="Transaction-Lifecycle">Transaction Lifecycle</h3><pre class="language-MERMAID_BASE64_601" data-language="MERMAID_BASE64_601"><code class="language-MERMAID_BASE64_601">c3RhdGVEaWFncmFtLXYyCiAgICBbKl0gLS0+IEluUHJvZ3Jlc3M6IEJFR0lOCiAgICBJblByb2dyZXNzIC0tPiBDb21taXR0ZWQ6IENPTU1JVAogICAgSW5Qcm9ncmVzcyAtLT4gQWJvcnRlZDogUk9MTEJBQ0sKICAgIENvbW1pdHRlZCAtLT4gWypdCiAgICBBYm9ydGVkIC0tPiBbKl0KCiAgICBub3RlIHJpZ2h0IG9mIEluUHJvZ3Jlc3MKICAgICAgICB4bWluL3htYXggc2V0IHRvCiAgICAgICAgdHJhbnNhY3Rpb24gSUQKICAgIGVuZCBub3RlCgogICAgbm90ZSByaWdodCBvZiBDb21taXR0ZWQKICAgICAgICBWZXJzaW9ucyBiZWNvbWUKICAgICAgICB2aXNpYmxlIHRvIG5ldyBzbmFwc2hvdHMKICAgIGVuZCBub3RlCgogICAgbm90ZSByaWdodCBvZiBBYm9ydGVkCiAgICAgICAgQWxsIHZlcnNpb25zIG1hcmtlZAogICAgICAgIGFzIG5ldmVyIGV4aXN0ZWQKICAgIGVuZCBub3Rl</code></pre><h3 id="Visibility-Matrix">Visibility Matrix</h3><table><thead><tr><th>Row State</th><th>Transaction State</th><th>Visible?</th></tr></thead><tbody><tr><td><code>xmin &lt; snapshot.xmin</code>, <code>xmax = 0</code></td><td>Committed</td><td>✅ Yes</td></tr><tr><td><code>xmin &lt; snapshot.xmin</code>, <code>xmax &lt; snapshot.xmin</code></td><td>Committed</td><td>❌ No (deleted)</td></tr><tr><td><code>snapshot.xmin &lt;= xmin &lt; snapshot.xmax</code>, <code>xmin in active</code></td><td>In Progress</td><td>❌ No</td></tr><tr><td><code>xmin &gt;= snapshot.xmax</code></td><td>Any</td><td>❌ No (future)</td></tr><tr><td><code>xmin = current_txn</code></td><td>Current</td><td>✅ Yes (own changes)</td></tr></tbody></table><hr /><h2 id="5-VACUUM-Cleaning-Up-Dead-Versions">5 VACUUM: Cleaning Up Dead Versions</h2><h3 id="The-Problem-Dead-Tuples-Accumulate">The Problem: Dead Tuples Accumulate</h3><pre class="language-none"><code class="language-none">After many updates:Page:┌─────────────────────────────────────────────────────────────┐│ Row 0: 
&#123;xmin: 100, xmax: 200&#125; ← Dead (both committed)       ││ Row 1: &#123;xmin: 300, xmax: 0&#125;   ← Live                        ││ Row 2: &#123;xmin: 150, xmax: 250&#125; ← Dead                        ││ Row 3: &#123;xmin: 400, xmax: 0&#125;   ← Live                        ││ ...                                                         ││ 50% dead space!                                             │└─────────────────────────────────────────────────────────────┘</code></pre><p><strong>Without VACUUM:</strong> Table grows forever. Performance degrades.</p><hr /><h3 id="VACUUM-Process">VACUUM Process</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/transaction/vacuum.rs</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">VacuumWorker</span> <span class="token punctuation">&#123;</span>    buffer_pool<span class="token punctuation">:</span> <span class="token class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">BufferPool</span><span class="token operator">></span><span class="token punctuation">,</span>    transaction_manager<span class="token punctuation">:</span> <span class="token class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">TransactionManager</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">VacuumWorker</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">vacuum_table</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> table_id<span 
class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token class-name">VacuumStats</span><span class="token punctuation">,</span> <span class="token class-name">VacuumError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> stats <span class="token operator">=</span> <span class="token class-name">VacuumStats</span><span class="token punctuation">::</span><span class="token function">default</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> page_id <span class="token keyword">in</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">get_table_pages</span><span class="token punctuation">(</span>table_id<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> <span class="token keyword">mut</span> page <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>buffer_pool<span class="token punctuation">.</span><span class="token function">get_page</span><span class="token punctuation">(</span>page_id<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>            <span class="token comment">// Get global xmin (oldest active transaction)</span>            <span class="token keyword">let</span> global_xmin <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>transaction_manager<span class="token 
punctuation">.</span><span class="token function">get_global_xmin</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token comment">// Scan all rows</span>            <span class="token keyword">for</span> row_id <span class="token keyword">in</span> page<span class="token punctuation">.</span><span class="token function">rows</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">let</span> row <span class="token operator">=</span> page<span class="token punctuation">.</span><span class="token function">get_row</span><span class="token punctuation">(</span>row_id<span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token comment">// Row is dead if its deleter (xmax) is older than every active transaction</span>                <span class="token keyword">if</span> row<span class="token punctuation">.</span>xmax <span class="token operator">!=</span> <span class="token number">0</span> <span class="token operator">&amp;&amp;</span> row<span class="token punctuation">.</span>xmax <span class="token operator">&lt;</span> global_xmin <span class="token punctuation">&#123;</span>                    <span class="token comment">// Mark as reusable</span>                    page<span class="token punctuation">.</span><span class="token function">mark_row_free</span><span class="token punctuation">(</span>row_id<span class="token punctuation">)</span><span class="token punctuation">;</span>                    stats<span class="token punctuation">.</span>dead_tuples_removed <span class="token operator">+=</span> <span class="token number">1</span><span class="token punctuation">;</span>                <span class="token punctuation">&#125;</span>            <span class="token punctuation">&#125;</span>            <span class="token 
keyword">self</span><span class="token punctuation">.</span>buffer_pool<span class="token punctuation">.</span><span class="token function">mark_dirty</span><span class="token punctuation">(</span>page_id<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span>stats<span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Regular VACUUM doesn’t block reads or writes; VACUUM FULL does:</strong></p><table><thead><tr><th>Operation</th><th>Lock Level</th></tr></thead><tbody><tr><td>Regular VACUUM</td><td><code>ShareUpdateExclusiveLock</code> (allows reads/writes)</td></tr><tr><td>VACUUM FULL</td><td><code>AccessExclusiveLock</code> (blocks everything)</td></tr></tbody></table><hr /><h3 id="VACUUM-FULL-vs-Regular-VACUUM">VACUUM FULL vs. Regular VACUUM</h3><pre class="language-none"><code class="language-none">Regular VACUUM:
┌─────────────────────────────────────────────────────────────┐
│ Before: [Dead][Live][Dead][Live][Dead][Live]               │
│ After:  [Free][Live][Free][Live][Free][Live]               │
│         (space reusable for new rows in same page)          │
└─────────────────────────────────────────────────────────────┘
VACUUM FULL:
┌─────────────────────────────────────────────────────────────┐
│ Before: [Dead][Live][Dead][Live][Dead][Live]               │
│ After:  [Live][Live][Live][Free][Free][Free]               │
│         (compacted, dead tuples physically removed)         │
└─────────────────────────────────────────────────────────────┘</code></pre><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">vacuum_full</span><span class="token punctuation">(</span><span class="token 
operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> table_id<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">VacuumError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token comment">// 1. Acquire exclusive lock</span>    <span class="token keyword">let</span> _lock <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">lock_table</span><span class="token punctuation">(</span>table_id<span class="token punctuation">,</span> <span class="token class-name">LockMode</span><span class="token punctuation">::</span><span class="token class-name">AccessExclusive</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// 2. Create new table file</span>    <span class="token keyword">let</span> new_table_id <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">create_temp_table</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// 3. 
Copy only live tuples</span>    <span class="token keyword">for</span> row <span class="token keyword">in</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">scan_live_rows</span><span class="token punctuation">(</span>table_id<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">insert_into_table</span><span class="token punctuation">(</span>new_table_id<span class="token punctuation">,</span> row<span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">// 4. Swap tables (atomic rename)</span>    <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">swap_tables</span><span class="token punctuation">(</span>table_id<span class="token punctuation">,</span> new_table_id<span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// 5. 
Release lock</span>    <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="6-Transaction-ID-Wraparound-The-4-Billion-Row-Problem">6 Transaction ID Wraparound: The 4 Billion Row Problem</h2><h3 id="The-Math">The Math</h3><pre class="language-none"><code class="language-none">TransactionId &#x3D; u32  &#x2F;&#x2F; 0 to 4,294,967,295
At 1000 transactions&#x2F;second:
4,294,967,295 &#x2F; 1000 &#x3D; 4,294,967 seconds &#x3D; ~50 days
After 50 days: OVERFLOW! 😱</code></pre><hr /><h3 id="What-Happens-on-Wraparound">What Happens on Wraparound</h3><pre class="language-none"><code class="language-none">Before wraparound:Transaction 4,294,967,294: INSERT INTO users VALUES (1, 100);Transaction 4,294,967,295: INSERT INTO users VALUES (2, 
200);Transaction 0 (wrapped):   SELECT * FROM users;                           → Plain comparison treats xid 4B as &quot;newer&quot; than 0!                           → Committed rows look like future rows. Wrong visibility! CORRUPTION!</code></pre><p><strong>PostgreSQL’s solution: wraparound-aware (modulo-2<sup>32</sup>) transaction ID comparison</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// Transaction ID comparison with wraparound handling</span>
<span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">transaction_id_precedes</span><span class="token punctuation">(</span>id1<span class="token punctuation">:</span> <span class="token class-name">TransactionId</span><span class="token punctuation">,</span> id2<span class="token punctuation">:</span> <span class="token class-name">TransactionId</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">bool</span> <span class="token punctuation">&#123;</span>
    <span class="token comment">// Subtract modulo 2^32, then reinterpret the difference as a signed 32-bit integer</span>
    <span class="token comment">// This makes the comparison wrap-aware and cannot overflow, even in debug builds</span>
    <span class="token punctuation">(</span>id1<span class="token punctuation">.</span><span class="token function">wrapping_sub</span><span class="token punctuation">(</span>id2<span class="token punctuation">)</span> <span class="token keyword">as</span> <span class="token keyword">i32</span><span class="token punctuation">)</span> <span class="token operator">&lt;</span> <span class="token number">0</span>
<span class="token punctuation">&#125;</span>
<span class="token comment">// Example: id1 = 4,294,967,294, id2 = 0</span>
<span class="token comment">// id1.wrapping_sub(id2) = 4,294,967,294, which is -2 as i32</span>
<span class="token comment">// -2 &lt; 0 → true (4B precedes 0) ✓</span></code></pre><hr /><h3 id="Freezing-Old-Transactions">Freezing Old Transactions</h3><p><strong>Vacuum freeze:</strong> Mark rows written by very old transactions as “frozen” so their transaction IDs never need to be compared again</p><pre class="language-rust" 
data-language="rust"><code class="language-rust"><span class="token keyword">pub</span> <span class="token keyword">const</span> <span class="token constant">FROZEN_XID</span><span class="token punctuation">:</span> <span class="token class-name">TransactionId</span> <span class="token operator">=</span> <span class="token number">2</span><span class="token punctuation">;</span>  <span class="token comment">// Special value</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">vacuum_freeze</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> table_id<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span> freeze_limit<span class="token punctuation">:</span> <span class="token class-name">TransactionId</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">VacuumError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>    <span class="token keyword">for</span> page_id <span class="token keyword">in</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">get_table_pages</span><span class="token punctuation">(</span>table_id<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> <span class="token keyword">mut</span> page <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token 
function">get_page</span><span class="token punctuation">(</span>page_id<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token keyword">for</span> row_id <span class="token keyword">in</span> page<span class="token punctuation">.</span><span class="token function">rows</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">let</span> row <span class="token operator">=</span> page<span class="token punctuation">.</span><span class="token function">get_row</span><span class="token punctuation">(</span>row_id<span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token comment">// If xmin is old enough, freeze it</span>            <span class="token keyword">if</span> row<span class="token punctuation">.</span>xmin <span class="token operator">&lt;</span> freeze_limit <span class="token operator">&amp;&amp;</span> row<span class="token punctuation">.</span>xmin <span class="token operator">!=</span> <span class="token constant">FROZEN_XID</span> <span class="token punctuation">&#123;</span>                page<span class="token punctuation">.</span><span class="token function">set_xmin</span><span class="token punctuation">(</span>row_id<span class="token punctuation">,</span> <span class="token constant">FROZEN_XID</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>            <span class="token comment">// If xmax is old enough, freeze it</span>            <span class="token keyword">if</span> row<span class="token punctuation">.</span>xmax <span class="token operator">!=</span> <span class="token number">0</span> <span class="token operator">&amp;&amp;</span> row<span class="token punctuation">.</span>xmax <span class="token 
operator">&lt;</span> freeze_limit <span class="token operator">&amp;&amp;</span> row<span class="token punctuation">.</span>xmax <span class="token operator">!=</span> <span class="token constant">FROZEN_XID</span> <span class="token punctuation">&#123;</span>                page<span class="token punctuation">.</span><span class="token function">set_xmax</span><span class="token punctuation">(</span>row_id<span class="token punctuation">,</span> <span class="token constant">FROZEN_XID</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span>    <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Frozen rows are always visible:</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token keyword">impl</span> <span class="token class-name">Snapshot</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">is_visible</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> row_xmin<span class="token punctuation">:</span> <span class="token class-name">TransactionId</span><span class="token punctuation">,</span> row_xmax<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">TransactionId</span><span class="token operator">></span><span class="token punctuation">)</span> <span class="token 
punctuation">-></span> <span class="token keyword">bool</span> <span class="token punctuation">&#123;</span>        <span class="token comment">// A frozen xmin is visible to every snapshot; a row that also has an xmax still goes through the normal check below</span>        <span class="token keyword">if</span> row_xmin <span class="token operator">==</span> <span class="token constant">FROZEN_XID</span> <span class="token operator">&amp;&amp;</span> row_xmax<span class="token punctuation">.</span><span class="token function">is_none</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>            <span class="token keyword">return</span> <span class="token boolean">true</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>        <span class="token comment">// ... 
normal visibility logic ...</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="Autovacuum-Preventing-Wraparound-Automatically">Autovacuum: Preventing Wraparound Automatically</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// src/transaction/autovacuum.rs</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">AutovacuumLauncher</span> <span class="token punctuation">&#123;</span>    transaction_manager<span class="token punctuation">:</span> <span class="token class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">TransactionManager</span><span class="token operator">></span><span class="token punctuation">,</span>    vacuum_worker<span class="token punctuation">:</span> <span class="token class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">VacuumWorker</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">AutovacuumLauncher</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">run</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">loop</span> <span class="token punctuation">&#123;</span>            <span class="token comment">// Check how old the oldest transaction is</span>            <span class="token keyword">let</span> oldest_xmin <span class="token operator">=</span> <span class="token 
keyword">self</span><span class="token punctuation">.</span>transaction_manager<span class="token punctuation">.</span><span class="token function">get_oldest_xmin</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token keyword">let</span> current_xid <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>transaction_manager<span class="token punctuation">.</span><span class="token function">current_xid</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token comment">// Distance to wraparound: MAX minus the age of the oldest xmin (wrapping_sub stays correct once the counter has wrapped)</span>            <span class="token keyword">let</span> distance_to_wraparound <span class="token operator">=</span> <span class="token class-name">TransactionId</span><span class="token punctuation">::</span><span class="token constant">MAX</span> <span class="token operator">-</span> current_xid<span class="token punctuation">.</span><span class="token function">wrapping_sub</span><span class="token punctuation">(</span>oldest_xmin<span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token comment">// If getting close, trigger vacuum</span>            <span class="token keyword">if</span> distance_to_wraparound <span class="token operator">&lt;</span> <span class="token constant">WRAPAROUND_EMERGENCY_THRESHOLD</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">self</span><span class="token punctuation">.</span>vacuum_worker<span class="token punctuation">.</span><span class="token function">vacuum_freeze_all_tables</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>            <span class="token punctuation">&#125;</span>            <span class="token function">sleep</span><span class="token punctuation">(</span><span class="token class-name">Duration</span><span 
class="token punctuation">::</span><span class="token function">from_secs</span><span class="token punctuation">(</span><span class="token number">60</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><p><strong>PostgreSQL’s default thresholds:</strong></p><table><thead><tr><th>Parameter</th><th>Default</th><th>Meaning</th></tr></thead><tbody><tr><td><code>autovacuum_vacuum_threshold</code></td><td>50</td><td>Min dead tuples before vacuum</td></tr><tr><td><code>autovacuum_vacuum_scale_factor</code></td><td>0.2</td><td>+20% of table size</td></tr><tr><td><code>autovacuum_freeze_max_age</code></td><td>200M</td><td>Max transactions before forced freeze</td></tr></tbody></table><hr /><h2 id="7-Isolation-Levels">7 Isolation Levels</h2><h3 id="ANSI-SQL-Isolation-Levels">ANSI SQL Isolation Levels</h3><table><thead><tr><th>Isolation Level</th><th>Dirty Read</th><th>Non-Repeatable Read</th><th>Phantom Read</th></tr></thead><tbody><tr><td>Read Uncommitted</td><td>Possible</td><td>Possible</td><td>Possible</td></tr><tr><td>Read Committed</td><td>❌ Prevented</td><td>Possible</td><td>Possible</td></tr><tr><td>Repeatable Read</td><td>❌ Prevented</td><td>❌ Prevented</td><td>Possible</td></tr><tr><td>Serializable</td><td>❌ Prevented</td><td>❌ Prevented</td><td>❌ Prevented</td></tr></tbody></table><hr /><h3 id="PostgreSQL’s-Implementation">PostgreSQL’s Implementation</h3><p><strong>PostgreSQL uses MVCC for all isolation levels:</strong></p><table><thead><tr><th>Isolation Level</th><th>Implementation</th></tr></thead><tbody><tr><td>Read Uncommitted</td><td>Same as Read Committed</td></tr><tr><td>Read Committed</td><td>Fresh snapshot per statement</td></tr><tr><td>Repeatable Read</td><td>Single snapshot per 
transaction</td></tr><tr><td>Serializable</td><td>Single snapshot + predicate locking</td></tr></tbody></table><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token attribute attr-name">#[derive(Debug, Clone, Copy)]</span><span class="token keyword">pub</span> <span class="token keyword">enum</span> <span class="token type-definition class-name">IsolationLevel</span> <span class="token punctuation">&#123;</span>    <span class="token class-name">ReadCommitted</span><span class="token punctuation">,</span>    <span class="token class-name">RepeatableRead</span><span class="token punctuation">,</span>    <span class="token class-name">Serializable</span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">Transaction</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">get_snapshot</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Snapshot</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">match</span> <span class="token keyword">self</span><span class="token punctuation">.</span>isolation_level <span class="token punctuation">&#123;</span>            <span class="token class-name">IsolationLevel</span><span class="token punctuation">::</span><span class="token class-name">ReadCommitted</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token comment">// New snapshot for each statement</span>                <span class="token keyword">self</span><span class="token punctuation">.</span>transaction_manager<span 
class="token punctuation">.</span><span class="token function">create_snapshot</span><span class="token punctuation">(</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>            <span class="token class-name">IsolationLevel</span><span class="token punctuation">::</span><span class="token class-name">RepeatableRead</span> <span class="token operator">|</span> <span class="token class-name">IsolationLevel</span><span class="token punctuation">::</span><span class="token class-name">Serializable</span> <span class="token operator">=></span> <span class="token punctuation">&#123;</span>                <span class="token comment">// Reuse same snapshot for entire transaction</span>                <span class="token keyword">self</span><span class="token punctuation">.</span>cached_snapshot<span class="token punctuation">.</span><span class="token function">clone</span><span class="token punctuation">(</span><span class="token punctuation">)</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="Serializable-Isolation-Predicate-Locking">Serializable Isolation: Predicate Locking</h3><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// Simplified predicate locking</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">SerializableTransaction</span> <span class="token punctuation">&#123;</span>    txn<span class="token punctuation">:</span> <span class="token class-name">Transaction</span><span class="token punctuation">,</span>    read_predicates<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token 
class-name">Predicate</span><span class="token operator">></span><span class="token punctuation">,</span>  <span class="token comment">// Ranges/conditions read</span>    write_set<span class="token punctuation">:</span> <span class="token class-name">Vec</span><span class="token operator">&lt;</span><span class="token class-name">RowId</span><span class="token operator">></span><span class="token punctuation">,</span>            <span class="token comment">// Rows written</span><span class="token punctuation">&#125;</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Predicate</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> page_id<span class="token punctuation">:</span> <span class="token keyword">u64</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> key_range<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token class-name">BTreeKey</span><span class="token punctuation">,</span> <span class="token class-name">BTreeKey</span><span class="token punctuation">)</span><span class="token operator">></span><span class="token punctuation">,</span>  <span class="token comment">// None = full scan</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">TransactionManager</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">check_serializable_conflict</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> txn<span class="token punctuation">:</span> <span class="token 
operator">&amp;</span><span class="token class-name">SerializableTransaction</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token class-name">SerializationError</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// Check if any committed write conflicts with our reads</span>        <span class="token keyword">for</span> predicate <span class="token keyword">in</span> <span class="token operator">&amp;</span>txn<span class="token punctuation">.</span>read_predicates <span class="token punctuation">&#123;</span>            <span class="token keyword">for</span> write <span class="token keyword">in</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">recent_writes</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                <span class="token keyword">if</span> predicate<span class="token punctuation">.</span><span class="token function">matches</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>write<span class="token punctuation">)</span> <span class="token operator">&amp;&amp;</span> write<span class="token punctuation">.</span><span class="token function">committed_after</span><span class="token punctuation">(</span>txn<span class="token punctuation">.</span>snapshot<span class="token punctuation">.</span>xmax<span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>                    <span class="token keyword">return</span> <span class="token class-name">Err</span><span class="token punctuation">(</span><span class="token 
class-name">SerializationError</span><span class="token punctuation">::</span><span class="token class-name">ReadWriteConflict</span><span class="token punctuation">)</span><span class="token punctuation">;</span>                <span class="token punctuation">&#125;</span>            <span class="token punctuation">&#125;</span>        <span class="token punctuation">&#125;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><p><strong>On conflict:</strong> Abort one transaction with <code>serialization_failure</code> error.</p><hr /><h2 id="8-Challenges-Building-in-Rust">8 Challenges Building in Rust</h2><h3 id="Challenge-1-Snapshot-Lifetime">Challenge 1: Snapshot Lifetime</h3><p><strong>Problem:</strong> Snapshots need to outlive the transaction that created them.</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ Doesn't work</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">begin_transaction</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Transaction</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">let</span> snapshot <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">create_snapshot</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>  <span class="token comment">// 
Borrowed from self</span>    <span class="token class-name">Transaction</span> <span class="token punctuation">&#123;</span> snapshot<span class="token punctuation">,</span> <span class="token punctuation">...</span> <span class="token punctuation">&#125;</span>  <span class="token comment">// Snapshot doesn't live long enough</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Solution: Owned snapshots</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ✅ Works</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">begin_transaction</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token class-name">Transaction</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">let</span> snapshot <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">create_snapshot</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>  <span class="token comment">// Returns owned Snapshot</span>    <span class="token class-name">Transaction</span> <span class="token punctuation">&#123;</span>        snapshot<span class="token punctuation">:</span> <span class="token class-name">Arc</span><span class="token punctuation">::</span><span class="token function">new</span><span class="token punctuation">(</span>snapshot<span class="token punctuation">)</span><span class="token punctuation">,</span>  <span class="token comment">// Shareable across threads</span>        <span class="token punctuation">...</span>    <span class="token punctuation">&#125;</span><span class="token 
punctuation">&#125;</span></code></pre><hr /><h3 id="Challenge-2-Atomic-Transaction-State">Challenge 2: Atomic Transaction State</h3><p><strong>Problem:</strong> Multiple threads need to see consistent transaction state.</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ Race condition</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">commit</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> txn<span class="token punctuation">:</span> <span class="token operator">&amp;</span><span class="token keyword">mut</span> <span class="token class-name">Transaction</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>    txn<span class="token punctuation">.</span>state <span class="token operator">=</span> <span class="token class-name">TransactionState</span><span class="token punctuation">::</span><span class="token class-name">Committed</span><span class="token punctuation">;</span>  <span class="token comment">// Not atomic!</span>    <span class="token comment">// Other threads might see partial state</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Solution: Atomic state with proper ordering</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ✅ Works</span><span class="token keyword">pub</span> <span class="token keyword">struct</span> <span class="token type-definition class-name">Transaction</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> xid<span class="token punctuation">:</span> <span class="token class-name">TransactionId</span><span class="token punctuation">,</span>    <span class="token keyword">pub</span> state<span class="token 
punctuation">:</span> <span class="token class-name">AtomicU8</span><span class="token punctuation">,</span>  <span class="token comment">// Use atomic for state</span>    <span class="token keyword">pub</span> snapshot<span class="token punctuation">:</span> <span class="token class-name">Arc</span><span class="token operator">&lt;</span><span class="token class-name">Snapshot</span><span class="token operator">></span><span class="token punctuation">,</span><span class="token punctuation">&#125;</span><span class="token keyword">impl</span> <span class="token class-name">Transaction</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">commit</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> std<span class="token punctuation">::</span>io<span class="token punctuation">::</span><span class="token class-name">Result</span><span class="token operator">&lt;</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token operator">></span> <span class="token punctuation">&#123;</span>        <span class="token comment">// 1. Write WAL first (durable); assumes log_commit returns std::io::Result&lt;()></span>        <span class="token keyword">self</span><span class="token punctuation">.</span>wal<span class="token punctuation">.</span><span class="token function">log_commit</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>xid<span class="token punctuation">)</span><span class="token operator">?</span><span class="token punctuation">;</span>        <span class="token comment">// 2. 
Then mark as committed (visible)</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>state<span class="token punctuation">.</span><span class="token function">store</span><span class="token punctuation">(</span><span class="token class-name">TransactionState</span><span class="token punctuation">::</span><span class="token class-name">Committed</span> <span class="token keyword">as</span> <span class="token keyword">u8</span><span class="token punctuation">,</span> <span class="token class-name">Ordering</span><span class="token punctuation">::</span><span class="token class-name">Release</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token comment">// 3. Notify waiting transactions</span>        <span class="token keyword">self</span><span class="token punctuation">.</span>transaction_manager<span class="token punctuation">.</span><span class="token function">notify_committed</span><span class="token punctuation">(</span><span class="token keyword">self</span><span class="token punctuation">.</span>xid<span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token class-name">Ok</span><span class="token punctuation">(</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h3 id="Challenge-3-Vacuum-Without-Blocking">Challenge 3: Vacuum Without Blocking</h3><p><strong>Problem:</strong> How to vacuum while transactions are reading?</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ❌ Blocks readers</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">vacuum</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>    <span class="token keyword">let</span> _lock 
<span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span>table_lock<span class="token punctuation">.</span><span class="token function">write</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>  <span class="token comment">// Exclusive lock</span>    <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">remove_dead_tuples</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span><span class="token punctuation">&#125;</span></code></pre><p><strong>Solution: Two-phase vacuum</strong></p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token comment">// ✅ Non-blocking</span><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">vacuum</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>    <span class="token comment">// Phase 1: Mark tuples as prune-able (no lock needed)</span>    <span class="token keyword">let</span> global_xmin <span class="token operator">=</span> <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">get_global_xmin</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">mark_pruneable</span><span class="token punctuation">(</span>global_xmin<span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token comment">// Phase 2: Reclaim space (uses page-level 
locks, not table lock)</span>    <span class="token keyword">for</span> page <span class="token keyword">in</span> <span class="token keyword">self</span><span class="token punctuation">.</span>pages<span class="token punctuation">.</span><span class="token function">iter</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">&#123;</span>        <span class="token keyword">let</span> _page_lock <span class="token operator">=</span> page<span class="token punctuation">.</span>lock<span class="token punctuation">.</span><span class="token function">write</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>        <span class="token keyword">self</span><span class="token punctuation">.</span><span class="token function">reclaim_space_on_page</span><span class="token punctuation">(</span>page<span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token punctuation">&#125;</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="9-How-AI-Accelerated-This">9 How AI Accelerated This</h2><h3 id="What-AI-Got-Right">What AI Got Right</h3><table><thead><tr><th>Task</th><th>AI Contribution</th></tr></thead><tbody><tr><td><strong>Visibility rules</strong></td><td>Generated correct xmin/xmax logic</td></tr><tr><td><strong>Wraparound handling</strong></td><td>Explained 2’s complement trick</td></tr><tr><td><strong>Snapshot structure</strong></td><td>Suggested xmin/xmax/active pattern</td></tr><tr><td><strong>VACUUM design</strong></td><td>Outlined two-phase approach</td></tr></tbody></table><hr /><h3 id="What-AI-Got-Wrong">What AI Got Wrong</h3><table><thead><tr><th>Issue</th><th>What Happened</th></tr></thead><tbody><tr><td><strong>Initial visibility</strong></td><td>First draft didn’t handle own uncommitted writes</td></tr><tr><td><strong>Freeze logic</strong></td><td>Missed that 
frozen rows need special handling in visibility</td></tr><tr><td><strong>Serializable isolation</strong></td><td>Suggested full serializability without predicate locking (wrong!)</td></tr></tbody></table><p><strong>Pattern:</strong> MVCC is subtle. AI gets the 80% case. Edge cases require deep understanding.</p><hr /><h3 id="Example-Debugging-a-Visibility-Bug">Example: Debugging a Visibility Bug</h3><p><strong>My question to AI:</strong></p><blockquote><p>“Transaction A inserts a row, then selects it. But the select doesn’t see the row. Why?”</p></blockquote><p><strong>What I learned:</strong></p><ol><li>Transactions must see their <strong>own</strong> uncommitted writes</li><li>Need to track <code>current_transaction_id</code> in snapshot</li><li>Visibility check needs special case for <code>row_xmin == my_xid</code></li></ol><p><strong>Result:</strong> Fixed <code>is_visible()</code>:</p><pre class="language-rust" data-language="rust"><code class="language-rust"><span class="token keyword">pub</span> <span class="token keyword">fn</span> <span class="token function-definition function">is_visible</span><span class="token punctuation">(</span><span class="token operator">&amp;</span><span class="token keyword">self</span><span class="token punctuation">,</span> row_xmin<span class="token punctuation">:</span> <span class="token class-name">TransactionId</span><span class="token punctuation">,</span> row_xmax<span class="token punctuation">:</span> <span class="token class-name">Option</span><span class="token operator">&lt;</span><span class="token class-name">TransactionId</span><span class="token operator">></span><span class="token punctuation">,</span> my_xid<span class="token punctuation">:</span> <span class="token class-name">TransactionId</span><span class="token punctuation">)</span> <span class="token punctuation">-></span> <span class="token keyword">bool</span> <span class="token punctuation">&#123;</span>    <span class="token comment">// Special 
case: see your own writes</span>    <span class="token keyword">if</span> row_xmin <span class="token operator">==</span> my_xid <span class="token punctuation">&#123;</span>        <span class="token keyword">return</span> row_xmax<span class="token punctuation">.</span><span class="token function">map_or</span><span class="token punctuation">(</span><span class="token boolean">true</span><span class="token punctuation">,</span> <span class="token closure-params"><span class="token closure-punctuation punctuation">|</span>xmax<span class="token closure-punctuation punctuation">|</span></span> xmax <span class="token operator">!=</span> my_xid<span class="token punctuation">)</span><span class="token punctuation">;</span>    <span class="token punctuation">&#125;</span>    <span class="token comment">// ... rest of visibility logic ...</span><span class="token punctuation">&#125;</span></code></pre><hr /><h2 id="Summary-MVCC-in-One-Diagram">Summary: MVCC in One Diagram</h2><pre class="language-MERMAID_BASE64_602" data-language="MERMAID_BASE64_602"><code 
class="language-MERMAID_BASE64_602">Zmxvd2NoYXJ0IEJUCiAgICBzdWJncmFwaCAiVHJhbnNhY3Rpb24gTGlmZWN5Y2xlIgogICAgICAgIEFbQkVHSU5dIC0tPiBCW0dldCBTbmFwc2hvdF0KICAgICAgICBCIC0tPiBDW1JlYWQvV3JpdGUgd2l0aCBNVkNDXQogICAgICAgIEMgLS0+IER7Q09NTUlUIG9yIFJPTExCQUNLP30KICAgICAgICBEIC0tPnxDT01NSVR8IEVbV3JpdGUgV0FMXQogICAgICAgIEQgLS0+fFJPTExCQUNLfCBGW0Rpc2NhcmQgQ2hhbmdlc10KICAgICAgICBFIC0tPiBHW01hcmsgQ29tbWl0dGVkXQogICAgZW5kCgogICAgc3ViZ3JhcGggIk1WQ0MgUm93IFN0YXRlcyIKICAgICAgICBIW0xpdmU6IHhtaW4gY29tbWl0dGVkLCB4bWF4PTBdCiAgICAgICAgSVtEZWFkOiB4bWluICYgeG1heCBjb21taXR0ZWRdCiAgICAgICAgSltGcm96ZW46IHhtaW49RlJPWkVOX1hJRF0KICAgIGVuZAoKICAgIHN1YmdyYXBoICJWQUNVVU0iCiAgICAgICAgS1tSZWd1bGFyIFZBQ1VVTV0gLS0+IExbTWFyayBzcGFjZSByZXVzYWJsZV0KICAgICAgICBNW1ZBQ1VVTSBGVUxMXSAtLT4gTltDb21wYWN0IHRhYmxlXQogICAgICAgIE9bQXV0b1ZhY3V1bV0gLS0+IFBbUHJldmVudCB3cmFwYXJvdW5kXQogICAgZW5kCgogICAgc3ViZ3JhcGggIklzb2xhdGlvbiBMZXZlbHMiCiAgICAgICAgUVtSZWFkIENvbW1pdHRlZF0gLS0+IFJbTmV3IHNuYXBzaG90IHBlciBzdGF0ZW1lbnRdCiAgICAgICAgU1tSZXBlYXRhYmxlIFJlYWRdIC0tPiBUW1NpbmdsZSBzbmFwc2hvdF0KICAgICAgICBVW1NlcmlhbGl6YWJsZV0gLS0+IFZbUHJlZGljYXRlIGxvY2tpbmddCiAgICBlbmQKCiAgICBzdHlsZSBCIGZpbGw6I2UzZjJmZCxzdHJva2U6IzE5NzZkMgogICAgc3R5bGUgSCBmaWxsOiNlOGY1ZTksc3Ryb2tlOiMzODhlM2MKICAgIHN0eWxlIEkgZmlsbDojZmZlYmVlLHN0cm9rZTojYzYyODI4CiAgICBzdHlsZSBPIGZpbGw6I2ZmZjNlMCxzdHJva2U6I2Y1N2MwMA&#x3D;&#x3D;</code></pre><p><strong>Key Takeaways:</strong></p><table><thead><tr><th>Concept</th><th>Why It Matters</th></tr></thead><tbody><tr><td><strong>MVCC</strong></td><td>Readers don’t block writers, writers don’t block readers</td></tr><tr><td><strong>Snapshots</strong></td><td>Consistent view of data at a point in time</td></tr><tr><td><strong>Transaction IDs</strong></td><td>Track version creation/deletion</td></tr><tr><td><strong>VACUUM</strong></td><td>Reclaim space from dead versions</td></tr><tr><td><strong>Wraparound</strong></td><td>4 billion transaction limit requires freezing</td></tr><tr><td><strong>Isolation levels</strong></td><td>Trade-off 
between consistency and concurrency</td></tr></tbody></table><hr /><p><strong>Further Reading:</strong></p><ul><li>PostgreSQL Source: <a href="https://github.com/postgres/postgres/blob/master/src/backend/access/heap/heapam_visibility.c"><code>src/backend/access/heap/heapam_visibility.c</code></a></li><li>PostgreSQL Source: <a href="https://github.com/postgres/postgres/tree/master/src/backend/access/transam"><code>src/backend/access/transam/</code></a></li><li>“A Critique of ANSI SQL Isolation Levels” by Berenson et al. (1995)</li><li>“Database Management Systems” by Ramakrishnan (Ch. 16: Concurrency Control)</li><li>“The PostgreSQL Book” by Worsley &amp; Morin (Ch. 13: MVCC)</li><li>Vaultgres Repository: <a href="https://github.com/neoalienson/Vaultgres">github.com/neoalienson/Vaultgres</a></li></ul><!-- commentbox plugin begins -->    <div class="commentbox"></div>    <script src="https://unpkg.com/commentbox.io/dist/commentBox.min.js"></script>    <script>commentBox('5765834504929280-proj')</script>    <!-- commentbox plugin ends -->    ]]></content>
    
    
    <summary type="html">Part 3 of the Vaultgres journey: implementing MVCC for non-blocking reads and snapshot isolation. Deep dive into transaction IDs, visibility rules, vacuum, and the horror of transaction ID wraparound.</summary>
    
    
    
    <category term="Development" scheme="https://neo01.com/categories/Development/"/>
    
    
    <category term="PostgreSQL" scheme="https://neo01.com/tags/PostgreSQL/"/>
    
    <category term="Database Internals" scheme="https://neo01.com/tags/Database-Internals/"/>
    
    <category term="Rust" scheme="https://neo01.com/tags/Rust/"/>
    
  </entry>
  
</feed>
