数据结构 · #8 / 11

字典

键值映射、方法与内部机制

为什么重要

哈希表能以平均 O(1) 的时间解决一大类问题。

核心思想

dict 是一张快速查找表：键 → 值。键必须是可哈希的（所以：字符串、数字、元组可以——列表不行）。从 Python 3.7 起，字典会记住插入顺序。

试一试

构建、查找、默认值：

prices = {"apple": 1.0, "banana": 0.5}
prices["cherry"] = 2.5
print(prices)

# .get() returns a default instead of raising KeyError
print(prices.get("grape", 0.0))

# setdefault() inserts if missing, then returns the value
prices.setdefault("date", 3.0)
print(prices)

not loaded

计数——经典的"频率"惯用法：

from collections import Counter

text = "mississippi"
counts = Counter(text)
print(counts)
print(counts.most_common(2))

not loaded

迭代：items() 产出 (key, value) 元组——非常适合解包。

scores = {"Ada": 95, "Linus": 88, "Margaret": 99}
for name, score in scores.items():
  print(f"{name:>10}: {score}")

# Dict comprehension
inverted = { score: name for name, score in scores.items() }
print(inverted)

not loaded

核心字典方法（必须掌握）

| 方法 | 作用 | 示例 | | --- | --- | --- | | get(k, d) | 安全查找 | d.get("x", 0) | | keys() | 键的视图 | list(d.keys()) | | values() | 值的视图 | list(d.values()) | | items() | 键值对的视图 | list(d.items()) | | update(...) | 合并/更新 | d.update({"x":1}) | | pop(k) | 移除某个键 | d.pop("x") | | popitem() | 移除最后一个键值对 | d.popitem() | | setdefault(k, d) | 取值或设置 | d.setdefault("x", 0) | | copy() | 浅拷贝 | d.copy() | | fromkeys(keys, v) | 初始化 | dict.fromkeys(keys, 0) |

分组模式

setdefault 让"按某个键把元素分桶"变成一行代码：

groups = {}
for item in items:
    groups.setdefault(item[0], []).append(item)

两数之和（Two Sum）模式

经典的"记住你见过什么"哈希技巧——O(n) 而非 O(n²)：

def two_sum(nums, target):
    seen = {}
    for i, x in enumerate(nums):
        need = target - x
        if need in seen:
            return [seen[need], i]
        seen[x] = i

快速检查

问：如果键不存在，dict.get(k) 返回什么？答： None（或你提供的默认值）。

小练习

统计一个字符串中各字符的数量。
找出第一个重复出现的数字。
构建一个单词 → 计数的字典。

该做与不该做

该做：用 in d 来测试键是否存在。
该做：可选访问时优先用 get。
不该做：在迭代字典时修改它。
不该做：使用不可哈希的键（list、dict、set）。

深入一点——字典是如何工作的

哈希表内部机制

哈希表： 键被哈希成表中的一个索引。
冲突： 不同的键可能映射到同一个索引。
解决方式： CPython 使用带探测的开放寻址（平均很快）。
负载因子： 随着表被填满，Python 会重新调整其大小以保持查找的快速。

可哈希性规则

可哈希：int、float、str、tuple（如果其元素都可哈希）、frozenset。
不可哈希：list、dict、set。

顺序保证

Python 3.7+ 保留插入顺序（这是官方的语言保证）。

常见错误

错误： 用 [] 访问不存在的键。 修正： 使用 get()。
错误： 无意中覆盖了值。 修正： 赋值前先检查键。
错误： 用列表作为键。 修正： 使用元组或 frozenset。
错误： 以为 dict.keys() 是一个列表。 修正： 需要时用 list(...) 包裹。

关键要点

键必须可哈希；值可以是任何东西。
dict.get(k, default) 避免 KeyError；setdefault 在键不存在时插入。
collections.Counter 是频率统计的首选。
字典推导式：{k: v for ... if ...} 与列表推导式相对应。
get、set、in、pop 平均约 O(1)；在严重冲突下最坏情况为 O(n)。

🗺️ 字典

为什么重要

核心思想

试一试

核心字典方法（必须掌握）

分组模式

两数之和（Two Sum）模式

快速检查

小练习

该做与不该做

常见错误

关键要点

字典