PostgreSQL Group By 不走索引?另类解决方案

而MySQL Group By 走索引很快

PostgreSQL版本为18,MySQL版本为8.0。

PostgreSQL表现

PostgreSQL建一张表,插入1000万条数据,content字段为为107个字符串之一

create table test1(
  id bigint GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY, 
  content varchar(100)
);
CREATE INDEX test_idx ON test1 USING btree (content);

insert into test1(content)
select REPEAT(((random() * 100000)::bigint % 107)::text, 11)
from generate_series(1, 10000000);

drop index test_idx;
create index test_idx on test1(content);

vacuum test1;
analyze test1;

执行语句如下:

set max_parallel_workers_per_gather = 0; -- 禁用并行扫描
set jit = off; -- 关闭JIT

explain analyze verbose
select content
from test1
group by content;

得到的执行计划如下,耗时2.8s:

HashAggregate  (cost=229831.01..229832.08 rows=107 width=22) (actual time=2830.261..2830.270 rows=107.00 loops=1)
  Output: content
  Group Key: test1.content
  Batches: 1  Memory Usage: 32kB
  Buffers: shared hit=92331
  ->  Seq Scan on test.test1  (cost=0.00..202331.01 rows=11000001 width=22) (actual time=0.014..752.663 rows=11000001.00 loops=1)
        Output: id, content
        Buffers: shared hit=92331
Planning Time: 0.093 ms
Execution Time: 2830.309 ms

强制走索引

set enable_seqscan=false; -- 禁止全表扫
set max_parallel_workers_per_gather = 0; -- 禁用并行扫描
set jit = off; -- 关闭JIT

explain analyze verbose
select content
from test1
group by content;

得到的执行计划如下,耗时1.4s:

Group  (cost=0.43..209778.68 rows=107 width=22) (actual time=0.055..1434.377 rows=107.00 loops=1)
  Output: content
  Group Key: test1.content
  Buffers: shared hit=321 read=8628
  ->  Index Only Scan using test_idx on test.test1  (cost=0.43..184779.50 rows=9999671 width=22) (actual time=0.054..810.491 rows=10000000.00 loops=1)
        Output: content
        Heap Fetches: 0
        Index Searches: 1
        Buffers: shared hit=321 read=8628
Planning:
  Buffers: shared hit=24
Planning Time: 0.129 ms
Execution Time: 1434.431 ms

可用看出,PostgreSQL即使强制走索引,虽然查询速度虽然有提升,但查询速度还是和表数据量有关,不能秒出结果。

MySQL表现

建表插入数据约800万条:

create table test1(
	id bigint not NULL AUTO_INCREMENT  PRIMARY KEY, 
	content VARCHAR(100)
);

-- 随便插入一条数据后
-- 反复执行该语句直到表里有800万条数据
INSERT into test1 (content)
select REPEAT(CONVERT(rand() * 100000 % 107, UNSIGNED INTEGER), 11) as x
from test1;

drop index test_idx  on test1;
create index test_idx on test1(content);

explain analyze
select content
from  test1
group by content;

有索引的情况下,耗时1ms,秒出结果:

-> Covering index skip scan for deduplication on test1 using test_idx  (cost=27458 rows=18305) (actual time=0.0868..1.26 rows=108 loops=1)

没有索引的情况下,全表扫耗时5.3s:

-> Table scan on <temporary>  (cost=1.66e+6..1.76e+6 rows=8.14e+6) (actual time=5395..5395 rows=108 loops=1)
    -> Temporary table with deduplication  (cost=1.66e+6..1.66e+6 rows=8.14e+6) (actual time=5395..5395 rows=108 loops=1)
        -> Table scan on test1  (cost=842067 rows=8.14e+6) (actual time=0.946..2980 rows=8.39e+6 loops=1)

显然这种简单group by+有索引的场景下,MySQL的查询速度和表数据量无关,比PostgreSQL要好。

当然PostgreSQL也不是没有优点,显然其全表扫比MySQL要快很多,建索引的速度比MySQL也快(7s vs 40s),还支持JIT和并行执行,这些都比MySQL强。

旁门左道

那么PostgreSQL到底有一些偏方来快速解决“检索某一列的唯一不同值”这个问题呢,有的。

思路打开一下:我们寄希望于Group By 能使用索引快速查询。究其原因,我们是希望不需要扫描整个索引,只扫描索引中我们需要的上层B树节点就行。

我们完全可用手动实现这个过程,先取出这一列的当前值作为最小值,然后通过递归,不断从列中取得比当前值大的最小值并替换当前值,直到取完所有值。那么当前值取过的集合,就是唯一不同值的集合。

显然,取最小值和查找比当前值大的最小值的这个过程,完全可以走索引的。我们可以用递归CTE实现这些步骤:

explain analyze verbose
WITH RECURSIVE cte AS (
   (
   SELECT content
   FROM   test1
   ORDER  BY 1
   LIMIT  1
   )
   UNION ALL
   SELECT l.*
   FROM   cte c
   CROSS JOIN LATERAL (
      SELECT t.content
      FROM   test1 t
      WHERE  t.content > c.content
      ORDER  BY 1
      LIMIT  1
      ) l
   )
SELECT * FROM cte;

查看执行计划,耗时不到1ms,只需0.8ms,非常完美:

CTE Scan on cte  (cost=50.06..52.08 rows=101 width=218) (actual time=0.020..0.770 rows=107.00 loops=1)
  Output: cte.content
  Storage: Memory  Maximum Storage: 22kB
  Buffers: shared hit=338
  CTE cte
    ->  Recursive Union  (cost=0.43..50.06 rows=101 width=22) (actual time=0.019..0.735 rows=107.00 loops=1)
          Storage: Memory  Maximum Storage: 33kB
          Buffers: shared hit=338
          ->  Limit  (cost=0.43..0.45 rows=1 width=22) (actual time=0.018..0.018 rows=1.00 loops=1)
                Output: test1.content
                Buffers: shared hit=4
                ->  Index Only Scan using test_idx on compub.test1  (cost=0.43..184789.84 rows=10000360 width=22) (actual time=0.017..0.017 rows=1.00 loops=1)
                      Output: test1.content
                      Heap Fetches: 0
                      Index Searches: 1
                      Buffers: shared hit=4
          ->  Nested Loop  (cost=0.43..4.86 rows=10 width=22) (actual time=0.006..0.006 rows=0.99 loops=107)
                Output: t.content
                Buffers: shared hit=334
                ->  WorkTable Scan on cte c  (cost=0.00..0.20 rows=10 width=218) (actual time=0.000..0.000 rows=1.00 loops=107)
                      Output: c.content
                ->  Limit  (cost=0.43..0.46 rows=1 width=22) (actual time=0.006..0.006 rows=0.99 loops=107)
                      Output: t.content
                      Buffers: shared hit=334
                      ->  Index Only Scan using test_idx on compub.test1 t  (cost=0.43..69931.86 rows=3333453 width=22) (actual time=0.006..0.006 rows=0.99 loops=107)
                            Output: t.content
                            Index Cond: (t.content > (c.content)::text)
                            Heap Fetches: 0
                            Index Searches: 107
                            Buffers: shared hit=334
Planning Time: 0.154 ms
Execution Time: 0.798 ms

旁门左道之旁门左道

然而某些国产数据库,不支持LATERAL join,只能改写为使用相关子查询实现了:

explain analyze verbose
WITH RECURSIVE cte AS (
    -- 锚点成员:取最小的content(同原逻辑)
    (
        SELECT content
        FROM test1
        ORDER BY content  -- 显式按content排序,取最小的1个
        LIMIT 1
    )
    UNION ALL
    -- 递归成员:用相关子查询找下一个值,用EXISTS控制递归终止
    SELECT (
        -- 相关子查询:为当前c.content找最小的更大值
        SELECT t.content
        FROM test1 t
        WHERE t.content > c.content  -- 只找比当前值大的
        ORDER BY t.content  -- 排序后取最小的
        LIMIT 1  -- 确保只返回1个值
    ) AS content
    FROM cte c -- 每一次递归都是一行当前值
    -- 过滤条件:只有存在更大值时才继续递归
    WHERE EXISTS (
        SELECT 1
        FROM test1 t
        WHERE t.content > c.content
    )
)
SELECT * FROM cte;

测试结果相差不大,耗时1.1ms:

CTE Scan on cte  (cost=61.81..62.43 rows=31 width=218) (actual time=0.029..1.113 rows=107.00 loops=1)
  Output: cte.content
  Storage: Memory  Maximum Storage: 22kB
  Buffers: shared hit=669
  CTE cte
    ->  Recursive Union  (cost=0.43..61.81 rows=31 width=218) (actual time=0.028..1.078 rows=107.00 loops=1)
          Storage: Memory  Maximum Storage: 33kB
          Buffers: shared hit=669
          ->  Limit  (cost=0.43..0.45 rows=1 width=22) (actual time=0.027..0.027 rows=1.00 loops=1)
                Output: test1.content
                Buffers: shared hit=4
                ->  Index Only Scan using test_idx on compub.test1  (cost=0.43..184789.84 rows=10000360 width=22) (actual time=0.026..0.026 rows=1.00 loops=1)
                      Output: test1.content
                      Heap Fetches: 0
                      Index Searches: 1
                      Buffers: shared hit=4
          ->  Nested Loop Semi Join  (cost=0.43..6.10 rows=3 width=218) (actual time=0.009..0.009 rows=0.99 loops=107)
                Output: (SubPlan 1)
                Buffers: shared hit=665
                ->  WorkTable Scan on cte c  (cost=0.00..0.20 rows=10 width=218) (actual time=0.000..0.000 rows=1.00 loops=107)
                      Output: c.content
                ->  Index Only Scan using test_idx on compub.test1 t_1  (cost=0.43..61814.26 rows=3333453 width=22) (actual time=0.005..0.005 rows=0.99 loops=107)
                      Output: t_1.content
                      Index Cond: (t_1.content > (c.content)::text)
                      Heap Fetches: 0
                      Index Searches: 107
                      Buffers: shared hit=334
                SubPlan 1
                  ->  Limit  (cost=0.43..0.46 rows=1 width=22) (actual time=0.003..0.003 rows=1.00 loops=106)
                        Output: t.content
                        Buffers: shared hit=331
                        ->  Index Only Scan using test_idx on compub.test1 t  (cost=0.43..69931.86 rows=3333453 width=22) (actual time=0.003..0.003 rows=1.00 loops=106)
                              Output: t.content
                              Index Cond: (t.content > (c.content)::text)
                              Heap Fetches: 0
                              Index Searches: 106
                              Buffers: shared hit=331
Planning Time: 0.177 ms
Execution Time: 1.153 ms

注意:递归CTE是有限制的,数据库也许会限制递归的深度,也许会爆栈,请仔细测试后再使用。

vacuum full性能暴跌彩蛋:

PostgreSQL中,如果对表进行vacuum full,然后强制其走索引,得到的执行计划如下:

Group  (cost=0.43..600032.52 rows=107 width=22) (actual time=0.019..6437.286 rows=107.00 loops=1)
  Output: content
  Group Key: test1.content
  Buffers: shared hit=6660031
  ->  Index Only Scan using test_idx on test.test1  (cost=0.43..572532.52 rows=11000001 width=22) (actual time=0.018..5664.125 rows=11000001.00 loops=1)
        Output: content
        Heap Fetches: 11000001
        Index Searches: 1
        Buffers: shared hit=6660031
Planning Time: 0.073 ms
Execution Time: 6437.376 ms

根据Kimi的解释:VACUUM FULL 后,Visibility Map 被清空,导致 Index-Only Scan 退化,必须回表查可见性,性能暴跌。

所以不要轻易使用vacuum full,平时autovacuum和vacuum就够了,如果必须 VACUUM FULL,则需要之后手动跑一次 VACUUM 来修复 VM,就能恢复查询性能了。