Убрать из плана seq scan

polin11 · 14 сен 2022

Использую СУБД PGSQL.
Есть таблица Documents в ней поле Account - целое, User - массив строк.

Происходит удаление пользователя, нужно этого пользователя удалить из User,
если после удаления массив пустой, то нужно удалить запись.
Написал такой запрос, но он стал жутко тормозить.

PHP:

WITH user AS (

SELECT DISTINCT UNNEST(ARRAY['27672','6145']) AS "Id"

),

ar AS(

SELECT

"@Id" AS "Id",

array_remove("User", user."Id") AS "NewUser",

CASE

WHEN array_length(array_remove("User", user."Id"), 1) > 0 THEN FALSE

ELSE TRUE

END AS "NeedDelete"

FROM "Documents" ar, user

WHERE

"Account" = 4777912 AND

array_position(ar."User", user."Id") IS NOT NULL

),

delete_ar AS(

DELETE FROM "Documents"

USING ar

WHERE "@Id" = ar."Id" AND

ar."NeedDelete" IS TRUE

)

UPDATE "Documents"

SET "User" = ar."NewUser"

FROM ar

WHERE ar."Id" = "@Id" AND

ar."NeedDelete" IS FALSE

PHP:

Update on "Documents" (cost=19771027.17..22312135.48 rows=2074177 width=134) (actual time=188992.755..188992.755 rows=0 loops=1)

Buffers: shared hit=22742 read=2790563 dirtied=121272 written=56091, temp read=8246 written=1402207

CTE user

-> HashAggregate (cost=0.02..0.52 rows=100 width=32) (actual time=0.007..0.010 rows=2 loops=1)

Group Key: unnest('{27672,6145}'::text[])

-> Result (cost=0.00..0.01 rows=1 width=32) (actual time=0.003..0.004 rows=2 loops=1)

CTE ar

-> Nested Loop (cost=0.57..237640.03 rows=4148354 width=37) (actual time=59.829..484.828 rows=44 loops=1)

Join Filter: (array_position(ar_1."User", user."Id") IS NOT NULL)

Rows Removed by Join Filter: 24234

Buffers: shared hit=7660 read=1433 dirtied=13

-> CTE Scan on user (cost=0.00..2.00 rows=100 width=32) (actual time=0.008..0.015 rows=2 loops=1)

-> Materialize (cost=0.57..133720.72 rows=41692 width=35) (actual time=0.649..236.659 rows=12139 loops=2)

Buffers: shared hit=7660 read=1433 dirtied=13

-> Index Scan using "iDocuments-AccountDocument" on "Documents" ar_1 (cost=0.57..133512.26 rows=41692 width=35) (actual time=1.296..466.032 rows=12139 loops=1)

Index Cond: ("Account" = '4777912'::bigint)

Buffers: shared hit=7660 read=1433 dirtied=13

CTE delete_ar

-> Delete on "Documents" "Documents_1" (cost=8512344.15..10206749.46 rows=2074177 width=34) (actual time=148112.808..148112.808 rows=0 loops=1)

Buffers: shared hit=3422 read=2800776 dirtied=29, temp read=59703 written=801388

-> Hash Join (cost=8512344.15..10206749.46 rows=2074177 width=34) (actual time=143942.901..148111.006 rows=38 loops=1)

Hash Cond: (ar_2."Id" = "Documents_1"."@Id")

Buffers: shared hit=3388 read=2800772 dirtied=3, temp read=59703 written=801388

-> CTE Scan on ar ar_2 (cost=0.00..82967.08 rows=2074177 width=32) (actual time=0.030..0.104 rows=38 loops=1)

Filter: ("NeedDelete" IS TRUE)

Rows Removed by Filter: 6

-> Hash (cost=4888739.29..4888739.29 rows=208459029 width=10) (actual time=143820.738..143820.738 rows=205483044 loops=1)

Buckets: 1048576 Batches: 512 Memory Usage: 25459kB

Buffers: shared hit=3388 read=2800772 dirtied=3, temp written=800839

-> Seq Scan on "Documents" "Documents_1" (cost=0.00..4888739.29 rows=208459029 width=10) (actual time=0.040..85509.087 rows=205483044 loops=1)

Buffers: shared hit=3388 read=2800772 dirtied=3

-> Hash Join (cost=9326637.15..11867745.46 rows=2074177 width=134) (actual time=187388.784..188988.878 rows=6 loops=1)

Hash Cond: (ar."Id" = "Documents"."@Id")

Buffers: shared hit=22689 read=2790553 dirtied=121259 written=56091, temp read=8246 written=1402207

-> CTE Scan on ar (cost=0.00..82967.08 rows=2074177 width=96) (actual time=297.173..484.991 rows=6 loops=1)

Filter: ("NeedDelete" IS FALSE)

Rows Removed by Filter: 38

Buffers: shared hit=7660 read=1433 dirtied=13

-> Hash (cost=4888739.29..4888739.29 rows=208459029 width=42) (actual time=186809.680..186809.680 rows=205483044 loops=1)

Buckets: 524288 Batches: 1024 Memory Usage: 17420kB

Buffers: shared hit=15029 read=2789120 dirtied=121246 written=56091, temp written=1401178

-> Seq Scan on "Documents" (cost=0.00..4888739.29 rows=208459029 width=42) (actual time=0.712..100785.196 rows=205483044 loops=1)

Buffers: shared hit=15029 read=2789120 dirtied=121246 written=56091

Хочется переписать запрос, чтобы не было seq scan,
а использовался индекс @ID из Documents при удалении и обновлении записей

Быстрый поиск

Убрать из плана seq scan

polin11 Активный пользователь