삽질/개발,엔지니어링

PostgreSQL JSONB 인덱스 생성 전후 성능 확인

maengis 2023. 9. 5. 15:53

https://medium.com/geekculture/postgres-jsonb-usage-and-performance-analysis-cdbd1242a018

 

Postgres JSONB Usage and performance analysis

This story focuses on various features that JSONB provides and formulated with a sample and explained with scenarios where unstructured…

medium.com

 

무슨 데이터가 들어올지 모르는 상황이라 JSONB를 쓰는 게 좋을 거 같아서 확인 했다.

테스트에 사용한 데이터는 1,000만 건이고, 사용한 데이터는  https://dummyjson.com/users?limit=100 인데, 여기에 random_string이라는 키에 알파벳 소문자를 임의로 20자 넣고 이걸 인덱스 걸어서 확인 했다.

user['random_string'] = ''.join(random.choice(string.ascii_lowercase) for _ in range(20))
CREATE TABLE test_table (
    idx serial PRIMARY KEY,
    foobar jsonb
);

-- 인덱스 생성
CREATE INDEX idx_foobar_random_string ON test_table ((foobar→>'random_string'));

 

1. 인덱스 생성 전 =(equal) 조건 조회 (23830.399 ms)

logeye=> EXPLAIN ANALYZE SELECT COUNT(*) FROM test_table WHERE foobar->>'random_string' = 'kmynbffmjgrueozktkus';
                                                                   QUERY PLAN                                
                                    
-------------------------------------------------------------------------------------------------------------
------------------------------------
 Finalize Aggregate  (cost=2055842.46..2055842.47 rows=1 width=8) (actual time=23869.751..23869.751 rows=1 lo
ops=1)
   ->  Gather  (cost=2055842.24..2055842.45 rows=2 width=8) (actual time=23869.717..23869.746 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Partial Aggregate  (cost=2054842.24..2054842.25 rows=1 width=8) (actual time=23867.279..23867.28
0 rows=1 loops=3)
               ->  Parallel Seq Scan on test_table  (cost=0.00..2054796.58 rows=18265 width=0) (actual time=1
5910.766..23867.273 rows=0 loops=3)
                     Filter: ((foobar ->> 'random_string'::text) = 'kmynbffmjgrueozktkus'::text)
                     Rows Removed by Filter: 3333333
 Planning time: 0.046 ms
 Execution time: 23870.399 ms
(10 rows)

2. 인덱스 생성 후 =(equal) 조건 조회 (0.091 ms)

=> EXPLAIN ANALYZE SELECT COUNT(*) FROM test_table WHERE foobar->>'random_string' = 'kmynbffmjgrueozktkus';
QUERY PLAN
 
-------------------------------------------------------------------------------------------------------------
---------------------------------
Aggregate (cost=176507.63..176507.64 rows=1 width=8) (actual time=0.045..0.045 rows=1 loops=1)
-> Bitmap Heap Scan on test_table (cost=1380.06..176382.63 rows=50000 width=0) (actual time=0.040..0.040
rows=1 loops=1)
Recheck Cond: ((foobar ->> 'random_string'::text) = 'kmynbffmjgrueozktkus'::text)
Heap Blocks: exact=1
-> Bitmap Index Scan on idx_foobar_random_string (cost=0.00..1367.56 rows=50000 width=0) (actual t
ime=0.036..0.036 rows=1 loops=1)
Index Cond: ((foobar ->> 'random_string'::text) = 'kmynbffmjgrueozktkus'::text)
Planning time: 0.267 ms
Execution time: 0.091 ms
(8 rows)

3. 인덱스 생성 전 LIKE(startswith) 조건 조회 (23471.324 ms)

=> EXPLAIN ANALYZE SELECT COUNT(*) FROM test_table WHERE foobar->>'random_string' LIKE 'gf%';
QUERY PLAN
 
-------------------------------------------------------------------------------------------------------------
-----------------------------------
Finalize Aggregate (cost=2055842.46..2055842.47 rows=1 width=8) (actual time=24270.590..24270.590 rows=1 lo
ops=1)
-> Gather (cost=2055842.24..2055842.45 rows=2 width=8) (actual time=24270.564..24270.584 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Partial Aggregate (cost=2054842.24..2054842.25 rows=1 width=8) (actual time=24268.166..24268.16
7 rows=1 loops=3)
-> Parallel Seq Scan on test_table (cost=0.00..2054796.58 rows=18265 width=0) (actual time=3
.885..24265.441 rows=4921 loops=3)
Filter: ((foobar ->> 'random_string'::text) ~~ 'gf%'::text)
Rows Removed by Filter: 3328412
Planning time: 0.093 ms
Execution time: 24271.324 ms
(10 rows)

4. 인덱스 생성 후 LIKE(startswith) 조건 조회 (743.693 ms)

=> EXPLAIN ANALYZE SELECT COUNT(*) FROM test_table WHERE foobar->>'random_string' LIKE 'gf%';
QUERY PLAN
 
-------------------------------------------------------------------------------------------------------------
-------------------------------------
Aggregate (cost=176632.63..176632.64 rows=1 width=8) (actual time=743.653..743.653 rows=1 loops=1)
-> Bitmap Heap Scan on test_table (cost=1505.06..176507.63 rows=50000 width=0) (actual time=5.665..739.8
87 rows=14763 loops=1)
Filter: ((foobar ->> 'random_string'::text) ~~ 'gf%'::text)
Heap Blocks: exact=14726
-> Bitmap Index Scan on idx_foobar_random_string (cost=0.00..1492.56 rows=50000 width=0) (actual t
ime=3.363..3.363 rows=14763 loops=1)
Index Cond: (((foobar ->> 'random_string'::text) >= 'gf'::text) AND ((foobar ->> 'random_strin
g'::text) < 'gg'::text))
Planning time: 0.203 ms
Execution time: 743.693 ms
(8 rows)

 

 

반응형