Wednesday, 11 September 2013

Why does this SUM() function take so long in PostgreSQL?

Why does this SUM() function take so long in PostgreSQL?

This is my query:
SELECT SUM(amount) FROM bill WHERE name = 'peter'
There are 800K+ rows in the table. EXPLAIN ANALYZE says:
Aggregate (cost=288570.06..288570.07 rows=1 width=4) (actual
time=537213.327..537213.328 rows=1 loops=1)
-> Seq Scan on bill (cost=0.00..288320.94 rows=498251 width=4) (actual
time=48385.201..535941.041 rows=800947 loops=1)
Filter: ((name)::text = 'peter'::text)
Rows Removed by Filter: 8
Total runtime: 537213.381 ms
All rows are affected, and this is correct. But why so long? A similar
query without WHERE runs way faster:
ANALYZE EXPLAIN SELECT SUM(amount) FROM bill
Aggregate (cost=137523.31..137523.31 rows=1 width=4) (actual
time=2198.663..2198.664 rows=1 loops=1)
-> Index Only Scan using idx_amount on bill (cost=0.00..137274.17
rows=498268 width=4) (actual time=0.032..1223.512 rows=800955 loops=1)
Heap Fetches: 533399
Total runtime: 2198.717 ms
I have an index on amount and an index on name. Have I missed any indexes?
ps. I managed to solve the problem just by adding a new idex ON bill(name,
amount). I didn't get why it helped, so let's leave the question open for
some time...

No comments:

Post a Comment