Just to remind you, we already discussed issues related to isolation, made a digression regarding low-level data structure, and then explored row versions and observed how data snapshots are obtained from row versions.
Now we will proceed to two closely connected problems: in-page vacuum и HOT updates. Both techniques can be referred to optimizations; they are important, but virtually not covered in the documentation.
fillfactor
percent full. In this case, vacuum is performed right away without putting off till next time.fillfactor
is a storage parameter that can be defined for a table (and for an index). PostgresSQL inserts of a new row in a page only if the page is less than fillfactor
percent full. The remaining space is reserved for new tuples that are created as a result of updates. The default value for tables is 100, that is, no space is reserved (and the default value for indexes is 90).=> CREATE TABLE hot(id integer, s char(2000)) WITH (fillfactor = 75);
=> CREATE INDEX hot_id ON hot(id);
=> CREATE INDEX hot_s ON hot(s);
s
column stores only Latin characters, each row version will occupy 2004 bytes plus 24 bytes of a header. We set the fillfactor
storage parameter to 75%, which reserves just enough space for three rows.=> CREATE FUNCTION heap_page(relname text, pageno integer)
RETURNS TABLE(ctid tid, state text, xmin text, xmax text, hhu text, hot text, t_ctid tid)
AS $$
SELECT (pageno,lp)::text::tid AS ctid,
CASE lp_flags
WHEN 0 THEN 'unused'
WHEN 1 THEN 'normal'
WHEN 2 THEN 'redirect to '||lp_off
WHEN 3 THEN 'dead'
END AS state,
t_xmin || CASE
WHEN (t_infomask & 256) > 0 THEN ' (c)'
WHEN (t_infomask & 512) > 0 THEN ' (a)'
ELSE ''
END AS xmin,
t_xmax || CASE
WHEN (t_infomask & 1024) > 0 THEN ' (c)'
WHEN (t_infomask & 2048) > 0 THEN ' (a)'
ELSE ''
END AS xmax,
CASE WHEN (t_infomask2 & 16384) > 0 THEN 't' END AS hhu,
CASE WHEN (t_infomask2 & 32768) > 0 THEN 't' END AS hot,
t_ctid
FROM heap_page_items(get_raw_page(relname,pageno))
ORDER BY lp;
$$ LANGUAGE SQL;
=> CREATE FUNCTION index_page(relname text, pageno integer)
RETURNS TABLE(itemoffset smallint, ctid tid)
AS $$
SELECT itemoffset,
ctid
FROM bt_page_items(relname,pageno);
$$ LANGUAGE SQL;
=> INSERT INTO hot VALUES (1, 'A');
=> UPDATE hot SET s = 'B';
=> UPDATE hot SET s = 'C';
=> UPDATE hot SET s = 'D';
=> SELECT * FROM heap_page('hot',0);
ctid | state | xmin | xmax | hhu | hot | t_ctid
-------+--------+----------+----------+-----+-----+--------
(0,1) | normal | 3979 (c) | 3980 (c) | | | (0,2)
(0,2) | normal | 3980 (c) | 3981 (c) | | | (0,3)
(0,3) | normal | 3981 (c) | 3982 | | | (0,4)
(0,4) | normal | 3982 | 0 (a) | | | (0,4)
(4 rows)
fillfactor
threshold. This is clear from the difference between the pagesize
and upper
values: it exceeds the threshold equal to 75% of the page size, which makes 6144 bytes.=> SELECT lower, upper, pagesize FROM page_header(get_raw_page('hot',0));
lower | upper | pagesize
-------+-------+----------
40 | 64 | 8192
(1 row)
=> UPDATE hot SET s = 'E';
=> SELECT * FROM heap_page('hot',0);
ctid | state | xmin | xmax | hhu | hot | t_ctid
-------+--------+----------+-------+-----+-----+--------
(0,1) | dead | | | | |
(0,2) | dead | | | | |
(0,3) | dead | | | | |
(0,4) | normal | 3982 (c) | 3983 | | | (0,5)
(0,5) | normal | 3983 | 0 (a) | | | (0,5)
(5 rows)
hot_s
index (because page zero is occupied by metainformation):=> SELECT * FROM index_page('hot_s',1);
itemoffset | ctid
------------+-------
1 | (0,1)
2 | (0,2)
3 | (0,3)
4 | (0,4)
5 | (0,5)
(5 rows)
=> SELECT * FROM index_page('hot_id',1);
itemoffset | ctid
------------+-------
1 | (0,5)
2 | (0,4)
3 | (0,3)
4 | (0,2)
5 | (0,1)
(5 rows)
s
, and this is essential.ctid
field.=> DROP INDEX hot_s;
=> TRUNCATE TABLE hot;
=> INSERT INTO hot VALUES (1, 'A');
=> UPDATE hot SET s = 'B';
=> SELECT * FROM heap_page('hot',0);
ctid | state | xmin | xmax | hhu | hot | t_ctid
-------+--------+----------+-------+-----+-----+--------
(0,1) | normal | 3986 (c) | 3987 | t | | (0,2)
(0,2) | normal | 3987 | 0 (a) | | t | (0,2)
(2 rows)
ctid
chain must be followed.=> UPDATE hot SET s = 'C';
=> UPDATE hot SET s = 'D';
=> SELECT * FROM heap_page('hot',0);
ctid | state | xmin | xmax | hhu | hot | t_ctid
-------+--------+----------+----------+-----+-----+--------
(0,1) | normal | 3986 (c) | 3987 (c) | t | | (0,2)
(0,2) | normal | 3987 (c) | 3988 (c) | t | t | (0,3)
(0,3) | normal | 3988 (c) | 3989 | t | t | (0,4)
(0,4) | normal | 3989 | 0 (a) | | t | (0,4)
(4 rows)
=> SELECT * FROM index_page('hot_id',1);
itemoffset | ctid
------------+-------
1 | (0,1)
(1 row)
fillfactor
threshold, so the next update must cause an in-page vacuum. But this time there is a chain of updates in the page. The head of this HOT chain must always remain where it is since it is referenced by the index, while the rest of the pointers can be released: they are known to have no references from the outside.=> UPDATE hot SET s = 'E';
=> SELECT * FROM heap_page('hot',0);
ctid | state | xmin | xmax | hhu | hot | t_ctid
-------+---------------+----------+-------+-----+-----+--------
(0,1) | redirect to 4 | | | | |
(0,2) | normal | 3990 | 0 (a) | | t | (0,2)
(0,3) | unused | | | | |
(0,4) | normal | 3989 (c) | 3990 | t | t | (0,2)
(4 rows)
=> UPDATE hot SET s = 'F';
=> UPDATE hot SET s = 'G';
=> SELECT * FROM heap_page('hot',0);
ctid | state | xmin | xmax | hhu | hot | t_ctid
-------+---------------+----------+----------+-----+-----+--------
(0,1) | redirect to 4 | | | | |
(0,2) | normal | 3990 (c) | 3991 (c) | t | t | (0,3)
(0,3) | normal | 3991 (c) | 3992 | t | t | (0,5)
(0,4) | normal | 3989 (c) | 3990 (c) | t | t | (0,2)
(0,5) | normal | 3992 | 0 (a) | | t | (0,5)
(5 rows)
=> UPDATE hot SET s = 'H';
=> SELECT * FROM heap_page('hot',0);
ctid | state | xmin | xmax | hhu | hot | t_ctid
-------+---------------+----------+-------+-----+-----+--------
(0,1) | redirect to 5 | | | | |
(0,2) | normal | 3993 | 0 (a) | | t | (0,2)
(0,3) | unused | | | | |
(0,4) | unused | | | | |
(0,5) | normal | 3992 (c) | 3993 | t | t | (0,2)
(5 rows)
fillfactor
parameter in order to reserve some page space for updates. We should, however, take into account that the less the fillfactor
, the more free space is left in a page, so the physical size of the table increases.| => BEGIN ISOLATION LEVEL REPEATABLE READ;
| => SELECT count(*) FROM hot;
| count
| -------
| 1
| (1 row)
=> UPDATE hot SET s = 'I';
=> UPDATE hot SET s = 'J';
=> UPDATE hot SET s = 'K';
=> SELECT * FROM heap_page('hot',0);
ctid | state | xmin | xmax | hhu | hot | t_ctid
-------+---------------+----------+----------+-----+-----+--------
(0,1) | redirect to 2 | | | | |
(0,2) | normal | 3993 (c) | 3994 (c) | t | t | (0,3)
(0,3) | normal | 3994 (c) | 3995 (c) | t | t | (0,4)
(0,4) | normal | 3995 (c) | 3996 | t | t | (0,5)
(0,5) | normal | 3996 | 0 (a) | | t | (0,5)
(5 rows)
=> UPDATE hot SET s = 'L';
| => COMMIT; -- snapshot no longer needed
=> SELECT * FROM heap_page('hot',0);
ctid | state | xmin | xmax | hhu | hot | t_ctid
-------+---------------+----------+----------+-----+-----+--------
(0,1) | redirect to 2 | | | | |
(0,2) | normal | 3993 (c) | 3994 (c) | t | t | (0,3)
(0,3) | normal | 3994 (c) | 3995 (c) | t | t | (0,4)
(0,4) | normal | 3995 (c) | 3996 (c) | t | t | (0,5)
(0,5) | normal | 3996 (c) | 3997 | | t | (1,1)
(5 rows)
=> SELECT * FROM heap_page('hot',1);
ctid | state | xmin | xmax | hhu | hot | t_ctid
-------+--------+------+-------+-----+-----+--------
(1,1) | normal | 3997 | 0 (a) | | | (1,1)
(1 row)
=> SELECT * FROM index_page('hot_id',1);
itemoffset | ctid
------------+-------
1 | (1,1)
2 | (0,1)
(2 rows)
Unfortunately, the documentation virtually lacks information on in-page vacuum and HOT updates, and you should look for answers in the source code. I advise you to start with README.HOT.
К сожалению, не доступен сервер mySQL