Using the “hint_plan” Table Provided by the PostgreSQL Extension “pg_hint_plan”

Introduction

For those who have worked with Oracle, the pg_hint_plan extension lets you hint plans in ways that will feel familiar from features such as:

  • sql_patch
  • sql_profile
  • sql_plan_baselines

While the functionality provided by pg_hint_plan is currently not nearly as robust (see the extension's hints list), it does provide most of what you would encounter day to day as a DBA. That said, one thing currently missing is the ability to add hints without changing code by calling stored procedures or functions, as you can in Oracle. The only way to do this in open source PostgreSQL today is to manually manipulate a table named “hints”, typically located in the “hint_plan” schema.
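
As a rough illustration of what "manually manipulating" the table means, a direct insert might look like the sketch below. The column names are the ones the helper functions later in this post write to; the statement text and the hint are made up for the example.

```sql
-- Sketch only: the statement text and the hint are illustrative.
-- Column names match those used by the helper functions in this post.
INSERT INTO hint_plan.hints (norm_query_string, application_name, hints)
VALUES (
    'select count(*) from pgbench_accounts a where a.bid = ?;',
    '',            -- empty string: the hint applies to any calling application
    'SeqScan(a)'
);
```

The hard part is getting the first value exactly right, which is what the rest of this post is about.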

The “hints” table provided by the extension is highly dependent (just like Oracle) on a normalized SQL statement. A normalized SQL statement in PostgreSQL is one that has all line breaks removed, all runs of whitespace converted to single spaces, and all literals and parameters replaced with a “?”. Typically you have to do this manually, but in this blog post I am going to show how I have leveraged entries in “pg_stat_statements” along with custom-written functions to normalize the statement and place it into the “hints” table. To use this “hints” table feature, the following setting must be enabled at either the session or system level:

set session pg_hint_plan.enable_hint_table to on;

or in the postgresql.conf:

pg_hint_plan.enable_hint_table = on
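
A quick way to confirm the setting took effect in your session is sketched below. It assumes the pg_hint_plan library can be loaded with LOAD in the current session (it may already be loaded through shared_preload_libraries or session_preload_libraries, in which case the LOAD is harmless).

```sql
-- Sketch: confirm the extension objects exist and the hint table is enabled.
LOAD 'pg_hint_plan';                      -- no-op if the library is already loaded

SELECT extname, extversion
FROM pg_extension
WHERE extname = 'pg_hint_plan';           -- no row means CREATE EXTENSION was never run

SHOW pg_hint_plan.enable_hint_table;      -- should report "on" once enabled
```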

What Does a Normalized Statement Look Like?

Typically, when you receive code from a developer, or even code that you work on yourself, you format it to make it human readable and easier to interpret. For example, you might want your statement to look like this (notice the parameters / literals in the statement):

SELECT
    b.bid,
    sum(abalance)
FROM
    pgbench_branches b
    JOIN pgbench_accounts a ON (b.bid = a.bid)
WHERE
    b.bid = 12345
    AND a.aid BETWEEN 100 AND 200
GROUP BY
    b.bid
ORDER BY
    1;

Now to normalize the statement for use with the “hints” table it needs to look like this:

select b.bid, sum(abalance) from pgbench_branches b join pgbench_accounts a on (b.bid = a.bid) where b.bid = ? and a.aid between ? and ? group by b.bid order by 1;
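
The transformation can be tried on a plain string before involving any other objects. The sketch below assumes the input already looks like what pg_stat_statements stores, that is, with literals shown as $1, $2 and so on; the sample text and the alias "sample" are only for illustration.

```sql
-- Sketch: apply the two substitutions to a sample string.
SELECT regexp_replace(
         regexp_replace(sample.query, '\$\d+', '?', 'g'),  -- $1, $2, ... become ?
         '\s+', ' ', 'g'                                   -- each whitespace run becomes one space
       ) || ';' AS norm_query_string
FROM (VALUES ('select b.bid, sum(abalance)
from pgbench_branches b
    join pgbench_accounts a on (b.bid = a.bid)
where b.bid = $1
  and a.aid between $2 and $3
group by b.bid
order by 1')) AS sample(query);
```

With the default standard_conforming_strings = on, this should return the single-line statement shown above. Note that it does not change letter case, so the stored text keeps whatever case the statement was submitted in.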

You can either manually manipulate the statement to get it into this format or attempt to do it programmatically. I prefer to let the system format it for me as much as possible, so I have written a few helper scripts to do this:

Helper Queries:

**** Feel free to use these functions; however, they may contain errors or may not normalize all statements. They depend on the pg_stat_statements view, and if the entire statement does not fit within the query field of that view, these functions will not produce the correct output. I will also place them on my public github. If you find any errors or omissions, please let me know. ****

hint_plan.display_candidate_pg_hint_plan_queries

While you can easily select from the “hints” table on your own, this function shows what a normalized statement will look like before you load it into the table. You can leave the “p_query_id” parameter null to return all queries present in pg_stat_statements in normalized form, or you can populate it with a valid “queryid” to return a single normalized statement:

CREATE OR REPLACE FUNCTION hint_plan.display_candidate_pg_hint_plan_queries(
  p_query_id bigint default null
  )
  RETURNS TABLE(queryid bigint, norm_query_string text)
  LANGUAGE plpgsql
  COST 100
  VOLATILE PARALLEL UNSAFE
AS $BODY$
 DECLARE
 	pg_stat_statements_exists boolean := false;
 BEGIN
   -- Assumes the pg_stat_statements view was created in the public schema.
   SELECT EXISTS (
    SELECT FROM
        information_schema.tables
    WHERE
        table_schema = 'public' AND
        table_type = 'VIEW' AND
        table_name = 'pg_stat_statements'
    ) INTO pg_stat_statements_exists;
   IF NOT pg_stat_statements_exists THEN
    RAISE NOTICE 'pg_stat_statements extension has not been loaded, exiting';
    RETURN;
   END IF;
   -- Replace $n placeholders with ?, collapse every run of whitespace
   -- (CR, LF, tab, spaces) to one space, then append the terminator.
   -- The full text is returned (no truncation) so it can be compared
   -- with what will be stored. A null p_query_id returns every statement.
   RETURN QUERY
   SELECT pss.queryid,
          regexp_replace(
             regexp_replace(pss.query, '\$\d+', '?', 'g'),
                            '\s+', ' ', 'g') || ';'
   FROM pg_stat_statements pss
   WHERE p_query_id IS NULL OR pss.queryid = p_query_id;
 END;
$BODY$;
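
To call the function for a single statement you first need its queryid. One way to find likely candidates is sketched below; it assumes PostgreSQL 13 or later, where the timing column is named total_exec_time (older releases call it total_time), and the aliases are only illustrative.

```sql
-- Sketch: list the ten statements with the highest cumulative execution time.
SELECT queryid,
       calls,
       round(total_exec_time::numeric, 2) AS total_ms,
       left(query, 60)                    AS query_start
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
```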

If our candidate query was this:

select queryid, query from pg_stat_statements where queryid =  -8949523101378282526;
       queryid        |            query
----------------------+-----------------------------
 -8949523101378282526 | select b.bid, sum(abalance)+
                      | from pgbench_branches b    +
                      | join pgbench_accounts a    +
                      | on (b.bid = a.bid)         +
                      | where b.bid = $1           +
                      | group by b.bid             +
                      | order by 1
(1 row)

The display function would return the following normalized query:

SELECT hint_plan.display_candidate_pg_hint_plan_queries(p_query_id => -8949523101378282526);
-[ RECORD 1 ]--------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------
display_candidate_pg_hint_plan_queries | (-8949523101378282526,"select b.bid, sum(abalance) from pgbench_branches b join pgbench_accounts a on (b.bid = a.bid) where b.bid = ? group by b.bid order by 1;")

You can then verify that the query is normalized properly and move on to the next function, which adds the normalized query to the “hints” table.

hint_plan.add_stored_pg_hint_plan

Using the same query from the previous section, we will now add it to the “hints” table. This is where it is important to understand which hint you want to add.

CREATE OR REPLACE FUNCTION hint_plan.add_stored_pg_hint_plan(
  p_query_id bigint,
  p_hint_text text,
  p_application_name text default ''
  )
  RETURNS varchar
  LANGUAGE plpgsql
  COST 100
  VOLATILE PARALLEL UNSAFE
AS $BODY$
-- p_hint_text can contain one or more hints separated by a space or
-- a newline character.  Examples include:
-- Space Separated: SeqScan(a) Parallel(a 0 hard)
-- Newline (chr(10)) Separated: SeqScan(a)'||chr(10)||'Parallel(a 0 hard)
-- Single Hint: SeqScan(a)
--
-- Escaped text does not work: E'SeqScan(a)\nParallel(a 0 hard)'
 DECLARE
 	hint_id hint_plan.hints.id%TYPE;
 	normalized_query_text hint_plan.hints.norm_query_string%TYPE;
 	pg_stat_statements_exists boolean := false;
 BEGIN
   -- Assumes the pg_stat_statements view was created in the public schema.
   SELECT EXISTS (
    SELECT FROM
        information_schema.tables
    WHERE
        table_schema = 'public' AND
        table_type = 'VIEW' AND
        table_name = 'pg_stat_statements'
    ) INTO pg_stat_statements_exists;
   IF NOT pg_stat_statements_exists THEN
    RAISE NOTICE 'pg_stat_statements extension has not been loaded, exiting';
    RETURN 'error';
   ELSE
    -- Replace $n placeholders with ?, collapse whitespace runs, append ';'.
    -- The same queryid can appear once per user and database, so take one row.
    SELECT regexp_replace(
              regexp_replace(query, '\$\d+', '?', 'g'),
                             '\s+', ' ', 'g') || ';'
 	 INTO normalized_query_text
 	 FROM pg_stat_statements where queryid = p_query_id
 	 LIMIT 1;
     IF normalized_query_text IS NOT NULL THEN
		-- RETURNING gives back the id of the row just inserted, even when the
		-- same statement is already stored under another application_name.
		INSERT INTO hint_plan.hints(norm_query_string, application_name, hints)
    	VALUES (normalized_query_text,
    			p_application_name,
    			p_hint_text
    	)
    	RETURNING id INTO hint_id;
 	    RETURN cast(hint_id as text);
     ELSE
 		RAISE NOTICE 'Query ID % does not exist in pg_stat_statements', p_query_id;
 		RETURN 'error';
     END IF;
   END IF;
 END;
$BODY$;

Hint text can contain one or more hints separated by either a space or a newline character. Examples include:

  • Space Separated: SeqScan(a) Parallel(a 0 hard)
  • Newline (chr(10)) Separated: SeqScan(a)'||chr(10)||'Parallel(a 0 hard)
  • Single Hint: SeqScan(a)
  • Escaped text does not work in the context of this function, although it can be used if you are inserting manually into the “hints” table: E'SeqScan(a)\nParallel(a 0 hard)'

A call using the space separated form looks like this:

SELECT hint_plan.add_stored_pg_hint_plan(p_query_id => -8949523101378282526,
						p_hint_text => 'SeqScan(a) Parallel(a 0 hard)',
						p_application_name => '');

-[ RECORD 1 ]-----------+---
add_stored_pg_hint_plan | 28

Time: 40.889 ms

select * from hint_plan.hints where id = 28;
-[ RECORD 1 ]-----+------------------------------------------------------------------------------------------------------------------------------------------
id                | 28
norm_query_string | select b.bid, sum(abalance) from pgbench_branches b join pgbench_accounts a on (b.bid = a.bid) where b.bid = ? group by b.bid order by 1;
application_name  |
hints             | SeqScan(a) Parallel(a 0 hard)

In the above example, we are forcing a serial sequential scan of the “pgbench_accounts” table. We left the “application name” parameter empty so that the hint applies to any calling application.
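
To check that the stored hint is actually picked up, one option is to turn on pg_hint_plan's debug output and run the statement. This is a sketch under a few assumptions: that the debug messages are emitted at the LOG level (the default for pg_hint_plan.message_level as far as I know), and that the statement text has to match the stored string apart from the literal values, which is why it is written on one line with single spaces and run directly rather than wrapped in EXPLAIN. The literal 1 is just a sample value.

```sql
-- Sketch: ask pg_hint_plan to report which hints it used for a statement.
SET pg_hint_plan.enable_hint_table TO on;
SET pg_hint_plan.debug_print TO on;
SET client_min_messages TO log;   -- so LOG-level messages reach the client

-- Same text as the stored norm_query_string, with a literal in place of "?".
select b.bid, sum(abalance) from pgbench_branches b join pgbench_accounts a on (b.bid = a.bid) where b.bid = 1 group by b.bid order by 1;

RESET client_min_messages;
RESET pg_hint_plan.debug_print;
```

If the lookup matched, the messages should list the stored hints under a "used hint" heading; if nothing is reported, whitespace or case differences between the executed text and the stored string are the first thing I would check.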

hint_plan.delete_stored_pg_hint_plan

You could just issue a delete against the “hints” table, but in keeping with the function-based approach used so far, a delete helper has also been developed:

CREATE OR REPLACE FUNCTION hint_plan.delete_stored_pg_hint_plan(
  p_hint_id bigint
  )
  RETURNS TABLE(id integer, norm_query_string text, application_name text, hints text)
  LANGUAGE 'plpgsql'
  COST 100
  VOLATILE PARALLEL UNSAFE
AS $BODY$
 BEGIN
    RETURN QUERY
    DELETE FROM hint_plan.hints h WHERE h.id = p_hint_id RETURNING *;
 END; 
$BODY$;
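
The function needs the id of the row to remove. If you did not keep the value returned when the hint was added, a listing like the following sketch will show what is currently stored (the 80 character cut-off and the alias are arbitrary choices for readability).

```sql
-- Sketch: list stored hints so the id to delete can be picked out.
SELECT id,
       application_name,
       hints,
       left(norm_query_string, 80) AS query_start
FROM hint_plan.hints
ORDER BY id;
```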

To delete a hint, you can call the function as follows:

 SELECT hint_plan.delete_stored_pg_hint_plan(p_hint_id => 28);
-[ RECORD 1 ]--------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
delete_stored_pg_hint_plan | (28,"select b.bid, sum(abalance) from pgbench_branches b join pgbench_accounts a on (b.bid = a.bid) where b.bid = ? group by b.bid order by 1;","","SeqScan(a) Parallel(a 0 hard)")

Time: 33.685 ms
select * from hint_plan.hints where id = 28;
(0 rows)

Time: 24.868 ms

As you can see, the “hints” table is very useful and can help you emulate many parts of Oracle's SQL Plan Management.

Enjoy and all feedback is welcomed!!!