streambion.blogg.se - Postgres ilike

The Btree index doesn’t support for wildcard searches. Now Postgres query optimizer had used the index that we created earlier. Now let us re arrange the query to something like thisĮxplain analyze select * from user_details where ( upper(first_name)) = ‘UHXMMMM’ QUERY PLAN - Bitmap Heap Scan on user_details (cost=1443.74 rows=50750 width=98) (actual time=0.027.0.027 Recheck Cond: (upper(first_name) = 'UHXMMMM'::text) -> Bitmap Index Scan on in_user_details_first_name_last_name (cost=.18 rows=50750 width=0) Index Cond: (upper(first_name) = 'UHXMMMM'::text) Planning Time: 0.098 ms JIT: Functions: 2 Options: Inlining false, Optimization false, Expressions true, Deforming true Timing: Generation 0.756 ms, Inlining 0.000 ms, Optimization 0.000 ms, Emission 0.000 ms, Total 0.756 m Execution Time: 0.905 ms Filter: ((upper(first_name) ~~ '%UHX%'::text) OR (upper(last_name) ~~ '%UHX%'::text)) Rows Removed by Filter: 3380952 Planning Time: 4.436 ms JIT: Functions: 6 Options: Inlining false, Optimization false, Expressions true, Deforming true Timing: Generation 1.492 ms, Inlining 0.000 ms, Optimization 1.868 ms, Emission 15.460 ms, Total 18.820 Execution Time: 1041.511 msĮven if there is an index postgres query optimizer didn’t scan index instead did sequential scan.

The order is important in compound indexes otherwise the index won’t be scanned by postgres query optimizer. Let us try to execute query again as index is created on both the columns. CREATE INDEX in_user_details_first_name_last_name ON user_details USING btree ( upper(first_name), upper(last_name))

Here, in this case index has to be created for upper case for both first_name and last_name as case insensitivity involved. We do not have any indexes on columns, the query execution time > 1 sec and this can increase more if more rows matches for the wildcard.īasic way to improve query performance is to create index. The sequential scan is performed on the query. Query: explain analyze select * from user_details where ( upper (first_name) like '%UHX%' or upper (last_name) like '%UHX%') QUERY PLAN - Gather (cost=1053.53 rows=795761 width=98) (actual time=3.773.1023.569 rows=7156 loops=1) Workers Planned: 2 Workers Launched: 2 -> Parallel Seq Scan on user_details (cost=77.42 rows=331567 width=98) (actual time=6.108.1 Filter: ((upper(first_name) ~~ '%UHX%'::text) OR (upper(last_name) ~~ '%UHX%'::text)) Rows Removed by Filter: 3380952 Planning Time: 0.068 ms JIT: Functions: 6 Options: Inlining false, Optimization false, Expressions true, Deforming true Timing: Generation 1.614 ms, Inlining 0.000 ms, Optimization 0.972 ms, Emission 12.327 ms, Total 14.913 Execution Time: 1024.575 ms Now we have 10 Million rows in the table, lets try the query for wildcard that matches either first_name or last_name. CREATE EXTENSION IF NOT EXISTS "uuid-ossp" do $$ begin for r in 0.10000000 loop insert into user_details (user_uuid, first_name, last_name, email_id, security_number, phone_number) values(uuid_generate_v4(), get_random_string(8), get_random_string(7), substring( md5( random():: text), 0, 9), 'gmailcom'), upper( concat( substring( md5( random():: text), 0, 12))), cast( floor( random()*(9999999999-1000000000+1))+1000000000 as text)) end loop end $$įunction :- get_random_string (, ) CREATE FUNCTION get_random_string( IN string_length INTEGER, IN possible_chars TEXT DEFAULT 'abcdefghijklmnopqrstuvwxyz') RETURNS text LANGUAGE plpgsql AS $$ DECLARE output TEXT = '' i INT4 pos INT4 BEGIN FOR i IN 1.string_length LOOP pos := 1 + CAST( random() * ( LENGTH(possible_chars) - 1) AS INT4 ) output := output || substr(possible_chars, pos, 1) END LOOP RETURN output END $$ Our requirement is to search for name which matches from either from first_name or last_name. Insert around 10 Million records in to the table so that we could analyse the performance of search query. CREATE TABLE user_details ( id BIGSERIAL PRIMARY KEY, user_uuid UUID NOT NULL, first_name TEXT NOT NULL, last_name TEXT NOT NULL, email_id TEXT NOT NULL, security_number TEXT NOT NULL, phone_number TEXT NOT NULL, created_at TIMESTAMP NULL DEFAULT now(), updated_at TIMESTAMP NULL DEFAULT now() ) I would like to demonstrate with a sample database and data in it. The challenge we have is wildcard search is becoming costly and takes more time to fetch results.

I am working on one of my projects where Postgres happens to be the relational database and has a big table, currently had 10 Million records and can bloat up more. Performance Optimisation for Wildcards Search in Postgres (Trigram Index)