Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Look in to changing how array.includes() criteria method works #968

Open
jwoertink opened this issue Sep 7, 2023 · 3 comments
Open

Look in to changing how array.includes() criteria method works #968

jwoertink opened this issue Sep 7, 2023 · 3 comments

Comments

@jwoertink
Copy link
Member

# WHERE `value` = ANY(column)
def includes(value) : T
value = V.adapter.to_db!(value)
add_clause(Avram::Where::Includes.new(column, value))
end

Using this method in a query is handy. If one of your columns is an Array(?) column, you can get records where that array contains some value

# SELECT * FROM users WHERE 'lucky' = ANY(tags)
UseryQuery.new.tags.includes("lucky")

However, I just learned that if you put a GIN index on tags, postgres won't actually use the index in this way...

Using a GIN index

=# CREATE INDEX users_tags_index ON users USING gin (tags);

Query as-is now

=# EXPLAIN SELECT * FROM users WHERE 'lucky' = ANY(tags);
                                    QUERY PLAN                                     
-----------------------------------------------------------------------------------
 Seq Scan on users  (cost=0.00..5748.84 rows=208 width=10)
   Filter: ('lucky'::text = ANY (tags))
(2 rows)

=# SELECT * FROM users WHERE 'lucky' = ANY(tags);
Time: 8.938 ms

But there seems to be an alternate way of doing this query which is a ton faster

=# EXPLAIN SELECT * FROM users WHERE tags @> '{"lucky"}';
                                         QUERY PLAN                                          
---------------------------------------------------------------------------------------------
 Bitmap Heap Scan on users  (cost=9.61..697.01 rows=208 width=10)
   Recheck Cond: (tags @> '{lucky}'::text[])
   ->  Bitmap Index Scan on users_tags_index  (cost=0.00..9.56 rows=208 width=0)
         Index Cond: (tags @> '{lucky}'::text[])
(4 rows)

=# SELECT * FROM users WHERE tags @> '{"lucky"}';
Time: 0.533 ms

This is a pretty big improvement. Though, I should note that without an index, the first way queries twice as fast as the second way...

=# SELECT * FROM users WHERE 'lucky' = ANY(tags);
Time: 10.897 ms
=# SELECT * FROM users WHERE tags @> '{"lucky"}';
Time: 23.410 ms
@robacarp
Copy link
Contributor

robacarp commented Sep 7, 2023

I think the best case here is that avram just constructs the database with the GIN index and then queries it accordingly. For the backwards incompatibility, for when an avram app/database which has been constructed before that change, or for a database that exists outside of Avrams control, perhaps an opt-in model-level settings interface would be helpful?

Ecto accomplishes this sort of thing with compile-time attributes on the models, kind of like annotations. For example you can configure your application to prefer UUID style ID columns, but then override it on a per-model basis if you want.

@jwoertink
Copy link
Member Author

I like the idea of adding some annotations to sort of configure your setup. How would that work here?

Maybe something in your AppDatabase? 🤔

@[Avram::IncludesCriteria(use_any: false)]
class AppDatabase < Avram::Database
end

@jwoertink
Copy link
Member Author

Just merged in the @> operator #1016 but this was meant for Jsonb. However, the operator could still apply here. We could just take this JsonbIncludes and make it the new Includes, but there's 2 things to consider

  • This would mean = ANY() would not be built-in anymore
  • This would instantly cause a performance hit for anyone using array includes until they added a GIN index to their app.

The second one isn't too bad.. If you're upgrading you're most likely reading CHANGELOG and stuff anyway, and that could easily be mentioned as a "don't forget to....". The first concern though we'd either have to come up with a new name for it, or maybe an option to the method? tags.includes("lucky", equal_any: true) tags.any("lucky") ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants