Igor Dobryn
about IT

Arel::Nodes::ValuesList to build a value table

We often use VALUES in SQL queries. Either to build a value table or for optimizations. It becomes useful
working with different data sources when there is a need to use some data as it’d be in the database table or to improve query performance.

 

Value tables

We use value tables for different reasons: JOINs, UPDATEs, custom SELECTs. PostgreSQL (as the most of relational databases) has a naturally simple way to SELECT from the value list.

Our typical dataset can look as follows:

columns = ['key1', 'key2']
data = [
  ['val11', 'val12'], ['val21', 'val22']
]

 
Hence we can compose a query:

SELECT * FROM (VALUES ('val11', 'val12'), ('val21', 'val22')) AS t (key1, key2);

 
A straightforward decision would be to interpolate data into the query:

values = data.map do |row|
  "(#{row.map { |item| ActiveRecord::Base.connection.quote(item) }.join(',') })"
end
values.join(',')

 
It looks messy and cumbersome, doesn't it? Eventually, it should evolve into something more readable. And here comes Arel. There is Arel::Nodes::ValuesList to serialize data into SQL value list:

Arel::Nodes::ValuesList.new([['val11', 'val12'], ['val21', 'val22']]).to_sql
=> "VALUES ('val11', 'val12'), ('val21', 'val22')"

 
So, we can build a query:

value_list = Arel::Nodes::ValuesList.new([['val11', 'val12'], ['val21', 'val22']]).to_sql
column_list = columns.join(', ')
query = "SELECT * FROM (#{value_list}) AS t (#{column_list})"

 
Did it become more clear?

 
And in the real world:

columns = %i[id first_name last_name]
students = [[1, 'John', 'Doe'], [2, 'John', 'Smith']]
encrypted_students = students.map { |id, first_name, last_name| [id, encrypt(first_name), encrypt(last_name)] } 
value_list = Arel::Nodes::ValuesList.new(encrypted_students).to_sql
column_list = columns.join(', ')

connection.execute <<~SQL
  UPDATE students
  SET encrypted_first_name = value_table.encrypted_first_name,
      encrypted_last_name = value_table.encrypted_last_name 
  FROM (#{value_list}) AS value_table(#{column_list})
  WHERE students.id = value_table.id
SQL

 

Performance optimizations

  1. At some point WHERE query with big IN condition becomes inefficient and replacing it by value list improves performance.
  2. Similar one but with filtering by inclusion in ARRAY. Replacing ARRAY[tons of items] by VALUES (tons of items) might give significant improvement:
WHERE (name = ANY(ARRAY['name1', 'name2']))
// vs
WHERE (name = ANY(VALUES('name1'), ('name2')))

 
Hence Arel::Nodes::ValuesList works great in both cases:

names = %w[name1 name2]
value_list = Arel::Nodes::ValuesList.new(names.map { |name| [name] })
Student.where("name IN (#{value_list.to_sql})")
# or
Student.where("name = ANY (#{value_list.to_sql})")

 
 

Conclusions

There are several use-cases when value list becomes particularly useful for us. Dealing with different data sources we need to use data as it's in an SQL table. We often update database records from the value table constructed in high-level Ruby code. Handling this on the level of the ActiveRecord record would be inefficient producing too many extra queries. Another common use-case is a performance optimization for huge IN/ARRAY conditions. Arel is a great fit for this and suits us tiptop:

  • it handles chars escaping
  • it's more clear
Rails ActiveRecord Arel SQL value table