The First Cry of Atom

Ensure Query Result Consistency with TinyPresto

Ensuring query result consistency is a vital part of providing a reliable service on top of a SQL execution engine. Many users' business processes depend on the findings and insights derived from query results. If a result is wrong, their business can go wrong too. It is a critical problem.

To check that query results stay consistent from one Presto version to the next, I created a Gem package named tiny-presto. I chose Ruby because our application is built with Ruby on Rails, so the gem lets us run any Presto SQL and check the result at the unit-test level. Under the hood, tiny-presto runs a Docker container for a specific Presto version. Since the prestosql community officially distributes the image, we can rely on it to verify query results on the application side.

tiny-presto is a small library for running SQL on a single-node Presto cluster. It is pretty easy to use.

Table Of Contents

  1. How to use tiny-presto
  2. Run query
  3. Supported catalogs

How to use tiny-presto

Please install Docker Engine first so that the library can download and run Docker containers. See how to install Docker.

Next, let’s install the library.

$ gem install tiny-presto

Run query

A single line of code runs a query and returns the result.

require 'tiny-presto'
rows = TinyPresto.run('show schemas')
# => [["default"], ["information_schema"]]
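
To check such a result at the unit-test level, it helps to compare rows without depending on their order, since SQL does not guarantee row order without ORDER BY. The following is a minimal sketch under stated assumptions: `same_rows?` and `query_matches?` are hypothetical helpers, not part of tiny-presto, and the runner is a stand-in lambda so the example runs without Docker.

```ruby
# Hypothetical helper (not part of tiny-presto): compare two result sets
# as multisets of rows, ignoring row order.
def same_rows?(expected, actual)
  normalize = ->(rows) { rows.map(&:inspect).sort }
  normalize.call(expected) == normalize.call(actual)
end

# Hypothetical helper: run `sql` through `runner` and check the rows.
# `runner` is any callable that returns an array of rows. In a real test
# it could be `TinyPresto.method(:run)`, which needs Docker to be running.
def query_matches?(sql, expected, runner:)
  same_rows?(expected, runner.call(sql))
end

# Stand-in runner that returns the rows in a different order.
fake_runner = ->(_sql) { [["information_schema"], ["default"]] }

puts query_matches?('show schemas',
                    [["default"], ["information_schema"]],
                    runner: fake_runner)
```

Passing the runner as an argument keeps the comparison logic testable on its own, and the same assertion can later be pointed at a real cluster.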

TinyPresto may fail to stop the cluster even after you have finished running queries. Calling ensure_stop terminates all Docker containers launched by tiny-presto.

TinyPresto.ensure_stop
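
One way to make sure the cleanup always happens, even when a query or an assertion raises, is to put it in an `ensure` clause. This is only a sketch: `with_guaranteed_stop` is a hypothetical wrapper, not part of tiny-presto, and the stop callable and query result are stand-ins so the example runs without Docker.

```ruby
# Hypothetical wrapper (not part of tiny-presto): yields to the block and
# always invokes `stop` afterwards, even if the block raises.
def with_guaranteed_stop(stop:)
  yield
ensure
  stop.call
end

# Stand-in callables so the sketch runs without Docker. In a real test,
# `stop` could be `TinyPresto.method(:ensure_stop)` and the block could
# call `TinyPresto.run`.
stopped = false
rows = with_guaranteed_stop(stop: -> { stopped = true }) do
  [["default"], ["information_schema"]]
end

puts rows.length
puts stopped
```

The wrapper returns the value of the block, so the rows remain available for assertions after the containers have been cleaned up.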

That’s it! The library is that simple to use.

Supported catalogs

tiny-presto uses the Docker image distributed by the prestosql community, so it supports the connectors configured in that image.

Among them, only the memory connector allows writing to tables. tiny-presto uses the memory connector by default. If you want to use a different catalog, you can launch the server and client separately.

# Create a cluster listening on localhost, port 8080.
cluster = TinyPresto::Cluster.new('localhost')

# Start running
container = cluster.run

require 'presto-client'
# Set up presto-client-ruby to use the 'tpch' catalog.
client = Presto::Client.new(server: 'localhost:8080', catalog: 'tpch', user: 'tiny-user')

client.run('show schemas')

cluster.stop

Please take a look at treasure-data/presto-client-ruby for more details on the client library.

As usual, any feedback or patches are welcome!