Ensure Query Result Consistency with TinyPresto
29 Jan 2020
Ensuring query result consistency is a vital part of providing a reliable service on top of a SQL execution engine. Many users' business processes depend on the findings and insights derived from query results. If a result is wrong, their business may go wrong too. It is a critical problem.
To ensure query result consistency from one Presto version to the next, I have created a gem package named tiny-presto. I chose Ruby because our application is built with Ruby on Rails, so the gem lets us run any Presto SQL and check the result at the unit test level. Under the hood, tiny-presto runs a Docker container for a specific version of Presto. Since the prestosql community officially distributes the image, we can reliably verify query results on the application side.
tiny-presto is a small library for running SQL on a single-node Presto cluster. It is pretty easy to use.
Table Of Contents
- How to use tiny-presto
- Run query
- Supported catalogs
How to use tiny-presto
Please make sure to install Docker Engine first so that the library can download and run Docker containers. See how to install Docker.
Next, let’s install the library.
$ gem install tiny-presto
Run query
A single line of code is enough to run a query and get the result.
require 'tiny-presto'
rows = TinyPresto.run('show schemas')
# => [["default"], ["information_schema"]]
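Because the rows come back as a plain array, they can be compared against an expected value in a unit test. The following is a minimal sketch and not part of tiny-presto: `same_rows?` and `check_query` are hypothetical helper names, and `StubRunner` is a stand-in so the snippet runs without Docker. In a real test you would pass `TinyPresto` itself as the runner.

```ruby
# Hypothetical helper: compare two result sets while ignoring row order,
# since SQL gives no ordering guarantee without ORDER BY.
def same_rows?(expected, actual)
  expected.sort_by(&:inspect) == actual.sort_by(&:inspect)
end

# Hypothetical helper: run `sql` through any object that responds to
# `run(sql)` and returns an array of rows, then compare with `expected`.
def check_query(runner, sql, expected)
  same_rows?(expected, runner.run(sql))
end

# Stand-in for TinyPresto so the sketch runs without Docker.
# It returns canned rows for the queries it knows about.
class StubRunner
  def initialize(canned)
    @canned = canned
  end

  def run(sql)
    @canned.fetch(sql)
  end
end

runner = StubRunner.new({ 'show schemas' => [['information_schema'], ['default']] })
ok = check_query(runner, 'show schemas', [['default'], ['information_schema']])
puts ok
```

Sorting both sides before comparing keeps the check stable when the engine returns the same rows in a different order between versions.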
TinyPresto may fail to stop the cluster even after you have finished running queries.
Calling ensure_stop makes the cluster terminate all Docker containers launched by tiny-presto.
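One way to make sure the cleanup always happens is to put the call in an `ensure` clause, so it runs even when a query raises. This is a sketch under assumptions: `with_cluster_cleanup` is a hypothetical helper, it assumes `ensure_stop` is callable on the `TinyPresto` module itself, and `FakePresto` stands in for `TinyPresto` so the snippet runs without Docker.

```ruby
# Hypothetical helper: yield the runner to the block and always call
# ensure_stop afterwards, whether the block succeeds or raises.
def with_cluster_cleanup(runner)
  yield runner
ensure
  runner.ensure_stop
end

# Stand-in for TinyPresto that records whether cleanup happened.
class FakePresto
  attr_reader :stopped

  def initialize
    @stopped = false
  end

  def run(sql)
    raise ArgumentError, 'empty query' if sql.strip.empty?

    [['default'], ['information_schema']]
  end

  def ensure_stop
    @stopped = true
  end
end

presto = FakePresto.new
rows = with_cluster_cleanup(presto) { |runner| runner.run('show schemas') }
puts rows.inspect
puts presto.stopped
```

With the real library the call would look like `with_cluster_cleanup(TinyPresto) { |presto| presto.run('show schemas') }`, assuming the module-level methods described above.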
That’s it! The library is that simple to use.
Supported catalogs
tiny-presto uses the Docker image distributed by the prestosql community. It supports the following connectors.
Among them, only the memory connector allows writing to tables. tiny-presto uses the memory connector by default. If you want to use a different catalog, you can launch the server and client separately.
require 'tiny-presto'
require 'presto-client'

# Create a cluster listening on localhost, port 8080.
cluster = TinyPresto::Cluster.new('localhost')
# Start running the container.
container = cluster.run

# Set up presto-client-ruby to use the 'tpch' catalog.
client = Presto::Client.new(server: 'localhost:8080', catalog: 'tpch', user: 'tiny-user')
client.run('show schemas')

# Stop the cluster once the queries are done.
cluster.stop
Please take a look at treasure-data/presto-client-ruby for more details on the client library.
As usual, any feedback or patches are welcome!