What is Presto?
Fast and reliable SQL query engine for data analytics and the open lakehouse
Key Innovations
Some of the biggest companies in the world are contributing to the Presto open source project. These key innovations are only available in Linux Foundation Presto today.
Caching with RaptorX
Disaggregates storage from compute while keeping latency low, providing a unified, inexpensive, fast, and scalable solution for OLAP and interactive use cases.
Blog | Presentation
ETL with Presto-on-Spark
Presto on Spark is an integration between Presto and Spark that uses Presto’s compiler and evaluation engine as a library together with Spark’s large-scale processing capabilities. It enables a unified SQL experience across interactive and batch use cases.
Docs
User Defined Functions
Support for dynamic SQL functions (available in experimental mode)
Docs
Why Presto?
One Language
Different engines for different workloads means you will have to re-platform down the road.
With Presto, you get one familiar ANSI SQL language and one engine for your data analytics, so you don’t need to graduate to another lakehouse engine. Presto handles interactive and batch workloads and small and large amounts of data, and it scales from a few users to thousands.
One Interface
Most data teams have different engines for different workloads on their data lake storage, and each engine has its own language and interface.
Presto gives you one simple ANSI SQL interface for all of your data in various siloed data systems, helping you join your data ecosystem together. Presto’s connector architecture enables you to query data where it lives.
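As a rough illustration of querying data where it lives, the sketch below builds a single SQL statement that joins a table in one catalog with a table in another. The catalog, schema, table, and column names (`hive.sales.orders`, `mysql.crm.customers`, and so on) are hypothetical, as are the host, port, and user; the real names depend on which connectors your cluster has configured. The optional `run_query` helper assumes the `presto-python-client` package (`prestodb`) is installed and a coordinator is reachable.

```python
from textwrap import dedent

# Hypothetical catalog.schema.table names; real ones depend on your connectors.
ORDERS = "hive.sales.orders"
CUSTOMERS = "mysql.crm.customers"


def build_join_query(orders: str = ORDERS, customers: str = CUSTOMERS,
                     limit: int = 10) -> str:
    """Return one SQL statement that joins tables from two different catalogs."""
    return dedent(f"""\
        SELECT c.name, sum(o.total) AS revenue
        FROM {orders} AS o
        JOIN {customers} AS c ON o.customer_id = c.id
        GROUP BY c.name
        ORDER BY revenue DESC
        LIMIT {limit}""")


def run_query(sql: str, host: str = "localhost", port: int = 8080,
              user: str = "analyst"):
    """Send the statement to a Presto coordinator (assumed host, port, user)."""
    import prestodb  # imported lazily so building the query needs no client library

    conn = prestodb.dbapi.connect(
        host=host, port=port, user=user, catalog="hive", schema="sales"
    )
    cur = conn.cursor()
    cur.execute(sql)
    return cur.fetchall()


if __name__ == "__main__":
    # Print the statement; pass it to run_query() against a live cluster.
    print(build_join_query())
```

Because both tables are addressed by fully qualified names, the same statement works regardless of which catalog the connection defaults to.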
Fast, Reliable & Efficient
Data infrastructure costs can explode, especially with proprietary systems like data warehouses, as data sizes and user workloads grow.
Presto is battle-tested at Meta and Uber and can scale to meet growing data sizes and workloads. It’s faster and more efficient than other engines because it’s optimized for large numbers of small queries, so you can query data at better price-performance compared to proprietary systems.
Use Cases
Ad-hoc Query
Use SQL to run ad hoc queries whenever you want, wherever your data resides. Presto allows you to query data where it’s stored so you don’t have to ETL data into a separate system.
Reporting and dashboarding
Query data across multiple sources to build a single view for reports and dashboards, enabling self-service business intelligence (BI).
Open Lakehouse
Through one interface, Presto acts as more than a query engine: it sits at the core of your data ecosystem and helps tie it together by solving data problems at scale.