32: Trino Tardigrade: Try, try, and never die

Trino Community Broadcast
Episode • Feb 18, 2022 • 1h 31m

While Trino has been proven to run batch analytic workloads at scale, many have avoided long-running batch jobs for fear of query failure. Join this month's broadcast, where we discuss the project that introduces granular fault tolerance to Trino.

Codenamed Project Tardigrade, it is being thoughtfully crafted to maintain the speed advantage that Trino has over other query engines while increasing the resiliency of queries. We will discuss some of the design proposals being considered with Tardigrade engineers Andrii, Zebing, Lukasz Osipiuk, and Martin. We'll also cover how fault-tolerance will be exposed to users, and we will do a demo to showcase retries.

Project Tardigrade is named after tardigrades, the microscopic animals that are the world's most indestructible creatures, akin to the resiliency we are adding to Trino’s queries. We look forward to telling you more as features unfold.

- Intro Song: 00:00

- Intro: 00:34

- News: 07:56

- Concept of the month: Introducing Project Tardigrade: 20:26

- Concept of the month: Why ETL in Trino?: 22:57

- Concept of the month: Why are people reluctant to do their ETL in Trino?: 35:28

- Concept of the month: What are the limitations of the current architecture?: 43:25

- Concept of the month: Trino engine improvements with Project Tardigrade: 52:59

- Demo of the month: Task retries with Project Tardigrade: 1:19:10

- PR of the month: PR 10319 Trino lineage fails for AliasedRelation: 1:21:57

- Question of the month: How do you cast JSON to varchar with Trino?: 1:24:38

Show Notes: https://trino.io/episodes/32.html

Show Page: https://trino.io/broadcast/