lukehedger/iceberg

Iceberg on S3 🧊

A set of SQL queries for getting started with Apache Iceberg tables on S3 with Athena.

Athena

Run the queries in this order:

  1. create.sql
  2. insert.sql
  3. select.sql
  4. update.sql
  5. time_travel.sql
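
As a rough sketch of what this sequence might look like (the repository's actual queries are not reproduced here), the statements below follow the same five steps in Athena SQL. The table name `iceberg_demo`, its columns, and the bucket `amzn-s3-demo-bucket` are illustrative assumptions, not names taken from the repo. Run each statement as a separate query.

```sql
-- 1. create: an Iceberg table registered in the Glue catalog.
--    Table name, columns, and S3 location are hypothetical placeholders.
CREATE TABLE iceberg_demo (
  id bigint,
  name string,
  category string
)
LOCATION 's3://amzn-s3-demo-bucket/iceberg_demo/'
TBLPROPERTIES ('table_type' = 'ICEBERG');

-- 2. insert: each INSERT commits a new snapshot.
INSERT INTO iceberg_demo VALUES
  (1, 'first', 'a'),
  (2, 'second', 'b');

-- 3. select: reads the current snapshot.
SELECT * FROM iceberg_demo ORDER BY id;

-- 4. update: row-level changes also commit a new snapshot.
UPDATE iceberg_demo SET name = 'first (edited)' WHERE id = 1;

-- 5. time travel: read the table as it was at an earlier point in time.
--    This fails if no snapshot exists at or before that time, so the
--    interval has to reach back to a moment after the first insert and
--    before the update in order to show the old row.
SELECT * FROM iceberg_demo
FOR TIMESTAMP AS OF (current_timestamp - interval '5' minute);
```

Athena also accepts `FOR VERSION AS OF <snapshot_id>` to pin a query to a specific snapshot rather than a timestamp.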

Iceberg Table Anatomy

An Iceberg table consists of three main layers:

  • Iceberg catalog - Query engines use the catalog to find the table's current snapshot, whether they are reading or writing data. For example, the AWS Glue Data Catalog.
  • Metadata layer - Table metadata files, manifest lists, and manifest files track the table schema, the partition strategy, the snapshot history, and the location of the data files. Stored in S3.
  • Data layer - The files that hold the data records that queries run against. Stored in S3 in formats such as Apache Parquet.
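
One way to see these layers from Athena is through the metadata tables it exposes for Iceberg tables. The queries below are a minimal sketch, assuming a table named `iceberg_demo` (a hypothetical name) in the currently selected database. Run each as a separate query.

```sql
-- Snapshot history: one row per snapshot that has been the current
-- table state. The catalog points at the latest of these.
SELECT * FROM "iceberg_demo$history";

-- Metadata layer: the manifest files referenced by the current snapshot.
SELECT * FROM "iceberg_demo$manifests";

-- Data layer: the data files (for example Parquet) in the current snapshot.
SELECT * FROM "iceberg_demo$files";
```

The double quotes are needed because the `$` suffix is part of the table identifier.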
