pg_parquet

pg_parquet

copy data between Postgres and Parquet

Overview

IDExtensionPackageVersionCategoryLicenseLanguage
2460
pg_parquet
pg_parquet
0.5.1
OLAP
PostgreSQL
Rust
AttributeHas BinaryHas LibraryNeed LoadHas DDLRelocatableTrusted
--sLdt-
No
Yes
Yes
Yes
no
yes
Relationships
See Also
pg_analytics
pg_duckdb
duckdb_fdw
citus_columnar
columnar
pg_mooncake
aws_s3
citus

pgrx=0.16.1, manual update from 0.16.0

Packages

TypeRepoVersionPG Major AvailabilityPackage PatternDependencies
EL
PIGSTY
0.5.1
18
17
16
15
14
13
pg_parquet_$v-
Debian
PIGSTY
0.5.1
18
17
16
15
14
13
postgresql-$v-pg-parquet-
Linux / PGPG18PG17PG16PG15PG14PG13
el8.x86_64
PIGSTY 0.5.1
PIGSTY 0.5.1
PIGSTY 0.5.1
PIGSTY 0.5.1
PIGSTY 0.5.1
MISS
el8.aarch64
PIGSTY 0.5.1
PIGSTY 0.5.1
PIGSTY 0.5.1
PIGSTY 0.5.1
PIGSTY 0.5.1
MISS
el9.x86_64
PIGSTY 0.5.1
PIGSTY 0.5.1
PIGSTY 0.5.1
PIGSTY 0.5.1
PIGSTY 0.5.1
MISS
el9.aarch64
PIGSTY 0.5.1
PIGSTY 0.5.1
PIGSTY 0.5.1
PIGSTY 0.5.1
PIGSTY 0.5.1
MISS
el10.x86_64
PIGSTY 0.5.1
PIGSTY 0.5.1
PIGSTY 0.5.1
PIGSTY 0.5.1
PIGSTY 0.5.1
MISS
el10.aarch64
PIGSTY 0.5.1
PIGSTY 0.5.1
PIGSTY 0.5.1
PIGSTY 0.5.1
PIGSTY 0.5.1
MISS
d12.x86_64
MISS
PIGSTY 0.4.3
PIGSTY 0.4.3
PIGSTY 0.4.3
PIGSTY 0.4.3
MISS
d12.aarch64
MISS
PIGSTY 0.4.3
PIGSTY 0.4.3
PIGSTY 0.4.3
PIGSTY 0.4.3
MISS
d13.x86_64
MISS
MISS
MISS
MISS
MISS
MISS
d13.aarch64
MISS
MISS
MISS
MISS
MISS
MISS
u22.x86_64
MISS
PIGSTY 0.4.3
PIGSTY 0.4.3
PIGSTY 0.4.3
PIGSTY 0.4.3
MISS
u22.aarch64
MISS
PIGSTY 0.4.3
PIGSTY 0.4.3
PIGSTY 0.4.3
PIGSTY 0.4.3
MISS
u24.x86_64
MISS
PIGSTY 0.4.3
PIGSTY 0.4.3
PIGSTY 0.4.3
PIGSTY 0.4.3
MISS
u24.aarch64
MISS
PIGSTY 0.4.3
PIGSTY 0.4.3
PIGSTY 0.4.3
PIGSTY 0.4.3
MISS
PackageVersionOSORGSIZEFile URL
pg_parquet_180.5.1el8.x86_64pigsty7.4 MiBpg_parquet_18-0.5.1-1PIGSTY.el8.x86_64.rpm
pg_parquet_180.5.1el8.aarch64pigsty6.5 MiBpg_parquet_18-0.5.1-1PIGSTY.el8.aarch64.rpm
pg_parquet_180.5.1el9.x86_64pigsty7.2 MiBpg_parquet_18-0.5.1-1PIGSTY.el9.x86_64.rpm
pg_parquet_180.5.1el9.aarch64pigsty6.8 MiBpg_parquet_18-0.5.1-1PIGSTY.el9.aarch64.rpm
pg_parquet_180.5.1el10.x86_64pigsty7.1 MiBpg_parquet_18-0.5.1-1PIGSTY.el10.x86_64.rpm
pg_parquet_180.5.1el10.aarch64pigsty6.7 MiBpg_parquet_18-0.5.1-1PIGSTY.el10.aarch64.rpm
PackageVersionOSORGSIZEFile URL
pg_parquet_170.5.1el8.x86_64pigsty7.4 MiBpg_parquet_17-0.5.1-1PIGSTY.el8.x86_64.rpm
pg_parquet_170.5.1el8.aarch64pigsty6.5 MiBpg_parquet_17-0.5.1-1PIGSTY.el8.aarch64.rpm
pg_parquet_170.5.1el9.x86_64pigsty7.2 MiBpg_parquet_17-0.5.1-1PIGSTY.el9.x86_64.rpm
pg_parquet_170.5.1el9.aarch64pigsty6.8 MiBpg_parquet_17-0.5.1-1PIGSTY.el9.aarch64.rpm
pg_parquet_170.5.1el10.x86_64pigsty7.1 MiBpg_parquet_17-0.5.1-1PIGSTY.el10.x86_64.rpm
pg_parquet_170.5.1el10.aarch64pigsty6.7 MiBpg_parquet_17-0.5.1-1PIGSTY.el10.aarch64.rpm
postgresql-17-pg-parquet0.4.3d12.x86_64pigsty5.9 MiBpostgresql-17-pg-parquet_0.4.3-1PIGSTY~bookworm_amd64.deb
postgresql-17-pg-parquet0.4.3d12.aarch64pigsty5.1 MiBpostgresql-17-pg-parquet_0.4.3-1PIGSTY~bookworm_arm64.deb
postgresql-17-pg-parquet0.4.3u22.x86_64pigsty6.5 MiBpostgresql-17-pg-parquet_0.4.3-1PIGSTY~jammy_amd64.deb
postgresql-17-pg-parquet0.4.3u22.aarch64pigsty6.1 MiBpostgresql-17-pg-parquet_0.4.3-1PIGSTY~jammy_arm64.deb
postgresql-17-pg-parquet0.4.3u24.x86_64pigsty6.5 MiBpostgresql-17-pg-parquet_0.4.3-1PIGSTY~noble_amd64.deb
postgresql-17-pg-parquet0.4.3u24.aarch64pigsty6.0 MiBpostgresql-17-pg-parquet_0.4.3-1PIGSTY~noble_arm64.deb
PackageVersionOSORGSIZEFile URL
pg_parquet_160.5.1el8.x86_64pigsty7.4 MiBpg_parquet_16-0.5.1-1PIGSTY.el8.x86_64.rpm
pg_parquet_160.5.1el8.aarch64pigsty6.5 MiBpg_parquet_16-0.5.1-1PIGSTY.el8.aarch64.rpm
pg_parquet_160.5.1el9.x86_64pigsty7.2 MiBpg_parquet_16-0.5.1-1PIGSTY.el9.x86_64.rpm
pg_parquet_160.5.1el9.aarch64pigsty6.8 MiBpg_parquet_16-0.5.1-1PIGSTY.el9.aarch64.rpm
pg_parquet_160.5.1el10.x86_64pigsty7.1 MiBpg_parquet_16-0.5.1-1PIGSTY.el10.x86_64.rpm
pg_parquet_160.5.1el10.aarch64pigsty6.7 MiBpg_parquet_16-0.5.1-1PIGSTY.el10.aarch64.rpm
postgresql-16-pg-parquet0.4.3d12.x86_64pigsty5.9 MiBpostgresql-16-pg-parquet_0.4.3-1PIGSTY~bookworm_amd64.deb
postgresql-16-pg-parquet0.4.3d12.aarch64pigsty5.1 MiBpostgresql-16-pg-parquet_0.4.3-1PIGSTY~bookworm_arm64.deb
postgresql-16-pg-parquet0.4.3u22.x86_64pigsty6.5 MiBpostgresql-16-pg-parquet_0.4.3-1PIGSTY~jammy_amd64.deb
postgresql-16-pg-parquet0.4.3u22.aarch64pigsty6.1 MiBpostgresql-16-pg-parquet_0.4.3-1PIGSTY~jammy_arm64.deb
postgresql-16-pg-parquet0.4.3u24.x86_64pigsty6.5 MiBpostgresql-16-pg-parquet_0.4.3-1PIGSTY~noble_amd64.deb
postgresql-16-pg-parquet0.4.3u24.aarch64pigsty6.0 MiBpostgresql-16-pg-parquet_0.4.3-1PIGSTY~noble_arm64.deb
PackageVersionOSORGSIZEFile URL
pg_parquet_150.5.1el8.x86_64pigsty7.4 MiBpg_parquet_15-0.5.1-1PIGSTY.el8.x86_64.rpm
pg_parquet_150.5.1el8.aarch64pigsty6.5 MiBpg_parquet_15-0.5.1-1PIGSTY.el8.aarch64.rpm
pg_parquet_150.5.1el9.x86_64pigsty7.2 MiBpg_parquet_15-0.5.1-1PIGSTY.el9.x86_64.rpm
pg_parquet_150.5.1el9.aarch64pigsty6.8 MiBpg_parquet_15-0.5.1-1PIGSTY.el9.aarch64.rpm
pg_parquet_150.5.1el10.x86_64pigsty7.1 MiBpg_parquet_15-0.5.1-1PIGSTY.el10.x86_64.rpm
pg_parquet_150.5.1el10.aarch64pigsty6.7 MiBpg_parquet_15-0.5.1-1PIGSTY.el10.aarch64.rpm
postgresql-15-pg-parquet0.4.3d12.x86_64pigsty5.9 MiBpostgresql-15-pg-parquet_0.4.3-1PIGSTY~bookworm_amd64.deb
postgresql-15-pg-parquet0.4.3d12.aarch64pigsty5.1 MiBpostgresql-15-pg-parquet_0.4.3-1PIGSTY~bookworm_arm64.deb
postgresql-15-pg-parquet0.4.3u22.x86_64pigsty6.5 MiBpostgresql-15-pg-parquet_0.4.3-1PIGSTY~jammy_amd64.deb
postgresql-15-pg-parquet0.4.3u22.aarch64pigsty6.0 MiBpostgresql-15-pg-parquet_0.4.3-1PIGSTY~jammy_arm64.deb
postgresql-15-pg-parquet0.4.3u24.x86_64pigsty6.5 MiBpostgresql-15-pg-parquet_0.4.3-1PIGSTY~noble_amd64.deb
postgresql-15-pg-parquet0.4.3u24.aarch64pigsty6.0 MiBpostgresql-15-pg-parquet_0.4.3-1PIGSTY~noble_arm64.deb
PackageVersionOSORGSIZEFile URL
pg_parquet_140.5.1el8.x86_64pigsty7.4 MiBpg_parquet_14-0.5.1-1PIGSTY.el8.x86_64.rpm
pg_parquet_140.5.1el8.aarch64pigsty6.5 MiBpg_parquet_14-0.5.1-1PIGSTY.el8.aarch64.rpm
pg_parquet_140.5.1el9.x86_64pigsty7.2 MiBpg_parquet_14-0.5.1-1PIGSTY.el9.x86_64.rpm
pg_parquet_140.5.1el9.aarch64pigsty6.8 MiBpg_parquet_14-0.5.1-1PIGSTY.el9.aarch64.rpm
pg_parquet_140.5.1el10.x86_64pigsty7.1 MiBpg_parquet_14-0.5.1-1PIGSTY.el10.x86_64.rpm
pg_parquet_140.5.1el10.aarch64pigsty6.7 MiBpg_parquet_14-0.5.1-1PIGSTY.el10.aarch64.rpm
postgresql-14-pg-parquet0.4.3d12.x86_64pigsty5.9 MiBpostgresql-14-pg-parquet_0.4.3-1PIGSTY~bookworm_amd64.deb
postgresql-14-pg-parquet0.4.3d12.aarch64pigsty5.1 MiBpostgresql-14-pg-parquet_0.4.3-1PIGSTY~bookworm_arm64.deb
postgresql-14-pg-parquet0.4.3u22.x86_64pigsty6.5 MiBpostgresql-14-pg-parquet_0.4.3-1PIGSTY~jammy_amd64.deb
postgresql-14-pg-parquet0.4.3u22.aarch64pigsty6.0 MiBpostgresql-14-pg-parquet_0.4.3-1PIGSTY~jammy_arm64.deb
postgresql-14-pg-parquet0.4.3u24.x86_64pigsty6.5 MiBpostgresql-14-pg-parquet_0.4.3-1PIGSTY~noble_amd64.deb
postgresql-14-pg-parquet0.4.3u24.aarch64pigsty6.0 MiBpostgresql-14-pg-parquet_0.4.3-1PIGSTY~noble_arm64.deb

Source

pig build get pg_parquet; # get pg_parquet source code
pig build dep pg_parquet; # install build dependencies
pig build pkg pg_parquet; # build extension rpm or deb
pig build ext pg_parquet; # build extension rpms

Install

To add the required PGDG / PIGSTY upstream repository, use:

pig repo add pgsql -u   # add PGDG + Pigsty repo and update cache (leave existing repos)

Install this extension with:

pig ext install pg_parquet; # install by extension name, for the current active PG version
pig ext install pg_parquet; # install via package alias, for the active PG version
pig ext install pg_parquet -v 18;   # install for PG 18
pig ext install pg_parquet -v 17;   # install for PG 17
pig ext install pg_parquet -v 16;   # install for PG 16
pig ext install pg_parquet -v 15;   # install for PG 15
pig ext install pg_parquet -v 14;   # install for PG 14

Create this extension with:

CREATE EXTENSION pg_parquet;

Usage

There are mainly 3 things that you can do with pg_parquet:

  1. You can export Postgres tables/queries to Parquet files,
  2. You can ingest data from Parquet files to Postgres tables,
  3. You can inspect the schema and metadata of Parquet files.

COPY to/from Parquet files from/to Postgres tables

You can use PostgreSQL’s COPY command to read and write from/to Parquet files. Below is an example of how to write a PostgreSQL table, with complex types, into a Parquet file and then to read the Parquet file content back into the same table.

-- create composite types
CREATE TYPE product_item AS (id INT, name TEXT, price float4);
CREATE TYPE product AS (id INT, name TEXT, items product_item[]);

-- create a table with complex types
CREATE TABLE product_example (
    id int,
    product product,
    products product[],
    created_at TIMESTAMP,
    updated_at TIMESTAMPTZ
);

-- insert some rows into the table
insert into product_example values (
    1,
    ROW(1, 'product 1', ARRAY[ROW(1, 'item 1', 1.0), ROW(2, 'item 2', 2.0), NULL]::product_item[])::product,
    ARRAY[ROW(1, NULL, NULL)::product, NULL],
    now(),
    '2022-05-01 12:00:00-04'
);

-- copy the table to a parquet file
COPY product_example TO '/tmp/product_example.parquet' (format 'parquet', compression 'gzip');

-- show table
SELECT * FROM product_example;

-- copy the parquet file to the table
COPY product_example FROM '/tmp/product_example.parquet';

-- show table
SELECT * FROM product_example;

You can also use COPY command to read and write Parquet stream from/to standard input and output. Below is an example usage (you have to specify format = parquet):

psql -d pg_parquet -p 28817 -h localhost -c "create table product_example_reconstructed (like product_example);"
 CREATE TABLE

psql -d pg_parquet -p 28817 -h localhost -c "copy product_example to stdout (format parquet);" | psql -d pg_parquet -p 28817 -h localhost -c "copy product_example_reconstructed from stdin (format parquet);"
 COPY 2

Inspect Parquet schema

You can call SELECT * FROM parquet.schema(<uri>) to discover the schema of the Parquet file at given uri.

SELECT * FROM parquet.schema('/tmp/product_example.parquet') LIMIT 10;
             uri              |     name     | type_name  | type_length | repetition_type | num_children | converted_type | scale | precision | field_id | logical_type 
------------------------------+--------------+------------+-------------+-----------------+--------------+----------------+-------+-----------+----------+--------------
 /tmp/product_example.parquet | arrow_schema |            |             |                 |            5 |                |       |           |          | 
 /tmp/product_example.parquet | id           | INT32      |             | OPTIONAL        |              |                |       |           |        0 | 
 /tmp/product_example.parquet | product      |            |             | OPTIONAL        |            3 |                |       |           |        1 | 
 /tmp/product_example.parquet | id           | INT32      |             | OPTIONAL        |              |                |       |           |        2 | 
 /tmp/product_example.parquet | name         | BYTE_ARRAY |             | OPTIONAL        |              | UTF8           |       |           |        3 | STRING
 /tmp/product_example.parquet | items        |            |             | OPTIONAL        |            1 | LIST           |       |           |        4 | LIST
 /tmp/product_example.parquet | list         |            |             | REPEATED        |            1 |                |       |           |          | 
 /tmp/product_example.parquet | element        |            |             | OPTIONAL        |            3 |                |       |           |        5 | 
 /tmp/product_example.parquet | id           | INT32      |             | OPTIONAL        |              |                |       |           |        6 | 
 /tmp/product_example.parquet | name         | BYTE_ARRAY |             | OPTIONAL        |              | UTF8           |       |           |        7 | STRING
(10 rows)

Inspect Parquet metadata

You can call SELECT * FROM parquet.metadata(<uri>) to discover the detailed metadata of the Parquet file, such as column statistics, at given uri.

SELECT uri, row_group_id, row_group_num_rows, row_group_num_columns, row_group_bytes, column_id, file_offset, num_values, path_in_schema, type_name FROM parquet.metadata('/tmp/product_example.parquet') LIMIT 1;
             uri              | row_group_id | row_group_num_rows | row_group_num_columns | row_group_bytes | column_id | file_offset | num_values | path_in_schema | type_name 
------------------------------+--------------+--------------------+-----------------------+-----------------+-----------+-------------+------------+----------------+-----------
 /tmp/product_example.parquet |            0 |                  1 |                    13 |             842 |         0 |           0 |          1 | id             | INT32
(1 row)
SELECT stats_null_count, stats_distinct_count, stats_min, stats_max, compression, encodings, index_page_offset, dictionary_page_offset, data_page_offset, total_compressed_size, total_uncompressed_size FROM parquet.metadata('/tmp/product_example.parquet') LIMIT 1;
 stats_null_count | stats_distinct_count | stats_min | stats_max |    compression     |        encodings         | index_page_offset | dictionary_page_offset | data_page_offset | total_compressed_size | total_uncompressed_size 
------------------+----------------------+-----------+-----------+--------------------+--------------------------+-------------------+------------------------+------------------+-----------------------+-------------------------
                0 |                      | 1         | 1         | GZIP(GzipLevel(6)) | PLAIN,RLE,RLE_DICTIONARY |                   |                      4 |               42 |                   101 |                      61
(1 row)

You can call SELECT * FROM parquet.file_metadata(<uri>) to discover file level metadata of the Parquet file, such as format version, at given uri.

SELECT * FROM parquet.file_metadata('/tmp/product_example.parquet')
             uri              | created_by | num_rows | num_row_groups | format_version 
------------------------------+------------+----------+----------------+----------------
 /tmp/product_example.parquet | pg_parquet |        1 |              1 | 1
(1 row)

You can call SELECT * FROM parquet.kv_metadata(<uri>) to query custom key-value metadata of the Parquet file at given uri.

SELECT uri, encode(key, 'escape') as key, encode(value, 'escape') as value FROM parquet.kv_metadata('/tmp/product_example.parquet');
             uri              |     key      |    value
------------------------------+--------------+---------------------
 /tmp/product_example.parquet | ARROW:schema | /////5gIAAAQAAAA ...
(1 row)

Inspect Parquet column statistics

You can call SELECT * FROM parquet.column_stats(<uri>) to discover the column statistics of the Parquet file, such as min and max value for the column, at given uri.

SELECT * FROM parquet.column_stats('/tmp/product_example.parquet')
 column_id | field_id |         stats_min          |         stats_max          | stats_null_count | stats_distinct_count 
-----------+----------+----------------------------+----------------------------+------------------+----------------------
         4 |        7 | item 1                     | item 2                     |                1 |                     
         6 |       11 | 1                          | 1                          |                1 |                     
         7 |       12 |                            |                            |                2 |                     
        10 |       17 |                            |                            |                2 |                     
         0 |        0 | 1                          | 1                          |                0 |                     
        11 |       18 | 2025-03-11 14:01:22.045739 | 2025-03-11 14:01:22.045739 |                0 |                     
         3 |        6 | 1                          | 2                          |                1 |                     
        12 |       19 | 2022-05-01 19:00:00+03     | 2022-05-01 19:00:00+03     |                0 |                     
         8 |       15 |                            |                            |                2 |                     
         5 |        8 | 1                          | 2                          |                1 |                     
         9 |       16 |                            |                            |                2 |                     
         1 |        2 | 1                          | 1                          |                0 |                     
         2 |        3 | product 1                  | product 1                  |                0 |                     
(13 rows)
Last updated on