Hi,
I'm testing gpq on the official administrative boundaries of Italy. The source file is this zip file:
https://www.istat.it/storage/cartografia/confini_amministrativi/non_generalizzati/2023/Limiti01012023.zip
The archive contains several folders with shapefiles in them. I am running the tests on the Limiti01012023/Com01012023/Com01012023_WGS84.shp file:
- I convert it to GeoJSON using ogr2ogr;
- from this GeoJSON I create a gzip-compressed GeoParquet file, which is 70 MB;
- from the same GeoJSON I create an uncompressed GeoParquet file, which is 76 MB.
 
The two files are almost the same size. Some notes:
- if I gzip the uncompressed parquet file, I get a 57 MB file;
- if I create a SOZip shapefile (.shp.zip) version of the source file, I get a 59 MB file.
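
To put numbers on the gap, here is a small sketch that computes the percentage saved relative to the 76 MB uncompressed file, using only the sizes quoted above. The helper name `saving` is made up for illustration.

```shell
# Percentage saved relative to a baseline size.
# Sizes are in MB and come from the figures quoted in this post.
saving() {
  # $1 = baseline size, $2 = compressed size
  LC_ALL=C awk -v base="$1" -v comp="$2" \
    'BEGIN { printf "%.1f\n", (base - comp) / base * 100 }'
}

saving 76 70   # gpq --compression=gzip vs. uncompressed parquet -> 7.9
saving 76 57   # external gzip of the uncompressed parquet       -> 25.0
```

So the gzip option in gpq saves about 8%, while gzipping the whole uncompressed file saves about 25%.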
 
I know these outputs are not directly comparable; however, the compression in the gpq output seems very limited to me. Is this normal?
Am I doing something wrong?
Below are the commands I used for all the tests.
Thank you
wget -O file.zip "https://www.istat.it/storage/cartografia/confini_amministrativi/non_generalizzati/2023/Limiti01012023.zip"
unzip -o file.zip -d .
ogr2ogr -f GeoJSON -t_srs EPSG:4326 comuni.geojson Limiti01012023/Com01012023/Com01012023_WGS84.shp -lco "RFC7946=YES"
gpq convert --compression="gzip" --max 1000 --from="geojson" comuni.geojson comuni_compressed.parquet
gpq convert --compression="uncompressed" --max 1000 --from="geojson" comuni.geojson comuni_uncompressed.parquet
ogr2ogr -t_srs EPSG:4326 Com01012023_WGS84.shp.zip Limiti01012023/Com01012023/Com01012023_WGS84.shp
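
The commands above do not show the external gzip step or how the sizes were read. A minimal sketch of one way to do both follows, assuming the output file names used above. The `report_sizes` helper and the `.parquet.gz` output name are illustrative, not part of the original test.

```shell
# Print the size in bytes of each file given; missing files are reported, not fatal.
report_sizes() {
  for f in "$@"; do
    if [ -f "$f" ]; then
      # tr strips the padding some wc implementations add
      printf '%s\t%s\n' "$(wc -c < "$f" | tr -d ' ')" "$f"
    else
      printf 'missing\t%s\n' "$f"
    fi
  done
}

# Gzip the uncompressed parquet externally; -c writes to stdout,
# so the original file is left untouched.
if [ -f comuni_uncompressed.parquet ]; then
  gzip -c comuni_uncompressed.parquet > comuni_uncompressed.parquet.gz
fi

report_sizes comuni_compressed.parquet comuni_uncompressed.parquet \
  comuni_uncompressed.parquet.gz Com01012023_WGS84.shp.zip
```

Sizes are printed in bytes so that small differences are not hidden by rounding to MB.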