IMPORT DATA FROM PARQUET FILES

Apache Parquet is a columnar data file format that supports several compression methods for efficient storage. You can now use import parquet to import data from a Parquet file into Stata. The import parquet command reads a Parquet file into an Apache Arrow table and then converts the table to a Stata dataset. Most Parquet data types and compression methods are supported. This feature is a part of StataNow™.


We first look at the information contained in the Parquet file iris.parquet.

 

Now we import the Parquet data into Stata.

 

. import parquet using iris.parquet
(5 vars, 150 obs)

© Copyright 1996–2026 StataCorp LLC. All rights reserved.

We can also import a subset of columns. For example, below we import only the columns sepal.length, sepal.width, and variety into Stata.

 

. import parquet sepal.length sepal.width variety using iris.parquet, clear
(3 vars, 150 obs)

 

Additionally, we can import a subset of rows from the Parquet file; below, we import only the last 100 rows:

 

. import parquet sepal.length sepal.width variety using iris.parquet, rowrange(-100:L) clear
(3 vars, 100 obs)
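
To confirm what an import produced, the standard describe and count commands can be run on the dataset in memory. The lines below are a minimal do-file sketch, assuming iris.parquet is in the current working directory. They do not refer to individual variables, because the names Stata assigns to the Parquet columns are not shown above.

```stata
* Import the last 100 rows of three columns, replacing any data in memory
import parquet sepal.length sepal.width variety using iris.parquet, rowrange(-100:L) clear

* List the variables and their storage types in the imported dataset
describe

* Display the number of observations (the import above reports 100)
count
```

Running describe after each import is a quick way to see how the Parquet column types were mapped to Stata storage types.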