PURPOSE

 

Loads data from a dataset. The supported dataset types are CSV, Excel (xls, xlsx), HDF5, GAUSS Matrix (fmt), GAUSS Dataset (dat), Stata (dta), and SAS (sas7bdat, sas7bcat). Existing dataframes are also supported.

 

FORMAT
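
The call forms sketched here are inferred from the parameter and return descriptions on this page (a string or dataframe input, an optional formula string, and a matrix result). Treat them as an assumption rather than a verbatim signature.

```
// Load every variable in the dataset
y = loadd(dataset);

// Load only the variables selected by the formula string
y = loadd(dataset, varnames);
```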

 

 

Parameters:

dataset (string or existing dataframe) –

 

Filepath to the dataset on disk, a URL, or an existing dataframe. If a URL is provided (with an http or https scheme), the dataset will be downloaded first. Since libcurl is used for all web operations, proxy settings can be configured using the relevant libcurl environment variables (see https://curl.haxx.se/libcurl/c/CURLOPT_PROXY.html).

 

varnames (string) –

 

Formula string indicating which variables to load from the dataset.

E.g., ".", include all variables;

E.g., "Income + Limit", include "Income" and "Limit";

E.g., ". - Cards", include all variables except for "Cards".

 

Returns:

 

y (NxK matrix) – data.

 

EXAMPLES

 

Load all contents of a GAUSS dataset

 

 

After the above code, the following output should be printed to the Command window.

 

 

Load specified variables from a dataset
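
A minimal sketch of this kind of load, using the three formula strings described under Parameters. The file location is an assumption: it supposes a dataset named credit.dat in the examples directory of the GAUSS installation, containing the variables "Balance", "Limit", and "Cards". The variable names y_all, y_sub, and y_drop are illustrative only.

```
// Assumption: credit.dat lives in the GAUSS examples directory
dataset = getGAUSSHome() $+ "examples/credit.dat";

// "." selects every variable
y_all = loadd(dataset, ".");

// Select two variables by name
y_sub = loadd(dataset, "Balance + Limit");

// Select every variable except "Cards"
y_drop = loadd(dataset, ". - Cards");

// Print the first three rows of each result
print "All variables:";
print y_all[1:3, .];

print "Balance and Limit:";
print y_sub[1:3, .];

print "All except Cards:";
print y_drop[1:3, .];
```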

 

 

After the above code,

 

All variables:

14.891    3606.00    283.00    2.0000    34.000    11.000    1.0000    1.0000    2.0000    3.0000    333.000
106.03    6645.00    483.00    3.0000    82.000    15.000    2.0000    2.0000    2.0000    2.0000    903.000
104.59    7075.00    514.00    4.0000    71.000    11.000    1.0000    1.0000    1.0000    2.0000    580.000

Balance and Limit:

333.000      3606.00
903.000      6645.00
580.000      7075.00

All except Cards:

14.8910    3606.00    283.00    34.000    11.000    1.0000    1.0000    2.0000    3.0000    333.000
106.025    6645.00    483.00    82.000    15.000    2.0000    2.0000    2.0000    2.0000    903.000
104.593    7075.00    514.00    71.000    11.000    1.0000    1.0000    1.0000    2.0000    580.000

Load all columns of a GAUSS matrix file, .fmt

 

No variable names are stored in .fmt files. GAUSS allows the use of X1, X2, X3...XP to reference variables in a .fmt file.

 

 

Load specified columns of a GAUSS matrix file, .fmt.
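
A minimal sketch of both .fmt cases, assuming a hypothetical matrix file named mymatrix.fmt with at least four columns, located in the current working directory. The file name and the chosen columns are illustrative and not taken from the original page.

```
// Hypothetical file name; replace with the path to your own .fmt file
fname = "mymatrix.fmt";

// Load every column of the matrix file
y = loadd(fname);

// Load only the second and fourth columns, referenced as X2 and X4
y_sub = loadd(fname, "X2 + X4");
```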

 

Load three specified variables from a SAS dataset, .sas7bdat.

 

 

After the above code,

 

 

Load a string date from a .csv file and automatically convert it to a POSIX date/time (seconds since Jan 1, 1970).

 

 

After the above code,

 

 

REMARKS

 

  • Since loadd() will load the entire dataset at once, the dataset must be small enough to fit in memory. To read chunks of a dataset in an iterative manner, use dataopen() and readr().

  • If dataset is a null string or 0, the dataset temp.dat will be loaded.
  • To load a matrix file, use an .fmt extension on dataset.
  • The supported dataset types are CSV, Excel (XLS, XLSX), HDF5, GAUSS Matrix (FMT), GAUSS Dataset (DAT), Stata (DTA), and SAS (SAS7BDAT, SAS7BCAT).
  • For an HDF5 file, dataset must include the schema, and both the file name and the dataset name must be provided, e.g.