  • Import and export data from Excel .xls and .xlsx files
  • Import and export CSV and delimited data
  • Copy/paste data from spreadsheets
  • Input data in spreadsheet editor
  • Read from and write to SQL sources with ODBC (see below)
  • Import and export fixed-format data using a dictionary
  • Import and export any type of text data
  • Unicode (UTF-8) support, including conversion from/to extended ASCII
  • Import EBCDIC data and convert EBCDIC to ASCII
  • Import SAS files
  • Import SPSS files
  • Import and export data in the format required by the FDA for NDA submittals
  • Import and export SAS Transport XPORT files
  • Import Federal Reserve Economic Data
  • Import from Haver Analytics databases
  • Import data from Wharton Research Data Services (WRDS) via JDBC
  • Import and export dBase files
  • High-level import/export of full Excel worksheets
  • Low-level cell-by-cell access to write results to and read data from Excel, including graphs, formulas, date formats, currency formats, bold, italics, and more
  • Export data to SPSS (new)



  • Import data from Oracle, Microsoft SQL Server, MySQL, Amazon Redshift, Snowflake, and other databases
  • Export data to an existing database table
  • Execute SQL statements on a database
  • Create data source names to store connection settings
  • Support for CLOBs, BLOBs, and Unicode
  • Import data using the GUI (new)


  • Import data from any ODBC data source, such as Oracle, SQL Server, Access, Excel, MySQL, and DB2
  • Export data to new or existing ODBC tables
  • Execute custom SQL commands individually or in batches
  • Customize ODBC connection strings
  • Support for ODBC
  • Support for VARCHARs/CLOBs and BLOBs
  • Support for Unicode


  • Clipboard Preview Tool lets you control how data will be pasted
  • Manage variables with the Variables Tool
  • For Windows, Mac, and Unix
  • Pinnable rows and columns (new)
  • Resizable cell editor for string data (new)
  • Tool tips for truncated text (new)
  • Proportional-width font support (new)
  • Columns can be resized and are preserved when saving the dataset
  • Show variable labels in column headers (new)
  • Keyboard shortcut for hiding and showing value labels (new)



  • Manage variables
  • Manage dataset properties
  • For Windows, Mac, and Unix


  • Change storage types, names, and formats
  • Add and edit value labels
  • Attach notes to variables
  • Filter variables
  • For Windows, Mac, and Unix



  • Statistical functions
  • Mathematical functions
  • Trigonometric functions
  • String functions
  • Unicode functions
  • Regular expressions
    • Advanced regular expression functions
  • Date and time functions
    • Durations, relative dates, datetime components
    • Week-related functions
  • Time-series functions
  • Random-number functions
    • 18 functions
    • Stream random numbers
  • Matrix functions
  • Programming functions



  • Row–column transposition
  • Data reshaping
  • Stacking of variables
  • Collapsing into means, totals, etc.



  • UTF-8
  • Translation of extended ASCII to UTF-8
  • Unicode-aware string functions
  • Locale-based sorting and string comparison
  • Dataset labels
  • Variable labels
  • Value labels (e.g., male and female for 0 and 1)


© Copyright 1996–2023 StataCorp LLC. All rights reserved.

  • Ability to switch between multiple sets of data, variable, and value labels
  • Missing-value labels
  • Support for multiple languages, including Unicode



  • Extensive notes can be attached to a dataset

Data snapshots

  • Allow multiple levels of undo for modified datasets

Multiple datasets in memory (frames)

  • Link frames
  • Copy data between frames
  • Access data in other frames
  • Post simulation results to a frame
  • Manipulate frames
  • Access frames from Mata
  • Save, load, and describe multiple frames
Read The Stata Blog: Fun with frames.

Automatic memory management

  • Terabytes of RAM supported
  • Up to 120,000 variables in Stata/MP; up to 32,767 variables in Stata/SE
  • 20 billion or more observations in Stata/MP
  • Up to 2.1 billion observations in Stata/SE and Stata/BE


  • Ascending or descending sorts
  • Multiple-key sorts
  • Numeric and string sorts
  • Locale-aware Unicode string sorting and comparison

Combining datasets

  • Merge datasets
    • By key variables
    • By observations
  • Join datasets
  • Outer join
  • Append datasets
  • Append time series

Special datasets

  • Longitudinal data/panel data
  • Survival/duration data
  • Time series
  • Survey data
  • Multiple imputations
  • Discrete choice data
  • Spatial data


  • Count number of observations that satisfy specified conditions
  • Formatted and unformatted disk I/O
  • Zip-file support
  • Unicode conversion from/to extended ASCII
  • Custom filters to manipulate text files

Variable management

  • Generation of new variables
  • Replacement of existing variables
  • Renaming variables
  • Encoding and decoding string variables
  • Reordering variables in dataset
  • Variables Manager

Dataset utilities

  • Flexible description of variables, labels, and types
  • List values of variables
  • Data signatures to verify the integrity of datasets
  • Codebooks for variables
  • Value-label reports
  • Tables of duplicates and missing values
  • Compress (make dataset as small as possible without loss of accuracy)


Variable types

  • Numeric storage types
    • Byte
    • Integer (int)
    • Long
    • Float
    • Double
  • String (including Unicode, very long strings, and BLOBs)
  • Dates and times
  • Business calendars

Long string support

  • Strings up to 2 billion characters long
  • Coalescing of duplicate values to save memory
  • Binary ‘strings’ (BLOBs)
  • Import and export entire files into long strings/BLOBs
  • Unicode (UTF-8) strings

Stored results

  • Save results to disk for later use
  • Store estimation results in memory
  • Create tables to compare results
  • Create custom tables