Connecting to databases using JDBC

Connecting Stata with databases has gotten even easier. jdbc allows us to exchange data with some of the most popular database vendors such as Oracle, MySQL, Amazon Redshift, Snowflake, Microsoft SQL Server, and much more. What’s great about jdbc is that it’s a cross-platform solution, so our JDBC setup works the same way for Windows, Mac, and Unix systems. Once you install a JDBC driver, that driver and your Stata code are all you need to switch from, say, your Mac laptop to your company’s Windows cloud systems.

Let’s see it work

We have email data stored on Amazon Web Services in a Redshift cluster, and we need to load these data into Stata. We first log in to AWS and go to the Amazon Redshift configuration page to download the correct JDBC driver and get the correct connection information. We then place the downloaded JDBC JAR file along our Stata adopath. Now in the Stata Do-file Editor, we store our connection information by typing

. local jar "redshift-jdbc42-2.0.0.0.jar"
. local driverc "com.amazon.redshift.jdbc42.Driver"
. local url "jdbc:redshift://redshift-cluster-1.cziajbxjzi3e.us-west-2.redshift.amazonaws.com:5439/emails"
. local user "admin"
. local pass "secret"

. jdbc connect,  jar("`jar'") driverclass("`driverc'") url("`url'")
        user("`user'") password("`pass'")

If these database settings need to be used by others or you just want to make remembering them easier, we can store them by typing

. local jar "redshift-jdbc42-2.0.0.0.jar"
. local driverc "com.amazon.redshift.jdbc42.Driver"
. local url "jdbc:redshift://redshift-cluster-1.cziajbxjzi3e.us-west-2.redshift.amazonaws.com:5439/emails"
. local user "admin"
. local pass "secret"

. jdbc add MyRed,  jar("`jar'") driverclass("`driverc'") url("`url'")
        user("`user'") password("`pass'")

We can now add the above commands to profile.do to save these connection settings in between Stata sessions, and we now can connect to our Redshift database by typing

. jdbc connect MyRed

To see what tables are availiable to load from our connection, we type

We can describe a table by typing

To load the data, we type

. jdbc load, table("response_info") clear
(128 observations loaded)

Now we have a Stata dataset and can perform our analysis!