Fortunately, the basic settings we input when installing postgres worked without issue in pgAdmin4. Troubleshooting configuration files, sockets, ports, identity files, and the other minutiae of modern network security is a very easy way to burn a lot of time.
Connecting to pgAdmin4
ssh
into the Pi from a terminal and ensure thatPermitTunnel = yes
in the file/etc/ssh/sshd_config
- Install and open pgAdmin4, then create a server connection:
- The server name is a label for inside pgAdmin4, so we can set it to just about anything. Set the colors if you so choose, but they don’t play very well with dark themes.
- We’re connecting through
ssh
, so we’ll be using the port we set in/var/lib/postgresql/data/postgresql.conf
, which is 5432 by default.localhost
is the default hostname in Alpine’ssetup-hostname
script, andpostgres
is the default database username we created when setting up postgres. There are options to enter and save the password when we first set up the connection.
- We’re not using SSL for our connection, so this tab is blank.
- We are connecting via
ssh
, so choose Yes. The tunnel host is the Pi’s local IP address, which is a temporary address (called a DHCP lease) assigned by the router that usually starts with 192.168. The port, 22 by default, is set in/etc/ssh/sshd_config
on Alpine. The user name is postgres. There are options to enter and save the password when we first set up the connection.
- We don’t have any advanced settings to change, the 10 second setting auto connection attempt timeout is a sensible default so we’ll leave it.
- And that’s it. Once we click Save, we enter the password for
postgres
and our connection is live.
Connecting to RStudio
We’ll be using the RStudio database guide and this blog post as a reference and for additional context.
- Set a password for the postgres database on our RPi:
- Open a terminal
ssh postgres@192.168.xxx.xxx
psql
ALTER USER postgres PASSWORD 'my_new_secure_password';
\q
- On the local system, install some packages that are necessary for RStudio packages that we will install:
sudo apt install libpq-dev unixodbc unixodbc-dev odbc-postgresql
- From the RStudio console, install the relevant packages and confirm the needed drivers are available:
install.packages("DBI")
install.packages("odbc")
install.packages("RPostgres")
library(odbc)
sort(unique(odbcListDrivers()[[1]]))
- Create the connection. I recommend saving the command in an R script. If you use
odbc
, RStudio saves the details in a dedicated Connections tab, but it’s also good to have it available just in case.
library(DBI)
library(odbc)
library(RPostgres)
# Using ODBC
con <- DBI::dbConnect(odbc::odbc(),
driver = "PostgreSQL Unicode",
server = "192.168.xxx.xxx",
host = "localhost",
database = "postgres",
UID = rstudioapi::askForPassword("Input user name"),
PWD = rstudioapi::askForPassword("Input password"),
port = 5432)
# Using RPostgres package
con <- DBI::dbConnect(RPostgres::Postgres(),
dbname = "postgres",
host = "192.168.xxx.xxx",
port = 5432,
user = rstudioapi::askForPassword("Input database user name"),
password = rstudioapi::askForPassword("Input database user password")
)
# Disconnect when finished
DBI::dbDisconnect()
# Clear environment
rm(list = ls())
# Clear packages
pacman::p_unload(all)
How not to run queries
The database guide lists three methods for running SQL queries against a database from within RStudio: DBI, dplyr/dbplyr, and raw SQL within R Notebooks.
As I am interested in working with SQL more directly rather than converting R code with dplyr
/dbplyr
, it seemed like it would make sense to do the following within an R Markdown document:
- Write the database connection in one code chunk
- Run the SQL code, referencing the connection, as a separate chunk
- Do the analysis and manipulation
After some experimentation, I am convinced that this was a bad idea. In addition to the actual analysis, I am using this Rmd document with the blogdown
package (which runs on top of the hugo
static site generator to create a blog post), and funny things happened when I embedded a database connection in the source document.
Specifically, RStudio crashed a number of times, and even before that it seemed quite unhappy with what I was asking it to do. It appeared to be running an infinite loop, and when I tried to stop the code execution, the session became unresponsive.
Even after restarting RStudio and deleting the offending code chunks from my Rmd document, it would not knit properly or let me preview my changes until I deleted the previously-generated index.html file.
I’m not sure exactly what was wrong, but if I had to hazard a guess, I suspect that odbc
/RPostgres
was trying to keep the connection alive while blogdown
was trying to finish the job, so gears started grinding.
Clearly, this is not the right way. While I was breaking things, I did manage to run a test query successfully:
SELECT datname FROM pg_database;
For the time being, I believe I have gotten a little ahead of myself, so I’m going to call it a day and spend some time studying the nexus between databases and RStudio. I did manage to get a basic connection working and successfully queried my database twice, so I lumberingly met my goal.