Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Connection attempt to arcadedb (supports the Postgres wire format for connection and queries) fails with "received invalid response to GSSAPI negotiation: R" #363

Open
mskyttner opened this issue Dec 2, 2021 · 17 comments

Comments

@mskyttner
Copy link

Attempting to connect to arcadedb which claims to support the Postgres wire format for connections and queries, an error "received invalid response to GSSAPI negotiation: R" appears.

Brief description of the problem

It appears as GSSAPI is used by default, which throws the error, but R hangs when I try to disable this when making a connection.

The RJDBC library is able to make a connection and read data from a table using the latest official Postgres JDBC-driver.

arcadedb_cmd <- paste(
  "docker run -d -p '2480:2480' -p '2424:2424' -p '5432:5432'",
  "--env arcadedb.server.rootPassword=playwithdata",
  "--env 'arcadedb.server.defaultDatabases=Imported[root]{import:https://github.
com/ArcadeData/arcadedb-datasets/raw/main/orientdb/OpenBeer.gz}'", 
  "--env arcadedb.server.plugins='Postgres:com.arcadedb.postgres.PostgresProtoco
lPlugin'", 
  "arcadedata/arcadedb:latest"
)

cmd <- gsub("\n", "", arcadedb_cmd)

# launch the background service with arcadedb using docker
system(arcadedb_cmd)

# check that it is online using the API
arcadedb_is_up <- function() {
  
  res <- httr::content(
    httr::GET("http://localhost:2480/api/v1/databases", 
    httr::authenticate("root", "playwithdata"))
  )
  
  res$result[[1]] == "Imported"
}

stopifnot(arcadedb_is_up())

# use RJDBC and the pg jdbc driver

download.file(
  "https://jdbc.postgresql.org/download/postgresql-42.3.1.jar",
  "~/repos/arcadedb/postgresql-42.3.1.jar"
)

library(RJDBC)

drv <- JDBC(
  driverClass = "org.postgresql.Driver", 
  classPath = "~/repos/arcadedb/postgresql-42.3.1.jar", 
  identifier.quote = "`"
)

conn <- dbConnect(drv, 
  paste0("jdbc:postgresql://localhost/Imported",
    "?user=root&password=playwithdata&ssl=false&gssencmode=disable"))

tibble::as_tibble(dbGetQuery(conn, "select * from Brewery limit 10;"))

dbDisconnect(conn)

# Now, attempt to use RPostgres which uses libpg and no Java

library(RPostgres)
library(DBI)

#Sys.setenv(PGGSSENCMODE="disable")
Sys.unsetenv("PGGSSENCMODE")

con <- dbConnect(
  RPostgres::Postgres(),
#  gssencmode = 'disable',
  dbname = 'Imported',
  host = 'localhost',
  port = 5432, 
  user = 'root',
  password = 'playwithdata'
)

# Error: connection to server at "localhost" (127.0.0.1), port 5432 failed: received invalid response to GSSAPI negotiation: R

Related to #174 (but it seems I want to do the opposite and disable GSSAPI negotiation, and when trying to do so R hangs)

@krlmlr
Copy link
Member

krlmlr commented Dec 3, 2021

Thanks. When I try to start the Docker image on a machine with 2 GB, the SSH connection gets disconnected (!). When I try to limit the memory with ulimit -v to e.g. 512MB, I get an error. I also tried prepending ARCADEDB_OPTS_MEMORY="-Xms512m -Xmx512m" before docker run .

What are the memory requirements for this image?

@mskyttner
Copy link
Author

I tried looking for memory limits for arcadedb (and the earlier orientdb) and couldn't really find a good link/doc, but I saw some mentions in various places for 2, 4, 6 and 8 GB.

For what it is worth, I did not set any specific limits (which I think means that the docker image will use as much as it needs from available memory on the host) and my "docker stats" then reports 2.2 GB memory used by arcadedb when loaded with the data here (OpenBeer database).

I tried setting some environment variables to disable SSL (recommended by arcadedb team) and Kerberos when making the connection, but get a "hang" and also no indication of OOM on the two machines where I have tested this.

library(RPostgres)
library(DBI)

#Sys.setenv(PGGSSENCMODE="disable")
#Sys.unsetenv("PGGSSENCMODE")
#Sys.unsetenv("PGSSLMODE")
#Sys.setenv(PGSSLMODE="disable")

con <- dbConnect(
  RPostgres::Postgres(),
  #gssencmode = 'disable',
  #sslmode = "disable",
  dbname = 'Imported',
  host = 'localhost',
  port = 5432, 
  user = 'root',
  password = 'playwithdata'
)

@lvca
Copy link

lvca commented Dec 3, 2021

You can set in ArcadeDB's docker -Darcadedb.profile=low-ram to run ArcadeDB with <32M of RAM and see how it goes. In general, the RAM allocated for the JVM should be <=80% of the container RAM.

The default Dockerfile sets 2GB of RAM for ArcadeDB (-Xms2G -Xmx2G), so you should allocate around 2.3G to the Docker container.

To run ArcadeDB with 1G docker container, you could start ArcadeDB by using 800M for RAM:

docker .. -e ARCADEDB_OPTS_MEMORY="-Xms800M -Xmx800M" ...

@lvca
Copy link

lvca commented Dec 3, 2021

Added this paragraph hoping it helps a little more on this subject. Any suggestions to improve it are welcome: https://docs.arcadedb.com/#DockerTuning

@krlmlr
Copy link
Member

krlmlr commented Dec 4, 2021

Thanks. I have now added enough memory, I'm seeing now:

GGSSENCMODE=require psql -d Imported -h localhost -p 5432 -U root
# ... not supported

When running the R code, dbConnect() hangs. I'm running:

con <- DBI::dbConnect(
  RPostgres::Postgres(),
  dbname = 'Imported',
  host = 'localhost',
  port = 5432, 
  user = 'root',
  password = 'playwithdata'
)

The Docker container is also silent.

@krlmlr
Copy link
Member

krlmlr commented Dec 4, 2021

FWIW, once this is fixed, a good way to help us remain compatible is to add a job to the check matrix, see #368.

@lvca
Copy link

lvca commented Dec 5, 2021

@krlmlr you can add this setting to the container to print messages from postgres protocol: -e arcadedb.postgres.debug=true. Note, this has been just added so be sure to use the latest image.

Also, how can I reproduce the same on my side? I never used R before :-)

@krlmlr
Copy link
Member

krlmlr commented Dec 5, 2021

Thanks. To replicate from the R side, you can use the following minimal Dockerfile:

FROM rocker/r-rspm:20.04

RUN R -q -e 'install.packages("pak")'
RUN R -q -e 'pak::pak("RPostgres", dependencies = TRUE)'
RUN R -q -e 'pak::pak(c("pkgload", "pkgbuild"), dependencies = TRUE)'
RUN apt-get update && apt-get install -y git libpq-dev
RUN git clone https://github.com/r-dbi/RPostgres
# Build once
RUN cd RPostgres && R -q -e 'pkgload::load_all()'
# Run code
RUN cd RPostgres && R -q -e 'pkgload::load_all(); con <- dbConnect(Postgres())'

In an interactive Docker shell, use R to start a REPL.

@krlmlr
Copy link
Member

krlmlr commented Dec 5, 2021

I don't know much about arcadedb, happy to review a pull request if we can't fix it with configuration. And of course a new entry to the GitHub Actions check matrix!

@lvca
Copy link

lvca commented Jan 4, 2022

Thanks for your help. The issue was in ArcadeDB and in the way the protocol was managed.

@lvca
Copy link

lvca commented Jan 4, 2022

Tested with this script and works:

.libPaths( c( "/usr/local/lib/R/lib" , .libPaths() ) )

r = getOption("repos")
r["CRAN"] = "http://cran.us.r-project.org"
options(repos = r)

##install.packages(c("RPostgres"), lib="/usr/local/lib/R/lib")
##install.packages(c("DBI"), lib="/usr/local/lib/R/lib")

library(DBI)

#Sys.setenv(PGGSSENCMODE="disable")
#Sys.unsetenv("PGGSSENCMODE")
#Sys.unsetenv("PGSSLMODE")
#Sys.setenv(PGSSLMODE="disable")

con <- dbConnect(
  RPostgres::Postgres(),
  gssencmode = 'disable',
  sslmode = "disable",
  dbname = 'Beer',
  host = 'localhost',
  port = 5432,
  user = 'root',
  password = 'playwithdata'
)

print("EXECUTING QUERY")

res <- dbSendQuery(con, "select * from Brewery limit 10")

print("FETCHING RESULTS")

dbFetch(res)

print("CLEARING RESULTS")
dbClearResult(res)
dbDisconnect(con)

@krlmlr
Copy link
Member

krlmlr commented Jan 5, 2022

Thanks! What's the best way to set up ArcadeDB on GitHub Actions? Would you like to contribute an entry to our check matrix?

@lvca
Copy link

lvca commented Jan 5, 2022

@robfrank what's the best way to do that?

@mskyttner
Copy link
Author

mskyttner commented Jan 5, 2022

@lvca I tried pulling the latest image and these commands were successful:

# run arcadedb locally
docker run --rm \
	-p "2480:2480" \
	-p "2424:2424" \
	-p "5432:5432" \
	--env ARCADEDB_OPTS_MEMORY="-Xms3G -Xmx3G" \
	--env arcadedb.postgres.debug=true \
	--env arcadedb.server.rootPassword=playwithdata \
	--env "arcadedb.server.defaultDatabases=Imported[root]{import:https://github.com/ArcadeData/arcadedb-datasets/raw/main/orientdb/OpenBeer.gz}" \
	--env arcadedb.server.plugins="Postgres:com.arcadedb.postgres.PostgresProtocolPlugin" \
	arcadedata/arcadedb:latest

# install psql 
sudo apt install postgresql-client-common postgresql-client-14  # install psql

# query
PGGSSENCMODE=disable PGPASSWORD=playwithdata psql -d Imported -h localhost -p 5432 -U root -c 'select name from Beer limit 1' -o /dev/stdout

Omitting the PGGSSENCMODE environment variable I could again see received invalid response to GSSAPI negotiation: R

Using a Dockerfile for a more reproducible environment I tried this variant of what @krlmlr recommended earlier, it just adds psql:

FROM rocker/r-rspm:20.04

RUN R -q -e 'install.packages("pak")'
RUN R -q -e 'pak::pak("RPostgres", dependencies = TRUE)'
RUN R -q -e 'pak::pak(c("pkgload", "pkgbuild"), dependencies = TRUE)'

RUN apt-get update && apt-get install -y --no-install-recommends \
	git \
	libpq-dev \
	postgresql-client-common \
	postgresql-client-12 \
	tini

RUN git clone https://github.com/r-dbi/RPostgres && \
	cd RPostgres && R -q -e 'pkgload::load_all()'

WORKDIR RPostgres
ENTRYPOINT ["/usr/bin/tini", "-g", "--"]
CMD R

After building the image locally with docker build -t rcade . I get an interactive R prompt with docker run --rm -it rcade which can be used for testing the connection. Find two oneliners for using the image above to test a basic query against arcadedb below.

pgsql

Running pgsql seems to work (omittig the PGGSSENCMODE setting does not even cause any issues or warnings it seems):

# note the docker option below for using network=host in order to be able to access the arcadedb instance running on the host from inside the container
docker run --rm --network=host -it rcade bash -c "PGGSSENCMODE=disable PGPASSWORD=playwithdata psql -d Imported -h localhost -p 5432 -U root -c 'select name from Beer limit 1' -o /dev/stdout"

    name     
-------------
 Hocus Pocus
(1 row)

RPostgres

The RPostgres connection returns the same data (with some warnings) for the same simple test query:

docker run --network=host -it rcade R -e "library(DBI);library(RPostgres);con <- dbConnect(RPostgres::Postgres(), gssencmode = 'disable', sslmode = 'disable', dbname = 'Imported', host = 'localhost', port = 5432, user = 'root', password = 'playwithdata');dbGetQuery(con, 'select name from Beer limit 1')"

System has not been booted with systemd as init system (PID 1). Can't operate.
Failed to create bus connection: Host is down
Warning messages:
1: In system("timedatectl", intern = TRUE) :
  running command 'timedatectl' had status 1
2: Invalid time zone 'UTC', falling back to local time.
Set the `timezone` argument to a valid time zone.
CCTZ: Unrecognized timezone of the input vector: "" 
3: In result_fetch(res@ptr, n = n) :
  Don't need to call dbFetch() for statements, only for queries
         name
1 Hocus Pocus

Actually, leaving out settings for gssencmode and sslmode also seems to "work" if we disregard those warnings (PID1, "host is down", SET statements vs query and dbFetch), if trying this:

docker run --network=host -it rcade R -e "library(DBI);library(RPostgres);con <- dbConnect(Postgres(), dbname = 'Imported', host = 'localhost', port = 5432, user = 'root', password = 'playwithdata');dbGetQuery(con, 'select name from Beer limit 1')"

Logs from arcadedb when RPostgres query runs

A couple of statements related to timezone setting etc seem may be the cause of those warnings?

2022-01-05 11:10:41.516 INFO  [PostgresNetworkExecutor] PSQL:-> request for password (R - 8b) (thread=51)
2022-01-05 11:10:41.516 INFO  [PostgresNetworkExecutor] PSQL:-> authentication ok (R - 8b) (thread=51)
2022-01-05 11:10:41.517 INFO  [PostgresNetworkExecutor] PSQL:-> backend key data (K - 12b) (thread=51)
2022-01-05 11:10:41.517 INFO  [PostgresNetworkExecutor] PSQL:-> parameter status (S - 24b) (thread=51)
2022-01-05 11:10:41.517 INFO  [PostgresNetworkExecutor] PSQL:-> parameter status (S - 25b) (thread=51)
2022-01-05 11:10:41.517 INFO  [PostgresNetworkExecutor] PSQL:-> parameter status (S - 25b) (thread=51)
2022-01-05 11:10:41.517 INFO  [PostgresNetworkExecutor] PSQL:-> ready for query (Z - 5b) (thread=51)
2022-01-05 11:10:41.517 INFO  [PostgresNetworkExecutor] PSQL: query -> set client_encoding to 'UTF-8' (thread=51)
2022-01-05 11:10:41.518 INFO  [PostgresNetworkExecutor] PSQL:-> ready for query (Z - 5b) (thread=51)
2022-01-05 11:10:41.520 INFO  [PostgresNetworkExecutor] PSQL: query -> SET datestyle to 'iso, mdy' (thread=51)
2022-01-05 11:10:41.520 INFO  [PostgresNetworkExecutor] PSQL:-> row description (T - 6b) (thread=51)
2022-01-05 11:10:41.520 INFO  [PostgresNetworkExecutor] PSQL:-> command complete (C - 5b) (thread=51)
2022-01-05 11:10:41.520 INFO  [PostgresNetworkExecutor] PSQL:-> ready for query (Z - 5b) (thread=51)
2022-01-05 11:10:41.521 INFO  [PostgresNetworkExecutor] PSQL: query -> SET TIMEZONE = 'UTC' (thread=51)
2022-01-05 11:10:41.521 INFO  [PostgresNetworkExecutor] PSQL:-> row description (T - 6b) (thread=51)
2022-01-05 11:10:41.522 INFO  [PostgresNetworkExecutor] PSQL:-> command complete (C - 5b) (thread=51)
2022-01-05 11:10:41.522 INFO  [PostgresNetworkExecutor] PSQL:-> ready for query (Z - 5b) (thread=51)
2022-01-05 11:10:41.696 INFO  [PostgresNetworkExecutor] PSQL: query -> SELECT oid, typname FROM pg_type (thread=51)
2022-01-05 11:10:41.696 INFO  [PostgresNetworkExecutor] PSQL:-> row description (T - 6b) (thread=51)
2022-01-05 11:10:41.696 INFO  [PostgresNetworkExecutor] PSQL:-> command complete (C - 13b) (thread=51)
2022-01-05 11:10:41.696 INFO  [PostgresNetworkExecutor] PSQL:-> ready for query (Z - 5b) (thread=51)
2022-01-05 11:10:41.698 INFO  [PostgresNetworkExecutor] PSQL: parse (portal=) -> select name from Beer limit 1 (params=0) (errorInTransaction=false thread=51)
2022-01-05 11:10:41.698 INFO  [PostgresNetworkExecutor] PSQL:-> parse complete (1 - 4b) (thread=51)
2022-01-05 11:10:41.698 INFO  [PostgresNetworkExecutor] PSQL: sync (thread=51)
2022-01-05 11:10:41.698 INFO  [PostgresNetworkExecutor] PSQL:-> ready for query (Z - 5b) (thread=51)
2022-01-05 11:10:41.698 INFO  [PostgresNetworkExecutor] PSQL: describe '' type=S (errorInTransaction=false thread=51)
2022-01-05 11:10:41.698 INFO  [PostgresNetworkExecutor] PSQL:-> no data (n - 4b) (thread=51)
2022-01-05 11:10:41.698 INFO  [PostgresNetworkExecutor] PSQL: sync (thread=51)
2022-01-05 11:10:41.698 INFO  [PostgresNetworkExecutor] PSQL:-> ready for query (Z - 5b) (thread=51)
2022-01-05 11:10:41.699 INFO  [PostgresNetworkExecutor] PSQL: bind (portal=) ->  (thread=51)
2022-01-05 11:10:41.699 INFO  [PostgresNetworkExecutor] PSQL:-> bind complete (2 - 4b) (thread=51)
2022-01-05 11:10:41.699 INFO  [PostgresNetworkExecutor] PSQL: describe '' type=P (errorInTransaction=false thread=51)
2022-01-05 11:10:41.699 INFO  [PostgresNetworkExecutor] PSQL:-> row description (T - 29b) (thread=51)
2022-01-05 11:10:41.700 INFO  [PostgresNetworkExecutor] PSQL: execute (portal=) (limit=0)-> select name from Beer limit 1 (thread=51)
2022-01-05 11:10:41.700 INFO  [PostgresNetworkExecutor] PSQL:-> 1 row data (22b) (thread=51)
2022-01-05 11:10:41.700 INFO  [PostgresNetworkExecutor] PSQL:-> command complete (C - 13b) (thread=51)
2022-01-05 11:10:41.700 INFO  [PostgresNetworkExecutor] PSQL: sync (thread=51)
2022-01-05 11:10:41.700 INFO  [PostgresNetworkExecutor] PSQL:-> ready for query (Z - 5b) (thread=51)

@mskyttner
Copy link
Author

mskyttner commented Jan 5, 2022

Thanks! What's the best way to set up ArcadeDB on GitHub Actions? Would you like to contribute an entry to our check matrix?

I'm not sure about the best way, but here is a stab at an GHA which sets up ArcadeDB with example data and tests the connection using psql:

https://github.com/mskyttner/rcade/blob/master/.github/workflows/arcadedb-connection.yaml

@lvca
Copy link

lvca commented Jan 5, 2022

About the warnings above, all the SET commands are simply ignored (even though some settings could be used, as the timezone - I'm checking):

  1. What generates the warning 1: In system("timedatectl", intern = TRUE) : running command 'timedatectl' had status 1?

  2. Are you reading the column CCTZ from a specific query for '2: Invalid time zone 'UTC', falling back to local time. Set the timezone argument to a valid time zone. CCTZ: Unrecognized timezone of the input vector: ""' ?

  3. Is the warning 3: In result_fetch(res@ptr, n = n) : Don't need to call dbFetch() for statements, only for queries generated by the driver for a particular reason I can lookup?

@mskyttner
Copy link
Author

Hmmm... I think the log entries in the arcadedb log come from running this R code: dbGetQuery(con, 'select name from Beer limit 1'). I made an assumption here that I'm running a simple "select type" sql query (that is why I used dbGetQuery), but under the hood the log shows that there several and various queries and statements made, and I'm not sure whether those happen on the arcadedb side (some probably?) or on the RPostgres side (some maybe?). Maybe using dbExecute would then have been more appropriate, I'm not sure, but when trying it I got a arcadedb log entry "com.arcadedb.postgres.PostgresProtocolException: Error on sending error 'Error on executing query: Socket closed' to the client").

Just speculating here, better insights can probably be provided by @krlmlr - I'm just a humble user :) - but for the first point, the timedatectl (even if it is present in the container) might require systemd running which it isn't and probably couldn't be expected to at least not inside a container (I have seen that some containers in this community often use supervisord or s6-init instead when there are such needs)?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants