-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Speed up and enhance reading from (and writing to) a GRASS GIS database #93
Conversation
vect() supports a missing layer argument, so supporting this too for GDAL-GRASS driver. Also, for efficiency reasons, this commit only requests layer names in case of polygons, to work around OSGeo/gdal-grass#46.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am deeply sceptical about this for the following reasons:
terra
must be built against a GDAL installation that knows about the plugin for its GDAL version and platform. GDAL cannot be built in this way unless all the GRASS libraries it needs also being built. This may work for source installs on Linux/Unix, but cannot work for MXE/Rtools for Windows or for macOS/recipes, because usingterra
orsf
or other R packages building against GDAL do not need that large extra burden. Using OSGEO4W will not work as MXE/Rtools build static for the chosen compilers using UCRT, but OSGEO4W has a mess of CRTs.- Previous experience even with the plugin was a nightmare, as the GDAL+plugin constantly got out of synch with other components. Since the plugin was dropped by GDAL, I haven't even tried to touch it. Any modifications would need very careful prior attention to all the build scenarios. Saying that the user has to sort these out is not going to work.
terra
doesn't read raster data at all, but rather sets up an environment in C++ code pointing to the data source layer, unless forced to thein.memory
setting. I really don't see any difference in the sense that when the"spatRaster"
isin.memory
,terra
shadows the memory by a temporary file anyway, so instead ofr.out.gdal
thenrast
setting up pointers to the file created byr.out.gdal
,terra
would likely set up pointers to the single GRASS raster, but create a shadow file itself should the data be needed in the R workspace.- The
vect
case is slightly different, becauseterra
always reads into a"spatVector"
object, but I'm unsure whether there is always a shadow file. - If 1 and 2 can be shown to be feasible within reasonable developer time constraints. careful tests would be needed to show that the inclusion of the unproven (read only) driver is robust and scales.
- Given more time, I could find even more reasons for not pursuing the plugin; data transfer speed really isn't important - the chosen raster format
"RRASTER"
is not compressed and is native forterra
, and was upgraded by me to supportWKT2-2019
.
Point 1 by @rsbivand is an important one regarding winGRASS. Different GDAL versions built by different compilers may interfere. That's a packaging issue. Though I find it an interesting approach worth to be followed. The fallbacks are the methods currently used? |
@hellik yes, if GDAL is not aware of the GDAL-GRASS driver (so indeed it needs to be set up correctly), the current implementation is used. In that case GRASS GIS is called with For a binary driver installation that works out of the box on Ubuntu-based systems, one can use the ubuntugis-unstable PPA, which is where the official GRASS GIS release lives, and install These are the driver issues encountered a few days ago and posted at the driver repo, with help of @neteler: OSGeo/gdal-grass#46, OSGeo/gdal-grass#47, OSGeo/gdal-grass#48. |
So also this is bringing back code for example at: https://r-forge.r-project.org/scm/viewvc.php/pkg/rgrass7/R/bin_link.R?root=spgrass#l330, dropped on transfer to github. It was dropped because effectively nobody used the plugin, and because it was read-only. Please try some comparisons writing objects from R to GRASS using the plugin, as I think you suggest that it can write. I understand that this will only work for users with |
Running |
No, it's only relevant to In relation to the driver, Apart from that, this PR also adds some
GDAL needs no specific configuration to use the driver; it uses its autoload mechanism to detect the driver and make it available. See https://github.com/OSGeo/gdal-grass/blob/main/README.md:
On my system it was sufficient for {terra} to have been compiled against the GDAL version that was in place before the additional driver was installed. Also no additional $ locate gdalplugins | grep GRASS
/usr/lib/x86_64-linux-gnu/gdalplugins/gdal_GRASS.so
/usr/lib/x86_64-linux-gnu/gdalplugins/ogr_GRASS.so |
Thanks. It's introduced by 682c7fc, 'needed' in order to not get the same error on my system, where there's no So since this error now pops up on your side I guess on your machine the The solution will be to be explicit about the mapset where new layers will be written. I suggest that we create a custom mapset explicitly in the examples, so that written layers end up there, not in |
Yes, @florisvdh creating a mapset to write into is a good idea in the examples; will you implement it? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested the PR and works nicely, thanks @florisvdh! I just needed to create an .Renviron
file with the GDAL_DRIVER_PATH="/usr/lib/gdalplugins"
variable in $HOME
and update terra
to the latest version.
In the course of this fiddling with mapsets, it was seen (by accident) that the mapset region settings are not respected by the GDAL driver. This only affects raster. This is a no-go for the GDAL-GRASS raster driver as long as it doesn't expose an option to respect the mapset's current region, or at least to set a region. A feature request will be made at the driver's repo, but as long as it's not implemented, we cannot consider using the driver for rasters since results with and without using the driver must of course have the same extent and resolution. For sake of completeness, this commit takes control of the region too, in order to guarantee that things always work. But this will be dropped from the examples in a next commit, followed by dropping driver-related code from read_RAST() and write_RAST().
It was seen (by accident) that the mapset region settings are not respected by the GDAL driver. This only affects raster. This is a no-go for the GDAL-GRASS raster driver as long as it doesn't expose an option to respect the mapset's current region, or at least to set a region. A feature request will be made at the driver's repo, but as long as it's not implemented, we cannot consider using the driver for rasters since results with and without using the driver must of course have the same extent and resolution. So next is dropping driver-related code from read_RAST() and write_RAST().
, 4389249) * In the course of this fiddling with mapsets, it was seen (by accident) that the mapset region settings are not respected by the GDAL driver. This only affects raster. This is a no-go for the GDAL-GRASS raster driver as long as it doesn't expose an option to respect the mapset's current region, or at least to set a region. A feature request has been made at the driver's repo (OSGeo/gdal-grass#49). As long as it's not implemented, we cannot consider using the driver for rasters since results with and without using the driver must of course have the same extent and resolution.
Thanks for testing @veroandreo ! |
@rsbivand Setting a mapset in the examples of this PR has now been taken care of (in I discovered that the |
While fiddling with mapsets and regions yesterday, it was discovered that the GDAL-GRASS driver does not respect a mapset's region. It just reads out a raster layer as it is stored in the GRASS GIS database. This is an important difference with As this causes inconsistency between the Consequently I have reverted the use of the GDAL-GRASS raster driver in this PR (a828acf). A corresponding driver feature request has been posted at OSGeo/gdal-grass#49, so at least awaiting a solution for that one before reconsideration. For the driver it makes sense to take into account the region, since this GDAL driver is tailored for GRASS GIS. Thanks @neteler and @veroandreo for the discussion about this yesterday. Note that for vector datasets the region is not considered in GRASS GIS, so only the raster driver approach has been reverted. |
@florisvdh looks OK now, would you like to merge? |
OK, let's do that. I'll add updates to |
@rsbivand you can of course rephrase |
This PR attempts to address the following:
read_RAST()
,read_VECT()
: implement the standalone GDAL-GRASS drivers by default. This avoids exporting a GRASS GIS layer to a temporary file before reading into R. As a result one will get higher reading rates; this effect will increase with the size of the layer. Notes:read_RAST()
this has only been implemented for the terra case, as it is the default.v.out.ogr
andr.out.gdal
pathways (writing a temporary file) remain in place when the necessary GDAL-GRASS driver is missing or if theuse_gdal_grass_driver
argument is set asFALSE
.write_RAST()
andwrite_VECT()
will prevent calls to GRASS GIS to import such a source, as this would not succeed.read_VECT()
: provide support for theSpatVectorProxy
class by adding aproxy
argument (seeterra::query()
to useSpatVectorProxy
in R).ignore.stderr
andSys_ignore.stdout
: implement them in the defaultexecGRASS()
approach (usessystem2()
) – they were only implemented in the legacy approach – and make sure both arguments are exposed byread_RAST()
andread_VECT()
.write_VECT()
: when theSpatVector
object refers to a source file, a temporary GeoPackage is no longer written and the file is used directly byv.in.ogr
. A similar shortcut was already implemented forwrite_RAST()
.Further testing, especially with Windows and macOS are most welcome!
A small benchmark is shown below.
Code to create the benchmark
Created on 2024-06-16 with reprex v2.1.0
Some system specs:
$ ogrinfo --version GDAL 3.8.4, released 2024/02/08 $ $ apt policy libgdal-grass libgdal-grass: Installed: 1:1.0.2-7+jammy3 Candidate: 1:1.0.2-7+jammy3 Version table: *** 1:1.0.2-7+jammy3 500 500 http://ppa.launchpad.net/ubuntugis/ubuntugis-unstable/ubuntu jammy/main amd64 Packages 100 /var/lib/dpkg/status 3.4.1-3 500 500 http://ftp.belnet.be/ubuntu jammy/universe amd64 Packages $ $ inxi -Sxxx System: Host: xxx Kernel: 5.15.0-112-generic x86_64 bits: 64 compiler: gcc v: 11.4.0 Desktop: Cinnamon 6.0.4 tk: GTK 3.24.33 wm: muffin vt: 7 dm: LightDM 1.30.0 Distro: Linux Mint 21.3 Virginia base: Ubuntu 22.04 jammy