Time and memory-efficient generation of huge spatio-temporal fields via conditional distributions #292

joshua-benabou · 2022-12-12T23:55:36Z

joshua-benabou
Dec 12, 2022

Hello,

I have been using GSTools to generate stationary Gaussian spatio-temproal 3D vector fields in 3 space+ 1 time dimension, following the discussion #251.

However, for e.g a 512^3 spatial grid and 20 points in time, I get an out-of-memory error when attempting to generate such a field F(x,t) with e.g

dim_x, dim_t  = 512, 20
cor_x, cor_t = 0.1, 0.1

model = gs.Gaussian(dim=4, var=1, len_scale=[cor_x, cor_x, cor_x, cor_t]) 
x, y, z, u = np.linspace(0,1,dim_x), np.linspace(0,1,dim_x), np.linspace(0,1,dim_x), np.linspace(0,1,dim_t) 

srf = gs.SRF(model, generator='IncomprRandMeth',  vec_dim=3) # incompressible vector field
F = srf((x, y, z, u), mesh_type='structured')

Since the comptuation scales cubically with the spatial grid sidelength , this should take on the order of hours without parrallelization (which would only give an improvement linear in the number of cores used). The field itself would occupy 64 Gb (my machine has more than that), but it is the intermediate computations GSTools which cause the out-of-memory, and exceed 64 Gb by at least a factor of two.

Am I missing some functionality of GSTools that would make this computation feasible, or was GSTools not designed to generate such large fields?

If not, I would I am curious if "conditional" field generation might reduce time and memory complexity. By this I mean: instead of drawing in one go the field at all dim_t time-points a dim_x**3 space points , one could first draw the field at time t1, then at t2 using the covariance matrix for the conditional distribution of the field F(t2) given F(t1). Since GSTools constructs the field in Fourier space, the cov matrix of the conditional distribution would probably need to be FFT'd. This is somewhat similar to kriging.

In that way, the covariance matrices used only involve two consecutive time-points, instead of all dim_t.

Has this been considered by the developpers previously? Would this be feasible to implement with the existing code structure? If yes, I would gladly have a go.

Thanks,

josh benabou

LSchueler · 2022-12-14T12:46:18Z

LSchueler
Dec 14, 2022
Maintainer

Hi,

yeah, using Python to write memory efficient code can be pretty challenging. If I find some time, I will have a look at the code, but it somehow seems reasonable to me, that GSTools needs about twice as much memory as an ideal calculation yields.

Did you have a look at this tutorial?
Without having a look at the equations I think this should also work with vector fields. - But I was wrong before with my intuition 😊

In case that does not work for whatever reason, you could also have a look at the chunk-code we used to have for the generation of the fields. Maybe that is a good starting point. The code below is copied from commit aa68b96 from 2018-12-06:

        # Test to see if enough memory is available.
        # In case there isn't, divide Fourier modes into 2 smaller chunks
        while True:
            try:
                chunk_no = 2 ** chunk_no_exp
                chunk_len = int(np.ceil(self._mode_no / chunk_no))
                if self.verbose:
                    print(
                        "RandMeth: Generating field with "
                        + str(chunk_no)
                        + " chunks"
                    )
                    print("(chunk length " + str(chunk_len) + ")")
                for chunk in range(chunk_no):
                    if self.verbose:
                        print(
                            "chunk " + str(chunk + 1) + " of " + str(chunk_no)
                        )
                    ch_start = chunk * chunk_len
                    # In case k[d,ch_start:ch_stop] with
                    # ch_stop >= len(k[d,:]) causes errors in
                    # numpy, use the commented min-function below
                    # ch_stop = min((chunk + 1) * chunk_len, self._mode_no-1)
                    ch_stop = (chunk + 1) * chunk_len

                    if self.dim == 1:
                        phase = self._cov_sample[0, ch_start:ch_stop] * x
                    elif self.dim == 2:
                        phase = (
                            self._cov_sample[0, ch_start:ch_stop] * x
                            + self._cov_sample[1, ch_start:ch_stop] * y
                        )
                    else:
                        phase = (
                            self._cov_sample[0, ch_start:ch_stop] * x
                            + self._cov_sample[1, ch_start:ch_stop] * y
                            + self._cov_sample[2, ch_start:ch_stop] * z
                        )
                    summed_modes += np.squeeze(
                        np.sum(
                            self._z_1[ch_start:ch_stop] * np.cos(phase)
                            + self._z_2[ch_start:ch_stop] * np.sin(phase),
                            axis=-1,
                        )
                    )
            except MemoryError:
                chunk_no_exp += 1
                print(
                    "Not enough memory. Dividing Fourier modes into {} "
                    "chunks.".format(2 ** chunk_no_exp)
                )
            else:
                # we break out of the endless loop if we don't get MemoryError
                break

But I really hope the simple merging of smaller fields works straight out of box. Please keep us updated on this :-)

0 replies

LSchueler · 2022-12-14T12:52:13Z

LSchueler
Dec 14, 2022
Maintainer

Actually, the code above was refactored a little bit, you should rather have a look at commit 8daaeb8:

def _chunk_it(self, pos, summed_modes, phase):
        """Calls the actual randomization method, but divides the mode axes
           into smaller chunks in case the memory runs full.

        Parameters
        ----------
        pos : :class:`list`
            a list containing the position arrays `x`, `y`, and `z` (in weird shapes)
        summed_modes : :class:`numpy.ndarray`
            the zero-initialised result array, the shape needs to be known
        phase : :class:`numpy.ndarray`
            the zero-initialised phase array, the shape needs to be known

        Returns
        -------
        summed_modes : :class:`numpy.ndarray`
            the summed_modes array chunk for a given memory amount
        """
        if not np.all(np.isclose(summed_modes, 0.0)):
            raise ValueError("summed_modes is not initialized with zeros.")
        if not np.all(np.isclose(phase, 0.0)):
            raise ValueError("phase is not initialized with zeros.")

        # make a guess fo the chunk_no according to the input
        tmp_pnt = np.prod(summed_modes.shape) * self._mode_no
        chunk_no_exp = int(
            max(0, np.ceil(np.log2(tmp_pnt / self.chunk_tmp_size)))
        )
        # Test to see if enough memory is available.
        # In case there isn't, divide Fourier modes into 2**n smaller chunks
        while True:
            try:
                chunk_no = 2 ** chunk_no_exp
                chunk_len = int(np.ceil(self._mode_no / chunk_no))
                if self.verbose:
                    print(
                        "RandMeth: Generating field with "
                        + str(chunk_no)
                        + " chunks"
                    )
                    print("(chunk length " + str(chunk_len) + ")")
                for chunk in range(chunk_no):
                    if self.verbose:
                        print(
                            "chunk " + str(chunk + 1) + " of " + str(chunk_no)
                        )
                    ch_start = chunk * chunk_len
                    # In case k[d,ch_start:ch_stop] with
                    # ch_stop >= len(k[d,:]) causes errors in
                    # numpy, use the commented min-function below
                    # ch_stop = min((chunk + 1) * chunk_len, self._mode_no-1)
                    ch_stop = (chunk + 1) * chunk_len

                    for d in range(self.model.dim):
                        phase[..., ch_start:ch_stop] += (
                            self._cov_sample[d, ch_start:ch_stop] * pos[d]
                        )

                    self._mode_summand(summed_modes, phase, ch_start, ch_stop)

            except MemoryError:
                chunk_no_exp += 1
                print(
                    "Not enough memory. Dividing Fourier modes into {} "
                    "chunks.".format(2 ** chunk_no_exp)
                )
            else:
                # we break out of the endless loop if we don't get MemoryError
                break
            return summed_modes

0 replies

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Time and memory-efficient generation of huge spatio-temporal fields via conditional distributions #292

{{title}}

{{editor}}'s edit

{{editor}}'s edit

Replies: 2 comments

{{title}}

{{title}}

Select a reply

Time and memory-efficient generation of huge spatio-temporal fields via conditional distributions #292

joshua-benabou Dec 12, 2022

Replies: 2 comments

LSchueler Dec 14, 2022 Maintainer

LSchueler Dec 14, 2022 Maintainer

joshua-benabou
Dec 12, 2022

LSchueler
Dec 14, 2022
Maintainer

LSchueler
Dec 14, 2022
Maintainer