Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pagination using limit and offset query parameters #5

Open
peterhartman-90poe opened this issue Oct 24, 2023 · 14 comments
Open

Pagination using limit and offset query parameters #5

peterhartman-90poe opened this issue Oct 24, 2023 · 14 comments

Comments

@peterhartman-90poe
Copy link

peterhartman-90poe commented Oct 24, 2023

I have an API that achieves pagination using query parameters in the URL. The maximum items returned is 50 so an offset/skip is needed to retrieve the full dataset

Is this possible using the library as-is? I have hacked it for now by adding a "skip" parameter to the resource that gets passed as a filter

read(:list_companies) do
    argument :skip, :integer do
        allow_nil? true
        default 0
    end

    filter expr(skip == ^arg(:skip))
end

The limit is working nicely as-is:

limit_with {:param, "limit"}
@zachdaniel
Copy link
Collaborator

You should be able to do this with the "paginator" logic.

You would define an AshJsonApiWrapper.Paginator, and then use that in an endpoint, i.e paginator YourPaginator. The idea is that you'd specify offset and limit as normal and it would page through until it had as many as you've requested. Then, you could do something like this in the action:

read(:list_companies) do
  pagination do
    offset? true
    required? true
    default_limit 50
  end
end

And then you could use Ash's builtin pagination which would map back to the API's pagination.

@peterhartman-90poe
Copy link
Author

Unfortunately in the process of trying to build my own paginator I keep getting

** (KeyError) key :runtime_sort? not found in: nil

Attached is a livebook that recreates the problem:
paginate-20231024.livemd.zip

I have hit a wall trying to debug this myself unfortunately. Here is the output of dbg() around sort() in the datalayer

[(ash_json_api_wrapper 0.1.0) lib/data_layer/data_layer.ex:323: AshJsonApiWrapper.DataLayer.sort/3]
query #=> %AshJsonApiWrapper.DataLayer.Query{
  api: Api,
  context: %{
    private: %{authorize?: false, in_before_action?: true},
    action: %Ash.Resource.Actions.Read{
      arguments: [],
      description: nil,
      filter: nil,
      get_by: [],
      get?: false,
      manual: nil,
      metadata: [],
      modify_query: nil,
      name: :list_users,
      pagination: %Ash.Resource.Actions.Read.Pagination{
        default_limit: 50,
        max_page_size: 250,
        countable: false,
        required?: true,
        keyset?: false,
        offset?: true
      },
      preparations: [],
      primary?: true,
      touches_resources: [],
      transaction?: false,
      type: :read
    },
    authorize?: false,
    actor: nil,
    query_opts: [
      verbose?: false,
      actor: nil,
      authorize?: false,
      page: nil,
      return_query?: false
    ],
    page_opts: [limit: 50],
    initial_query: #Ash.Query<resource: Users, sort: [id: :asc], limit: 251>,
    filter_requests: [],
    initial_limit: nil,
    initial_offset: 0
  },
  headers: [],
  action: %Ash.Resource.Actions.Read{
    arguments: [],
    description: nil,
    filter: nil,
    get_by: [],
    get?: false,
    manual: nil,
    metadata: [],
    modify_query: nil,
    name: :list_users,
    pagination: %Ash.Resource.Actions.Read.Pagination{
      default_limit: 50,
      max_page_size: 250,
      countable: false,
      required?: true,
      keyset?: false,
      offset?: true
    },
    preparations: [],
    primary?: true,
    touches_resources: [],
    transaction?: false,
    type: :read
  },
  limit: nil,
  offset: 0,
  filter: nil,
  runtime_filter: nil,
  path: "https://65383945a543859d1bb1528e.mockapi.io/api/v1",
  query_params: %{},
  body: nil,
  sort: nil,
  endpoint: nil,
  templates: nil,
  override_results: nil
}

@zachdaniel
Copy link
Collaborator

I haven't looked into it too deeply, but I believe what we need to do is set an endpoint on the query in the set_context callback. Something like this:

endpoint: AshJsonApiWrapper.DataLayer.Info.endpoint(resource, action.name),

@zachdaniel
Copy link
Collaborator

The sort callback is expecting that to be there, but nothing is setting it into the query currently.

@peterhartman
Copy link
Contributor

peterhartman commented Oct 26, 2023

I think I have resolved :runtime_sort? not found in: nil issue with #6

I am still forced to add :runtime_sort? true to the endpoint if I add pagination. Is that expected?
Without it I get * Sorting is not supported

It will mean that I always make a request for all pages even if the limit is much lower which will be expensive.

@zachdaniel
Copy link
Collaborator

Hmm...It depends on if we've added support for mapping sort parameters or not. If we have, then we can do as much of the sorting on the endpoint as possible. If not we'll need to add that first.

@peterhartman
Copy link
Contributor

To check my understanding, Ash itself is forcing sorting (of some kind) when you paginate to ensure the same record doesn't appear in 2 different pages? eg because data is added in between calls to the data source

@zachdaniel
Copy link
Collaborator

yep. I think we could potentially eliminate that requirement though. Sorting while paginating is only required in certain contexts, and technically limit/offset pagination does not require sorting. It just leads to potentially inconsistent results later based on arbitrary changes, instead of something reasonable.

@zachdaniel
Copy link
Collaborator

This logic:

  defp do_paginate(query, pagination, opts) do
    # We want to make 100% sure that there is a stable sort at the end
    # of the sort for pagination
    query =
      if Ash.Actions.Sort.sorting_on_identity?(query) do
        query
      else
        Ash.Query.sort(query, Ash.Resource.Info.primary_key(query.resource))
      end

    paginated =
      cond do
        opts[:page][:before] || opts[:page][:after] ->
          keyset_pagination(query, pagination, opts[:page])

        opts[:page][:offset] ->
          limit_offset_pagination(query, pagination, opts[:page])

        pagination.offset? && pagination.keyset? ->
          keyset_pagination(query, pagination, opts[:page])

        pagination.offset? ->
          limit_offset_pagination(query, pagination, opts[:page])

        true ->
          keyset_pagination(query, pagination, opts[:page])
      end

    case paginated do
      {:ok, initial_query, query} ->
        if opts[:page][:filter] do
          {:ok, Ash.Query.filter(initial_query, ^opts[:page][:filter]),
           Ash.Query.filter(query, ^opts[:page][:filter])}
        else
          {:ok, initial_query, query}
        end

      {:error, error} ->
        {:error, error}
    end
  end

in Ash.Actions.Read is where we are doing this. My first thought was to ask the data layer if it supports sorting, but in this case it does support sorting, it just has a significant cost 😆. We could ask the data layer if "sorting on primary key is cheap" like it is with postgres? Something like Ash.DataLayer.can?(resource, :cheap_primary_key_sort? ). We'd need to add that capability to each existing data layer though...

@peterhartman
Copy link
Contributor

I think we also need a new callback for the paginator behaviour that allows the initial page to be requested, something like:

defmodule AshJsonApiWrapper.Paginator do
  @moduledoc """
  Behavior for scanning pages of a paginated endpoint.
  """

  @type ref :: {module, Keyword.t()}

  defmacro __using__(_) do
    quote do
      @behaviour AshJsonApiWrapper.Paginator
    end
  end

# vvvvvv
  @callback start(
              opts :: Keyword.t()
            ) :: {:ok, %{optional(:params) => map, optional(:headers) => map}}
# ^^^^^^

  @callback continue(
              response :: term,
              entities :: [Ash.Resource.record()],
              opts :: Keyword.t()
            ) :: {:ok, %{optional(:params) => map, optional(:headers) => map}} | :halt
end

@zachdaniel
Copy link
Collaborator

Yeah, that makes sense 👍

@peterhartman
Copy link
Contributor

peterhartman commented Oct 29, 2023

While testing I would expect the following to pass:

users =
      Users
      |> Ash.Query.for_read(:list_users)
      |> Ash.Query.limit(2)
      |> Api.read!()

users2 =
      Users
      |> Ash.Query.for_read(:list_users)
      |> Api.read!(page: [limit: 2, offset: 1])

users_count = users.results |> Enum.count()
users2_count = users2.results |> Enum.count()

assert(users_count == users2_count)

but it fails (possibly/probably be cause of this: https://github.com/ash-project/ash/blob/780eae8d69b750992515470d95003ad76c540956/lib/ash/actions/read.ex#L2778C1-L2779C1)

Assertion with == failed
     code:  assert users_count == users2_count
     left:  3
     right: 2

Why do we add one to the limit in the first case?

peterhartman added a commit to peterhartman/ash_json_api_wrapper that referenced this issue Oct 29, 2023
@zachdaniel
Copy link
Collaborator

🤔 something is interesting there. So we do add one to the limit when paginating so we can tell you if there are more results. But we should also remove that extra one.

@zachdaniel
Copy link
Collaborator

Okay, I believe I've fixed the issue you found in core where we are including the extra result in the set. We still need to do something to remove the requirement on sorting for pagination on a case by case basis I believe.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants