Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add limit and metadata_version parameters to the settings v2 endpoint #830

Closed
samkanga opened this issue May 17, 2021 · 32 comments · Fixed by #838
Closed

Add limit and metadata_version parameters to the settings v2 endpoint #830

samkanga opened this issue May 17, 2021 · 32 comments · Fixed by #838

Comments

@samkanga
Copy link

The requirement is to be able to

  • Pull all settings from the api into the warehouse
  • Get any new records, updates and deletions when they happen (preferably by checking for a newer/higher serverVersion like in the other endpoints)

So in order to implement the above, the changes we'd like are

  1. The returned results need to be paginated - currently the https://<opensrp-server>/rest/v2/settings/?serverVersion=0 endpoint returns all results upto 1000. We would need a way to access the records after the first 1000. Previously this has been done by using the limit parameter together with the serverVersions. So for example fetch the first 100 with serverVersion=0&limit=100 and the next 100 with serverVersion=100&limit=100 ...etc
  2. Add an identifier parameter that will be incrementing - Like in my example above, this would be what the serverVersion is. I'm not sure how serverVersions are currently being assigned but there are many duplicates both hitting the endpoint with just the serverVersion and even when filtering with a specific settings identifier. I noticed we have the settingMetadataId which seems to be a good candidate for this, can you confirm that it is always incremented when we have new records/edits/deletes? and if yes can we add it as a parameter and use it to pull records that way?
@rehammuzzamil
Copy link
Contributor

Based on the quick investigation:

  1. We need to provide pagination support in the /rest/v2/settings GET end-point. Wrt implementation, new fields will be added in SettingSearchBean object, i.e. pageNumber, pageSize, offset, limit
    Based on the input parameters provided by pageNumber and pageSize request params, offset and limit will be calculated and passed to the query to perform Row-bound pagination.

Moreover, If the page number param is not provided, all the records will be returned in one go to ensure backward compatibility.
Otherwise, API needs to be triggered for the next page every time.

  1. Based on my understanding, on the update method; records are updated via repository method "updateByPrimaryKeyAndGenerateServerVersion" in which serverVersion is set to the nextVal of 'core.setting_server_version_seq' which means it works on the principle of increment.

While settings metadata is created from the settings object and values are updated based on the settings key and UUID. It inserts new records in the metadata table if the records are not already there else it updates the records on the same entries.
@bennsimon @dubdabasoduba can you please confirm point no. 2?
cc @samkanga

@bennsimon
Copy link
Member

bennsimon commented May 17, 2021

@rehammuzzamil yeah on update the serverVersion is incremented by adding one to the the current setting_server_version_seq sequence.

Duplicates are occurring since on the setting_metadata table, same server_version is used for one all the settings under a particular identifier.

i.e

            identifier             | server_version 
-----------------------------------+----------------
 inventory_unicef_sections         |              3
 inventory_unicef_sections         |              3
 inventory_unicef_sections         |              3
 inventory_unicef_sections         |              3
 inventory_unicef_sections         |              3
 inventory_unicef_sections         |              3
 inventory_unicef_sections         |              3
 inventory_unicef_sections         |              3
 service_point_types               |              5
 service_point_types               |              5
 service_point_types               |              5
 service_point_types               |              5

@dubdabasoduba Was this the intention?

@Wambere
Copy link

Wambere commented May 17, 2021

A little clarification, the connector needs to be able to pull all settings from the api, and be able to pull any updates/edits after. The plan as of right now is

  • to make the first request https://<opensrp-server>/rest/v2/settings/?settingMetadataId=0&limit=1000 which will return the first 1000 records
  • and any other subsequent request would be changing the settingMetadataId to be the highest value that we have in the warehouse, which would return the next 1000 records (with the value of the settingMetadataId higher than what was passed) and so forth

(this is with the assumption that any updates and new records are assigned a new settingMetadataId which is incremental)

so generally I think we only need to add these 2 parameters to that endpoint.

@Wambere Wambere changed the title Add OpenSRP settings support to Beam Add limit and settingMetadataId parameters to the settings endpoint May 17, 2021
@rehammuzzamil
Copy link
Contributor

Hi @Wambere, I believe settingMetadataId might not be the correct field for this purpose, as the settingMetadataId remains the same in case of an update of the existing record. The data is overridden on the same row, hence settingMetadataId is not changed.

In the case of serverVersion, as @bennsimon mentioned above, all the rows of setting_metadata are using the same server_version.
@dubdabasoduba Please share your thoughts on this.

@dubdabasoduba
Copy link
Member

@rehammuzzamil yeah on update the serverVersion is incremented by adding one to the the current setting_server_version_seq sequence.

Duplicates are occurring since on the setting_metadata table, same server_version is used for one all the settings under a particular identifier.

i.e

            identifier             | server_version 
-----------------------------------+----------------
 inventory_unicef_sections         |              3
 inventory_unicef_sections         |              3
 inventory_unicef_sections         |              3
 inventory_unicef_sections         |              3
 inventory_unicef_sections         |              3
 inventory_unicef_sections         |              3
 inventory_unicef_sections         |              3
 inventory_unicef_sections         |              3
 service_point_types               |              5
 service_point_types               |              5
 service_point_types               |              5
 service_point_types               |              5

@dubdabasoduba Was this the intention?

The reason for this is to maintain compatibility with the v1 of this endpoint. If you updated the serverVersion for one metadata and performed a query using the old v1 endpoint then you would only get the new/updated metadata. The reset won't appear even though they are part of the settings. The mobile apps don't usually use serverVersion = 0 hence the issue.

@dubdabasoduba
Copy link
Member

dubdabasoduba commented May 17, 2021

Hi @Wambere, I believe settingMetadataId might not be the correct field for this purpose, as the settingMetadataId remains the same in case of an update of the existing record. The data is overridden on the same row, hence settingMetadataId is not changed.

In the case of serverVersion, as @bennsimon mentioned above, all the rows of setting_metadata are using the same server_version.
@dubdabasoduba Please share your thoughts on this.

@Wambere I think the edge case @rehammuzzamil is pointing out here is that the data warehouse might have a setting with metadata id e.g 5 not updated since the warehouse will assume the setting was already fetched. We need a way to tell the warehouse these settings were updated.

The solution I have in mind doesn't help @Wambere & team much considering it will do something they already do and are trying to avoid. Nway my thinking is we have a separate table (on the ware house) that stores the latest serverVersion for each setting identifier. Then when fetching using the settingsMetadataId we check if the serverVersion is the same or greater than what we have in the new table for that identifier. if not then we update the whole settings set.

@bennsimon do you think this would work if we moved the same suggestion to OpenSRP instead. Add a serverVersion column to the settings table and always use that as the single source of truth. so if a metadata is updated then the whole setting payload has the same serverVersion updated. This means we remove the serverVersion column from the metadata table.

@bennsimon
Copy link
Member

@dubdabasoduba Why not update the behaviour of the serverVersion in the setting_metadata table to follow convention of other tables on opensrp? i.e get incremented each time a record added or updated.

There is a serverVersion column already on the settings table.

@dubdabasoduba
Copy link
Member

@dubdabasoduba Why not update the behaviour of the serverVersion in the setting_metadata table to follow convention of other tables on opensrp? i.e get incremented each time a record added or updated.

There is a serverVersion column already on the settings table.

It does get incremented. But it increments all the metadata related to the identifier of the updated setting metadata.
The settings jsonb column also has a serverVersion entry. This also bring up some confusion. Hence my reasoning where we remove the column from the metadata table. Then also make sure that the new column on the settings table is always updated incase an update is made either via v1 or v2.

@rehammuzzamil
Copy link
Contributor

@dubdabasoduba Why not update the behaviour of the serverVersion in the setting_metadata table to follow convention of other tables on opensrp? i.e get incremented each time a record added or updated.
There is a serverVersion column already on the settings table.

It does get incremented. But it increments all the metadata related to the identifier of the updated setting metadata.
The settings jsonb column also has a serverVersion entry. This also bring up some confusion. Hence my reasoning where we remove the column from the metadata table. Then also make sure that the new column on the settings table is always updated incase an update is made either via v1 or v2.

Thanks @dubdabasoduba for your clarification. I have ambiguity over here, we still have the same server_version against a particular identifier on all the metadata rows. And we can do the same for the column in the settings table. How can the ETL team get only the updated/newly added records only?

@dubdabasoduba
Copy link
Member

@bennsimon Please add the suggestion you explain to me here. I think we can go with that.

@bennsimon
Copy link
Member

@dubdabasoduba Am thinking we can add a new column say metadata_version that will behave as the conventional server_version on other opensrp tables since changing the behaviour or deleting the column will have effects on v1 or v2 endpoints.

@rehammuzzamil
Copy link
Contributor

Based on the discussion above, the solution will be to add a new column named metadata_version in the settings_metadata table. It will serve the purpose of conventional server_version which will be incremented only in the case of update/add against a particular row/record only. I believe the warehouse team can then pass metadata_version and limit param to get the newly added and updated records by passing the highest value of metadata_version currently available in the data warehouse.

Does it sound okay? @Wambere @dubdabasoduba @bennsimon
Please let me know if there is anything incorrect. Thanks!
cc : @samkanga

@dubdabasoduba
Copy link
Member

Based on the discussion above, the solution will be to add a new column named metadata_version in the settings_metadata table. It will serve the purpose of conventional server_version which will be incremented only in the case of update/add against a particular row/record only. I believe the warehouse team can then pass metadata_version and limit param to get the newly added and updated records by passing the highest value of metadata_version currently available in the data warehouse.

Does it sound okay? @Wambere @dubdabasoduba @bennsimon
Please let me know if there is anything incorrect. Thanks!
cc : @samkanga

I think that's okay. Though we really need to test that against the v1 & v2 endpoints to make sure we maintain the original functionality.

@Wambere
Copy link

Wambere commented Jun 2, 2021

any update on this?

@rehammuzzamil
Copy link
Contributor

any update on this?

@Wambere PRs are under review. The coverage part is remaining, hoping to get it merged by Friday.

@rehammuzzamil
Copy link
Contributor

@Wambere Implementation is completed and is a part of https://github.com/opensrp/opensrp-server-web/releases/tag/v2.8.6-SNAPSHOT.

For your reference:

  • The newly added request parameters are metadata_version and limit. Both parameters are optional.
  • If the limit param is not provided then DEFAULT_FETCH_SIZE is used i.e. 1000
  • If the metadata_version field is provided then records are fetched with the criteria of metadata version greater than the provided one.

cc: @samkanga @bennsimon @dubdabasoduba

@Wambere
Copy link

Wambere commented Jun 11, 2021

@rehammuzzamil it looks like the new column metadata_version is not one of the fields returned from the endpoint? or is it named something different?
Without it I still would not know what value to use to make subsequent requests for next batches.

@rehammuzzamil
Copy link
Contributor

Hi @Wambere , Here is the sample response of Settings V1:

[
    {
        "identifier": "jurisdiction_metadata-risk",
        "settings": [
            {
                "settingMetadataId": "494",
                "serverVersion": 0,
                "description": "Jurisdiction Metadata for Nyaluwiro id 2980",
                "label": "Nyaluwiro metadata",
                "type": "Setting",
                "value": "50",
                "uuid": "6cfe7845-e6a8-45fc-a1fc-75cb7433b576",
                "metadataVersion": 450,
                "key": "2980",
                "settingIdentifier": "jurisdiction_metadata-risk"
            },
            {
                "settingMetadataId": "495",
                "serverVersion": 0,
                "description": "Jurisdiction Metadata for KSB_25 id 3087",
                "label": "KSB_25 metadata",
                "type": "Setting",
                "value": "21",
                "uuid": "2990c48f-2d21-4144-8fdd-d60a2065fc5f",
                "metadataVersion": 451,
                "key": "3087",
                "settingIdentifier": "jurisdiction_metadata-risk"
            },
...
}
...
]

Here is the sample response for settings v2:

[
    {
        "key": "5e4fa3cd-4910-49ca-b17c-07cab16653e3",
        "label": "IIVIYONGO metadata metadata",
        "description": "Jurisdiction Metadata for IIVIYONGO metadata id 5e4fa3cd-4910-49ca-b17c-07cab16653e3",
        "uuid": "e8102f78-df5e-4d87-8d14-553c201e4094",
        "settingsId": "72",
        "settingIdentifier": "jurisdiction_metadata-target",
        "settingMetadataId": "7068",
        "v1Settings": false,
        "resolveSettings": false,
        "documentId": "31956568-1f96-4939-9129-70dd8ba6dc02",
        "metadataVersion": 550,
        "serverVersion": 1598545480186,
        "type": "Setting"
    },
    {
        "key": "e3a1eac5-61e3-4e6d-b8b2-cbe9417a6ebd",
        "value": "1882",
        "label": " OTALANAWA metadata metadata metadata",
        "description": "Jurisdiction Metadata for  OTALANAWA metadata metadata id e3a1eac5-61e3-4e6d-b8b2-cbe9417a6ebd",
        "uuid": "ec8e6587-b394-4113-9bab-07b3f281a3da",
        "settingsId": "72",
        "settingIdentifier": "jurisdiction_metadata-target",
        "settingMetadataId": "6594",
        "v1Settings": false,
        "resolveSettings": false,
        "documentId": "31956568-1f96-4939-9129-70dd8ba6dc02",
        "metadataVersion": 551,
        "serverVersion": 1598545480186,
        "type": "Setting"
    },
...
]

You can get find a field named as metadataVersion in both the sample response. Please let me know if you need anything else. Thanks!

@Wambere
Copy link

Wambere commented Jun 11, 2021

@rehammuzzamil can you please share request url as well?

@rehammuzzamil
Copy link
Contributor

http://localhost:8080/opensrp/rest/v2/settings/?limit=2&metadata_version=1 @Wambere

  • You have to pass request parameters as metadata_version and limit
  • You will find metadataVersion field in the response

@rehammuzzamil rehammuzzamil changed the title Add limit and settingMetadataId parameters to the settings endpoint Add limit and metadata_version parameters to the settings v2 endpoint Jun 16, 2021
@Wambere
Copy link

Wambere commented Jun 21, 2021

Re-opening this issue as the endpoint does not work as expected

1. The response needs to be monotonic.
For example (1, 2, 3, 13, 14, 27) is acceptable, but (1, 3, 2, 27, 14) is not acceptable. Currently when a record is updated, both the serverVersion and the metadataVersion change, but the order of the records remains the same. So for example if you had records from 1 to 5, then you edit 1, it now becomes 6, but it is still at the top of the list.

2. Should all serverVersions be changing?
From previous comments it seems like when 1 record is updated in a settings group, then all the records in that group get a new serverVersion. This does not seem to be the case. You can checkout our test instance here with serverVersion param or here with metadata_version param . I updated 2 different records within groups and the serverVersion changed for only the updated records, and not the whole group

@Wambere Wambere reopened this Jun 21, 2021
@rehammuzzamil
Copy link
Contributor

Hi @Wambere, As per the initial investigation:

  1. I have replicated the scenario on my local, and as per my observation metadata_version field works correctly i.e. it is getting updated whenever a new record is added or updated. We might need to add an order by clause to ensure that records are returned in the order of ascending metadata_version.
  2. I don't have creds to access the test instance. But as per my observation whenever we create/update a record using settings v1, the same serverVersion is used against all the records of the same group (identifier). Can you please share the payload with us to further test this flow? @dubdabasoduba any thoughts on this?

@dubdabasoduba
Copy link
Member

Hi @Wambere, As per the initial investigation:

  1. I have replicated the scenario on my local, and as per my observation metadata_version field works correctly i.e. it is getting updated whenever a new record is added or updated. We might need to add an order by clause to ensure that records are returned in the order of ascending metadata_version.
  2. I don't have creds to access the test instance. But as per my observation whenever we create/update a record using settings v1, the same serverVersion is used against all the records of the same group (identifier). Can you please share the payload with us to further test this flow? @dubdabasoduba any thoughts on this?

@rehammuzzamil I am okay with adding the pagination. Just add it as something configurable since not all instances require the settings order by metadata_version

Could also test that the server_version is updated for all settings metadata when using the v2 endpoint.

@Wambere
Copy link

Wambere commented Jun 23, 2021

@rehammuzzamil I'm creating and updating by POSTing to https://beam-etl.labs.smartregister.org/opensrp/rest/settings/sync with payload

{
    "settingConfigurations": [
        {
            "_id": "{{insert settingsId}}",
            "identifier": "{{insert settingIdentifier}}",
            "settings": [
                {
                    "description": "{{insert description}}",
                    "key": "{{insert key}}",
                    "label": "{{insert label}}",
                    "uuid": "{{insert uuid}}",
                    "value": "{{insert value}}"
                }
            ],
            "type": "SettingConfiguration"
        }
    ]
}

As for ordering, if we could get order by both serverVersion and metadataVersion that would be great!

@rehammuzzamil
Copy link
Contributor

rehammuzzamil commented Jun 24, 2021

Re-opening this issue as the endpoint does not work as expected

1. The response needs to be monotonic.
For example (1, 2, 3, 13, 14, 27) is acceptable, but (1, 3, 2, 27, 14) is not acceptable. Currently when a record is updated, both the serverVersion and the metadataVersion change, but the order of the records remains the same. So for example if you had records from 1 to 5, then you edit 1, it now becomes 6, but it is still at the top of the list.

2. Should all serverVersions be changing?
From previous comments it seems like when 1 record is updated in a settings group, then all the records in that group get a new serverVersion. This does not seem to be the case. You can checkout our test instance here with serverVersion param or here with metadata_version param . I updated 2 different records within groups and the serverVersion changed for only the updated records, and not the whole group

I have tested the flow and figured out that whenever a single record is being updated using v1, the server version is updated against only a particular record.
Here is the sample response for V1:

[
    {
        "identifier": "test-identifier",
        "settings": [
            {
                "settingMetadataId": "7639",
                "serverVersion": 0,
                "description": "test-desc-3",
                "label": "test-label-3",
                "type": "Setting",
                "value": "test-value-3",
                "uuid": "31111-7e6c-5c16-8cb3-8d1e35d97f50-1",
                "metadataVersion": 1046,
                "key": "test-key-3",
                "settingIdentifier": "test-identifier"
            },
            ...
            {
                "settingMetadataId": "7637",
                "serverVersion": 0,
                "description": "test-desc-1",
                "label": "test-label-1",
                "type": "Setting",
                "value": "test-value-1-update-2ndtime",
                "uuid": "11111-7e6c-5c16-8cb3-8d1e35d97f50-1",
                "metadataVersion": 1049,
                "key": "test-key-1",
                "settingIdentifier": "test-identifier"
            }
        ],
        "serverVersion": 1598545480231,
        "_rev": "v1",
        "_id": "test-id",
        "type": "SettingConfiguration"
    }
]

I have observed that in the V1 response, we only get the last updated server_version in the payload, and serverVersion values are not being populated in the individual settings record (i.e. 0 in all the settings).
While if we try to get settings by v2, we get the correct server versions.

[
    {
        "key": "test-key-3",
        "value": "test-value-3",
        "label": "test-label-3",
        "description": "test-desc-3",
        "uuid": "31111-7e6c-5c16-8cb3-8d1e35d97f50-1",
        "settingsId": "111",
        "settingIdentifier": "test-identifier",
        "settingMetadataId": "7639",
        "v1Settings": false,
        "resolveSettings": false,
        "documentId": "test-id",
        "metadataVersion": 1046,
        "serverVersion": 1598545480230,
        "type": "Setting"
    },
    {
        "key": "test-key-4",
        "value": "test-value-4",
        "label": "test-label-4",
        "description": "test-desc-4",
        "uuid": "41111-7e6c-5c16-8cb3-8d1e35d97f50-1",
        "settingsId": "111",
        "settingIdentifier": "test-identifier",
        "settingMetadataId": "7640",
        "v1Settings": false,
        "resolveSettings": false,
        "documentId": "test-id",
        "metadataVersion": 1047,
        "serverVersion": 1598545480230,
        "type": "Setting"
    },
    ...,
    {
        "key": "test-key-1",
        "value": "test-value-1-update-2ndtime",
        "label": "test-label-1",
        "description": "test-desc-1",
        "uuid": "11111-7e6c-5c16-8cb3-8d1e35d97f50-1",
        "settingsId": "111",
        "settingIdentifier": "test-identifier",
        "settingMetadataId": "7637",
        "v1Settings": false,
        "resolveSettings": false,
        "documentId": "test-id",
        "metadataVersion": 1049,
        "serverVersion": 1598545480231,
        "type": "Setting"
    }
]

FYI @dubdabasoduba @bennsimon

@rehammuzzamil
Copy link
Contributor

Hi @Wambere,
Can you please confirm which end-point is being used by the ETL teamv1 or v2to fetch settings?
Also, do you hit end-point to get all the settings at once or pass an identifier param?
cc : @bennsimon

@Wambere
Copy link

Wambere commented Jul 9, 2021

@rehammuzzamil the ETL is using v2, and no, we do not get all settings at once. We pull in batches using the version and limit

@rehammuzzamil
Copy link
Contributor

rehammuzzamil commented Jul 11, 2021

As per discussion with @bennsimon we will be exposing a new end-point /getAll on SettingResource (v2).
The idea of exposing the new end-point getAll is because it has been the convention on all other Resources/Controller for ETL stuff. This new end-point will hold a custom implementation in which we won't be clubbing the settings metadata against a particular identifier, rather return all the records as is in a list in increasing order of metadata_version.

cc: @dubdabasoduba @samkanga @Wambere

@dubdabasoduba
Copy link
Member

As per discussion with @bennsimon we will be exposing a new end-point /getAll on SettingResource (v2).
The idea of exposing the new end-point getAll is because it has been the convention on all other Resources/Controller for ETL stuff. This new end-point will hold a custom implementation in which we won't be clubbing the settings metadata against a particular identifier, rather return all the records as is in a list in increasing order of metadata_version.

cc: @dubdabasoduba @samkanga @Wambere

Where is the scoping/issue for this endpoint?

@rehammuzzamil
Copy link
Contributor

As per discussion with @bennsimon we will be exposing a new end-point /getAll on SettingResource (v2).
The idea of exposing the new end-point getAll is because it has been the convention on all other Resources/Controller for ETL stuff. This new end-point will hold a custom implementation in which we won't be clubbing the settings metadata against a particular identifier, rather return all the records as is in a list in increasing order of metadata_version.
cc: @dubdabasoduba @samkanga @Wambere

Where is the scoping/issue for this endpoint?

Let me create a new one.

@dubdabasoduba
Copy link
Member

As per discussion with @bennsimon we will be exposing a new end-point /getAll on SettingResource (v2).
The idea of exposing the new end-point getAll is because it has been the convention on all other Resources/Controller for ETL stuff. This new end-point will hold a custom implementation in which we won't be clubbing the settings metadata against a particular identifier, rather return all the records as is in a list in increasing order of metadata_version.
cc: @dubdabasoduba @samkanga @Wambere

Where is the scoping/issue for this endpoint?

Let me create a new one.

@rehammuzzamil please create it as detailed as possible. I don't think get all is used for ETL purposes alone. It's nice to do the base work done on this getAll piece.

@rehammuzzamil
Copy link
Contributor

Issue for the new /getAll end-point can be tracked here.
@samkanga @dubdabasoduba

@samkanga samkanga closed this as completed Oct 6, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants