Commit
Change documents_limited deduplication
The documents_limited decorator appropriately splits queries that exceed the limit of 100 documents (as per "Transparency Platform RESTful API - user guide"). These splits occur before the data are tabulated, and aligning the results afterwards is not straightforward. This commit changes duplicate removal from dropping rows with a duplicated index to grouping rows by index and picking the last valid value for each column within each group. This is not an ideal solution, but it seems to work for the issues at hand. There are two caveats. First, if the API itself returns duplicated indices, they are dropped without the user being told. Second, the splitting and concatenation should arguably both happen before tabulation, which would make more sense and be more efficient.
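The difference between the two deduplication strategies can be illustrated with a small pandas sketch. This is not the code from the commit: the frames `chunk_a` and `chunk_b`, the column names, and the use of `keep="first"` for the old behaviour are assumptions made for illustration only. It shows two tabulated chunks that share one timestamp but carry values in different columns.

```python
import pandas as pd

# Hypothetical tabulated results of two split queries. The timestamp
# 01:00 appears in both chunks, each time with a different column filled.
idx = pd.to_datetime(
    ["2020-01-01 00:00", "2020-01-01 01:00", "2020-01-01 01:00", "2020-01-01 02:00"]
)
chunk_a = pd.DataFrame({"A": [1.0, 2.0]}, index=idx[:2])
chunk_b = pd.DataFrame({"B": [20.0, 30.0]}, index=idx[2:])

# Concatenating leaves two rows for 01:00, each half empty.
combined = pd.concat([chunk_a, chunk_b])

# Assumed old approach: drop rows whose index was already seen.
# The row kept for 01:00 has A=2.0 but loses B=20.0.
by_index = combined[~combined.index.duplicated(keep="first")]

# Approach described in the commit message: group by index and take the
# last valid (non-null) value of each column within each group.
merged = combined.groupby(level=0).last()

print(by_index)
print(merged)
```

Note that `GroupBy.last` skips missing values column by column, which is what merges the two half-empty rows into one. It also means that genuinely conflicting duplicates returned by the API would be collapsed silently, which is the first caveat mentioned above.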