Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prepare 0.1.0 release #3

Open
seabbs opened this issue Sep 5, 2023 · 10 comments
Open

Prepare 0.1.0 release #3

seabbs opened this issue Sep 5, 2023 · 10 comments

Comments

@seabbs
Copy link
Contributor

seabbs commented Sep 5, 2023

We have had a fair few months with no progress on this (for understandable reasons). Shall we setup a meeting with those interested in contributing and hash out some issues that we can PR against to at least get an MVP version that can replace the functionality in epinowcast.

@pearsonca
Copy link
Collaborator

pearsonca commented Dec 1, 2023

@seabbs could use a second set of eyes on https://github.com/epinowcast/coerceDT/blob/main/R/coerceDT.R - thoughts on the interface offered?

Not quite done w/ implement + test for it, but am thinking:

  • coerceDT = select, drop arguments mirroring fread, with a minor extension of coercion
  • checkDT = required, forbidden as previous
  • makeDT = coerceDT => checkDT combined

@pearsonca
Copy link
Collaborator

@seabbs I think coerceDT + tests is basically done, moving on to checkDT.

@pearsonca pearsonca changed the title Moving development forward Prepare 0.1.0 release Dec 3, 2023
@pearsonca
Copy link
Collaborator

@seabbs this is now approaching 0.1.0 release readiness.

How do we want to check that it works how we want it to work? Propose an epinowcast & scoringutils branch => introduce this as a dependency => modify those library codebases (likely also this library code to do more of what we want for them) => iterate until we're happy with what this does for those libraries.

Related: there's definitely some documentation issues to deal with - I think that's best dealt with by another set of eyes, at least to identify the WTF bits. Useful candidates for that?

@pearsonca
Copy link
Collaborator

pearsonca commented Dec 4, 2023

Tagging @nikosbosse - I understand you've got an internal helper function like this in scoringutils. If it potentially makes sense to substitute this in, please advise (with the helper function name(s)) and I'll have a look.

@nikosbosse
Copy link

nikosbosse commented Dec 4, 2023

The version in scoringutils is merely this:

scoringutils:::ensure_data.table
function(data) {
  if (is.data.table(data)) {
    data <- copy(data)
  } else {
    data <- as.data.table(data)
  }
  return(data)
}

Maybe there is room for more sophistication :)

Regarding the version you linked to: It seems quite complicated to me and the function does a lot of different things (selecting / dropping columns, reading in data, conversion to data.table, selecting whether or not that happens by reference...). For me as a newbie user I think it would be helpful to separate these things a bit more.

Also since we're currently reworking scoringutils in terms of S3 methods I now see them everywhere... But having coerceDT be a generic seems like it could be a good idea.

@pearsonca
Copy link
Collaborator

The version in scoringutils is merely this:

scoringutils:::ensure_data.table
function(data) {
  if (is.data.table(data)) {
    data <- copy(data)
  } else {
    data <- as.data.table(data)
  }
  return(data)
}

Maybe there is room for more sophistication :)

Indeed! In terms of just replacing that function, that would literally just be: ensure_data.table(data) => coerceDT(data) (though now data could be a file path or any other string fread just handles).

If you have places internally in scoringutils where you don't need to copy, those would become coerceDT(data, copy = FALSE).

@pearsonca
Copy link
Collaborator

I did fiddle a bit with S3 methods - found it easier to just handle dispatch based on type in the internal function logic.

@pearsonca
Copy link
Collaborator

pearsonca commented Dec 4, 2023

@nikosbosse had a quick peek at scoringutils - for example check_forecasts might become something like

check_forecasts <- function(data) {

  # create lists to store results ----------------------------------------------
  out <- list()
  warnings <- list()
  errors <- list()
  messages <- list()

  data <- makeDT(data, require = c("true_value", "prediction"), forbid = available_metrics())

  # ... other more complex checks, starting from the bit looking at models

@nikosbosse
Copy link

The helper function, ensure_data.table linked above is currently on the develop branch - the current version does the checks for required columns independently of that, but in principle it could also be a single function (there is currently an internal function check_columns_present())

@pearsonca
Copy link
Collaborator

@seabbs per discussion re nomenclature - how about asDT?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants