impute_lr in combination with validator #2

Open
smartie5 opened this issue Apr 2, 2019 · 9 comments
@smartie5

smartie5 commented Apr 2, 2019

I have some questions concerning the use of deductive in combination with validate. Here is what I tried:
library(validate)     # version 0.2.6
library(deductive)    # version 0.1.2
dat <- data.frame(a=NA, b=NA, c=5, d=5)
rules <- validator(var_group(a,b,c,d) >= 0, a+b+c==d)
impute_lr(dat, rules)

Here the NAs remain NA. Using the following instead of rules gives the result I expected:
rules2 <- validator(a >= 0, b >= 0, c >= 0, d >= 0, a+b+c==d)
impute_lr(dat, rules2)
Why doesn't the validator with var_group work the same way?

Imputing this dataframe gives an undesired result as well:
dat2 <- data.frame(a=NA, b=NA, c=10, d=9)
impute_lr(dat2, rules2)
In my opinion, leaving the NAs in this case would be better than imputing values that fail the edit rules.

In my case it would also help to be able to use sum() inside validator(), for example:
rules3 <- validator(a >= 0, b >= 0, c >= 0, d >= 0, sum(a, b, c, na.rm = TRUE) == d)

@markvanderloo
Member

Thanks for the report. I have confirmed this, and it must be a bug.

> dat <- data.frame(a=NA, b=NA, c=5, d=5) 
> rules <- validator(var_group(a,b,c,d) >= 0, a+b+c==d)
> impute_lr(dat, rules)
   a  b c d
1 NA NA 5 5
> rules2 <- validator(a >= 0, b >= 0, c >= 0, d >= 0, a+b+c==d)
> impute_lr(dat, rules2)
  a b c d
1 0 0 5 5
> dat2 <- data.frame(a=NA, b=NA, c=10, d=9)
> impute_lr(dat2, rules2)
   a  b  c d
1 -1 -1 10 9
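
One way to check whether an imputed record actually satisfies the rule set is to confront the result with the same rules. The following is a minimal sketch, assuming the validate and deductive packages are installed. The object names imputed and cf are illustrative, and the output depends on the installed package versions, so none is shown.

```r
library(validate)
library(deductive)

rules2 <- validator(a >= 0, b >= 0, c >= 0, d >= 0, a + b + c == d)
dat2   <- data.frame(a = NA, b = NA, c = 10, d = 9)

# Impute whatever can be derived from the linear rules
imputed <- impute_lr(dat2, rules2)

# Confront the imputed record with the same rules; summary() reports
# the number of passes, fails and NAs for each rule
cf <- confront(imputed, rules2)
summary(cf)

# values() returns one logical per record and rule
values(cf)
```

A rule that shows a fail in the summary indicates that the imputed record is inconsistent with that rule.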

@markvanderloo
Member

I can now answer this question: "Why isn't the validator with var_group working the same way?"

There's a bit of a dilemma here in how to 'count' the number of rules. Given

rules <- validator(var_group(a,b,c,d)>=0, a + b + c ==d)

We can ask validate to tell us which of these rules are linear. Currently this functionality returns a logical vector of length length(rules), so the first rule (the one using var_group) is counted as one rule, even though it stands for a separate rule on each of a, b, c and d.

This can be worked around in impute_lr with not too much trouble. Will fix.
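
The counting question can be looked at by comparing the number of rules validate reports for the var_group form with the number for the written-out form. This is only a sketch, assuming the validate package is installed. The name rules_expanded is illustrative, and the count for the var_group form may differ between package versions, so no output is shown.

```r
library(validate)

# Rule set using var_group, as in the report
rules <- validator(var_group(a, b, c, d) >= 0, a + b + c == d)

# The same restrictions written out one by one
rules_expanded <- validator(a >= 0, b >= 0, c >= 0, d >= 0, a + b + c == d)

# Number of rules validate counts in each set
length(rules)
length(rules_expanded)
```

If the two counts differ, any code that indexes rules by position has to take the expansion of var_group into account.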

@markvanderloo
Member

markvanderloo commented Apr 10, 2019

Fixed both issues. Will push an update to CRAN soon.

@smartie5
Author

Thank you so far. Unfortunately, there is still a problem: I still get values that fail the edit rules.

> dat3 <- data.frame(a=NA, b=1, c=10, d=9)
> rules <- validator(var_group(a,b,c,d) >= 0, a+b+c==d)
> impute_lr(dat3, rules)
   a b  c d
1 -2 1 10 9

And I have found another scenario which is unclear to me:

> dat4 <- data.frame(a=NA, b=NA, c=5, d=5, y=10, z=9)
> rules4 <- validator(var_group(a,b,c,d,y,z) >= 0, a+b+c==d, y==z)
> impute_lr(dat4, rules4)
  a  b c d  y z
1 NA NA 5 5 10 9
>
> dat5 <- data.frame(a=NA, c=5, d=5, y=10, z=9)
> rules5 <- validator(var_group(a,c,d,y,z) >= 0, a+c==d, y==z)
> impute_lr(dat5, rules5)
  a c d  y z
1 0 5 5 10 9
>
> dat6 <- data.frame(a=NA, b=NA, c=5, d=5, y=10, z=10)
> impute_lr(dat6, rules4)
  a b c d  y  z
1 0 0 5 5 10 10

No imputation is done for dat4, but dat5 and dat6 are imputed.

markvanderloo reopened this Apr 18, 2019
@markvanderloo
Member

Thanks! That is another issue.

@markvanderloo
Member

I think your question may be related to this rspa issue.

If you perform error localization and remove the assigned values (using e.g. errorlocate::replace_errors()) before deductive imputation, no inconsistent values will be imputed.
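
A minimal sketch of that two-step workflow, assuming the validate, errorlocate and deductive packages are installed. The rules are written out without var_group, the missing values are declared as NA_real_ so that the columns are numeric, and the names cleaned and imputed are illustrative. Which fields errorlocate chooses to blank out depends on its solver, so no output is shown.

```r
library(validate)
library(errorlocate)
library(deductive)

rules <- validator(a >= 0, b >= 0, c >= 0, d >= 0, a + b + c == d)

# c = 10 and d = 9 cannot both be correct when a and b must be >= 0
dat <- data.frame(a = NA_real_, b = NA_real_, c = 10, d = 9)

# Step 1: locate a minimal set of fields that must change for the record
# to satisfy the rules, and replace those fields with NA
cleaned <- replace_errors(dat, rules)

# Step 2: deductive imputation on the cleaned record; only values that
# follow uniquely from the rules and the remaining data are filled in
imputed <- impute_lr(cleaned, rules)
imputed
```

Fields that cannot be derived uniquely stay NA after the second step and have to be handled by another imputation method.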

@smartie5
Author

smartie5 commented Jul 4, 2019

I get error messages when using impute_lr with a validator that contains if conditions: three different errors for three different rule sets.

library(validate)     # version 0.2.6
library(deductive)    # version 0.1.3

df <- data.frame(a=0, b=NA)

rules1 <- validator(if(a==0) b==0)
rules2 <- validator(a>=0, if(a==0) b==0)
rules3 <- validator(a>=0, b>=0, if(a==0) b==0)

impute_lr(df, rules1)
# Error in operatorsign[operators] : invalid subscript type 'list'
impute_lr(df, rules2)
# Error in apply(x, 2, impute_range_x, A = A, b = b, neq = neq, nleq = nleq,  :  dim(X) must have a positive length
impute_lr(df, rules3)
# Error in svd(A) : a dimension is zero

@markvanderloo
Member

Thanks for the report. Will fix!

@smartie5
Author

Hello!
This is the next use case for me. Do you plan to update this package in the near future? That would be great.
