visage

The goal of visage is to provide a set of tools for running visual inference experiments on linear regression models.

Installation

You can install the development version from GitHub with:

# install.packages("remotes")
remotes::install_github("TengMCing/visage")
# remotes::install_url("https://github.com/TengMCing/visage/raw/master/built/visage_0.1.2.tar.gz")

library(visage)

Table of Contents

Visual inference cubic linear model class CUBIC_MODEL
Visual inference heteroskedasticity linear model class HETER_MODEL
Closed form expression CLOSED_FORM

1. Visual inference cubic linear model class `CUBIC_MODEL`

1.1. Define a model

A visual inference cubic linear model is defined as:

CUBIC_MODEL$formula
#> y ~ 1 + (2 - c) * x + c * z + a * (((2 - c) * x)^2 + (c * z)^2) + 
#>     b * (((2 - c) * x)^3 + (c * z)^3) + e

where x and z are random variables, a, b and c are parameters and e is the random error term.
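Written out as an equation, the formula above reads as follows. The distribution of e is not part of the formula itself; the normal error with mean 0 and standard deviation sigma is inferred from the printed model object in this section, so treat it as an assumption about the default.

$$
y = 1 + (2 - c)x + cz + a\left[((2 - c)x)^2 + (cz)^2\right] + b\left[((2 - c)x)^3 + (cz)^3\right] + e, \qquad e \sim N(0, \sigma^2)
$$

For example, with c = 1 the two predictors enter symmetrically and the model reduces to

$$
y = 1 + x + z + a\,(x^2 + z^2) + b\,(x^3 + z^3) + e.
$$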

cubic_model() is the cubic model class constructor. It takes the arguments a, b, c, sigma, x and z, where a, b and c are the numeric parameters defined above, sigma is the standard deviation of the error term e, and x and z are random variable instances created by the random variable abstract base class constructor rand_var(). To make x and z uniform random variables ranging from −1 to 1, use the uniform random variable class constructor rand_uniform(), which takes only two arguments: the lower bound and the upper bound of the support.

mod <- cubic_model(a = -3, b = -3, c = 1, sigma = 1,
                   x = rand_uniform(-1, 1), z = rand_uniform(-1, 1))
mod
#> 
#> ── <CUBIC_MODEL object>
#> y = 1 + (2 - c) * x + c * z + a * (((2 - c) * x)^2 + (c * z)^2) + b * (((2 - c) * x)^3 + (c * z)^3) + e
#>  - x: <RAND_UNIFORM object>
#>    [a: -1, b: 1]
#>  - z: <RAND_UNIFORM object>
#>    [a: -1, b: 1]
#>  - e: <RAND_NORMAL object>
#>    [mu: 0, sigma: 1]
#> Parameters:
#>  - a: -3
#>  - b: -3
#>  - c: 1
#>  - sigma: 1

1.2. Generate random values from the model

An instance of the cubic model class provides methods for simulating data and making residual plots. The method gen() returns a data frame containing realizations of x, z, y and e simulated from the model. The number of realizations is set by the integer argument n. In addition, a null model is fitted to the simulated data, and its residuals and fitted values are included in the returned data frame.

The null model is defined as:

CUBIC_MODEL$null_formula
#> y ~ x + z

Generate five realizations from the model:

mod$gen(n = 5)
#>             y          x          z          e      .resid    .fitted
#> 1  0.21516242  0.3044237  0.3515508 -0.5770481 -0.08375494  0.2989174
#> 2 -1.29530002 -0.2920189 -0.5882554 -0.8064607 -1.15503430 -0.1402657
#> 3 -0.05679982  0.6047610 -0.3570273  0.7020988 -0.46659273  0.4097929
#> 4  1.49700745  0.1992123 -0.7025260  1.5835437  1.35844735  0.1385601
#> 5 -0.03929408 -0.8653997  0.3970407  0.3921720  0.34693464 -0.3862287
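To make the steps behind gen() concrete, here is a minimal base-R sketch that simulates from the cubic formula by hand and then fits the null model y ~ x + z. It is not the visage implementation: sim_cubic() is a hypothetical helper, and the uniform predictors and normal error are assumptions taken from the model object printed in 1.1.

```r
# Base-R sketch of what a call like mod$gen(n = 5) conceptually does.
# NOT the visage implementation; sim_cubic() is a hypothetical helper.
sim_cubic <- function(n, a, b, c, sigma) {
  x <- runif(n, -1, 1)                  # assumed: x ~ Uniform(-1, 1)
  z <- runif(n, -1, 1)                  # assumed: z ~ Uniform(-1, 1)
  e <- rnorm(n, mean = 0, sd = sigma)   # assumed: e ~ Normal(0, sigma)
  u <- (2 - c) * x
  v <- c * z
  y <- 1 + u + v + a * (u^2 + v^2) + b * (u^3 + v^3) + e
  dat <- data.frame(y = y, x = x, z = z, e = e)
  # Fit the null model and attach its residuals and fitted values.
  null_fit <- lm(y ~ x + z, data = dat)
  dat$.resid <- unname(resid(null_fit))
  dat$.fitted <- unname(fitted(null_fit))
  dat
}

set.seed(1)
sim_dat <- sim_cubic(n = 5, a = -3, b = -3, c = 1, sigma = 1)
sim_dat
```

As in the output above, y equals .fitted + .resid in every row, because the residuals come from a least-squares fit to the same y.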

1.3. Make a residual plot

The method plot() produces a ggplot object. It takes a data frame containing the columns .resid and .fitted as input, along with a character argument type indicating the type of the data plot, and other aesthetic arguments such as size and alpha to control the appearance of the plot.

mod$plot(mod$gen(n = 100), type = "resid", size = 1)

1.4. Make a lineup

A lineup is a matrix of residual plots, which can be produced with the methods gen_lineup() and plot_lineup(). The method gen_lineup() takes the number of realizations n and the number of plots in a lineup k as inputs. Apart from taking the output of gen_lineup() as its data, the method plot_lineup() has the same user interface as plot().

Generate a lineup consisting of 2 plots with 8 realizations each:

mod$gen_lineup(n = 8, k = 2)
#>              y           x          z           e      .resid    .fitted
#> 1   1.13201691  0.25774667  0.1727720  0.05718951  1.00603409  0.1259828
#> 2  -0.84045651  0.17447372 -0.5102047 -1.01497463 -1.17885168  0.3383952
#> 3   1.34391686 -0.05335108  0.2781568  0.42387216  1.76671435 -0.4227975
#> 4  -0.89775710 -0.77687642 -0.9795308  0.32155949  0.04210861 -0.9398657
#> 5  -1.41602383 -0.94138921 -0.4533301 -0.52844407  0.05173736 -1.4677612
#> 6  -3.02800982 -0.82579279  0.8822429  0.66707884 -1.06961706 -1.9583928
#> 7  -1.04956274 -0.86092082 -0.2687812 -0.45213360  0.38326119 -1.4328239
#> 8  -0.59102309  0.43067623  0.1551025 -1.29734353 -1.00138686  0.4103638
#> 9   1.07943579  0.25774667  0.1727720  0.05718951  0.95345297  0.1259828
#> 10  0.82225382  0.17447372 -0.5102047 -1.01497463  0.48385866  0.3383952
#> 11 -0.29410918 -0.05335108  0.2781568  0.42387216  0.12868830 -0.4227975
#> 12 -0.09230039 -0.77687642 -0.9795308  0.32155949  0.84756532 -0.9398657
#> 13 -1.35103053 -0.94138921 -0.4533301 -0.52844407  0.11673066 -1.4677612
#> 14 -1.24032172 -0.82579279  0.8822429  0.66707884  0.71807104 -1.9583928
#> 15 -3.31964345 -0.86092082 -0.2687812 -0.45213360 -1.88681952 -1.4328239
#> 16 -0.95118366  0.43067623  0.1551025 -1.29734353 -1.36154743  0.4103638
#>    test_name statistic   p_value k  null
#> 1     F-test   16.0826 0.1846342 1 FALSE
#> 2     F-test   16.0826 0.1846342 1 FALSE
#> 3     F-test   16.0826 0.1846342 1 FALSE
#> 4     F-test   16.0826 0.1846342 1 FALSE
#> 5     F-test   16.0826 0.1846342 1 FALSE
#> 6     F-test   16.0826 0.1846342 1 FALSE
#> 7     F-test   16.0826 0.1846342 1 FALSE
#> 8     F-test   16.0826 0.1846342 1 FALSE
#> 9     F-test   17.7532 0.1759428 2  TRUE
#> 10    F-test   17.7532 0.1759428 2  TRUE
#> 11    F-test   17.7532 0.1759428 2  TRUE
#> 12    F-test   17.7532 0.1759428 2  TRUE
#> 13    F-test   17.7532 0.1759428 2  TRUE
#> 14    F-test   17.7532 0.1759428 2  TRUE
#> 15    F-test   17.7532 0.1759428 2  TRUE
#> 16    F-test   17.7532 0.1759428 2  TRUE
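In the output above, the rows with null = TRUE share x, z, e and .fitted with the data plot and differ only in .resid and y. The base-R sketch below builds a data set with the same layout. How visage draws its null residuals is not stated here; the sketch assumes residual rotation (regress pure noise on the same predictors and rescale), which is one common choice and may differ from what gen_lineup() does.

```r
# Base-R sketch of a lineup data set: one data plot plus k - 1 null plots.
# Assumption: null residuals come from residual rotation; gen_lineup() may differ.
set.seed(2)
n <- 8
k <- 2

# Any data with a fitted null model will do; here a small cubic-type example.
x <- runif(n, -1, 1)
z <- runif(n, -1, 1)
y <- 1 + x + z - 3 * (x^2 + z^2) - 3 * (x^3 + z^3) + rnorm(n)
null_fit <- lm(y ~ x + z)
obs_resid <- unname(resid(null_fit))
fit_vals <- unname(fitted(null_fit))

# Residual rotation: regress pure noise on the same predictors, then rescale
# the residuals so their sum of squares matches the observed one.
rotate_resid <- function() {
  r <- unname(resid(lm(rnorm(n) ~ x + z)))
  r * sqrt(sum(obs_resid^2) / sum(r^2))
}

lineup <- do.call(rbind, lapply(seq_len(k), function(i) {
  is_null <- i > 1                      # plot 1 holds the observed residuals
  r <- if (is_null) rotate_resid() else obs_resid
  data.frame(y = fit_vals + r, x = x, z = z,
             .resid = r, .fitted = fit_vals, k = i, null = is_null)
}))
lineup
```

Keeping .fitted fixed and replacing only the residuals means every plot in the lineup shares the same horizontal axis, so the plots differ only in their residual pattern.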

Plot a lineup consisting of 20 plots with 300 realizations each:

mod$plot_lineup(mod$gen_lineup(n = 300, k = 20), type = "resid", alpha = 0.6)

2. Visual inference heteroskedasticity linear model class `HETER_MODEL`

A visual inference heteroskedasticity linear model is defined as:

HETER_MODEL$formula
#> y ~ 1 + x + sqrt(1 + (2 - abs(a)) * (x - a)^2 * b) * e

where x is a random variable, a and b are parameters and e is the random error term.
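The square-root factor scales the error term, so the error variance changes with x. Assuming e is independent of x with mean 0 and variance \sigma^2 (the model object printed in this section shows a standard normal e), the conditional variance of y is:

$$
\operatorname{Var}(y \mid x) = \left[1 + (2 - |a|)(x - a)^2\, b\right]\sigma^2
$$

For example, with a = 0, b = 16 and \sigma = 1 this becomes 1 + 32x^2, which grows from 1 at x = 0 to 33 at x = \pm 1.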

Similar to the cubic model, the heteroskedasticity model can be built with the heteroskedasticity model class constructor heter_model(). This function takes three arguments: a, b and x. a and b are the numeric parameters defined above, and x must be a random variable object.

mod <- heter_model(a = 0, b = 16, x = rand_uniform(-1, 1))

mod
#> 
#> ── <HETER_MODEL object>
#> y = 1 + x + sqrt(1 + (2 - abs(a)) * (x - a)^2 * b) * e
#>  - x: <RAND_UNIFORM object>
#>    [a: -1, b: 1]
#>  - e: <RAND_NORMAL object>
#>    [mu: 0, sigma: 1]
#> Parameters:
#>  - a: 0
#>  - b: 16

Since both the cubic model class CUBIC_MODEL and the heteroskedasticity model class HETER_MODEL inherit from the visual inference model class VI_MODEL, a heteroskedasticity model object can be used in much the same way as a cubic model object. The following code gives examples of its use.

mod$gen(n = 5)
#>           y          x          e     .resid  .fitted
#> 1  1.646463  0.2717599  0.2043167 -0.5431367 2.189600
#> 2  5.049721 -0.2974273  2.2210523  3.4253827 1.624338
#> 3  2.606823  0.4016645  0.4854659  0.2882150 2.318608
#> 4  1.211346 -0.2691949  0.2637745 -0.4410298 1.652376
#> 5 -1.148795 -0.3414325 -0.8309877 -2.7294312 1.580636
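For comparison, the same kind of data can be simulated by hand in base R. This is a sketch rather than the visage implementation: sim_heter() is a hypothetical helper, and the null model y ~ x is an assumption (it is consistent with the fitted values above being linear in x).

```r
# Base-R sketch of simulating from the heteroskedasticity model by hand.
# NOT the visage implementation; sim_heter() is a hypothetical helper and
# the null model y ~ x is an assumption.
sim_heter <- function(n, a, b) {
  x <- runif(n, -1, 1)                  # assumed: x ~ Uniform(-1, 1)
  e <- rnorm(n)                         # assumed: standard normal error
  y <- 1 + x + sqrt(1 + (2 - abs(a)) * (x - a)^2 * b) * e
  dat <- data.frame(y = y, x = x, e = e)
  null_fit <- lm(y ~ x, data = dat)
  dat$.resid <- unname(resid(null_fit))
  dat$.fitted <- unname(fitted(null_fit))
  dat
}

set.seed(3)
heter_dat <- sim_heter(n = 300, a = 0, b = 16)

# With a = 0 the error scale grows with |x|, so residuals far from x = 0
# should be more spread out than residuals close to x = 0.
sd_outer <- sd(heter_dat$.resid[abs(heter_dat$x) > 0.5])
sd_inner <- sd(heter_dat$.resid[abs(heter_dat$x) <= 0.5])
c(outer = sd_outer, inner = sd_inner)
```

The difference in spread between the two groups is the pattern a residual plot of this model is meant to reveal.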

mod$plot(mod$gen(n = 300), type = "resid", size = 1)

mod$plot_lineup(mod$gen_lineup(n = 300), alpha = 0.6)

3. Closed form expression `CLOSED_FORM`

3.1. Define and evaluate a closed form expression

A “closed form expression” is defined as a special type of R expression in which random variables can be replaced with their realizations before evaluation. This may sound confusing, so let’s look at some examples.

closed_form() is the closed form expression constructor. It takes a single argument, expr, which must be a formula. Consider the simplest case first, where the expression contains no random variables, such as 1 + 3. It can be evaluated directly with the method compute(), like a typical R expression.

cf <- closed_form(~1 + 3)
cf
#> 
#> ── <CLOSED_FORM object>
#> EXPR = 1 + 3
cf$compute()
#> [1] 4

However, if the closed form expression contains any random variables, it cannot be evaluated directly, since the values of the random variables are unknown. In the following example, x is a uniform random variable ranging from -1 to 1, and 1 + x is an expression that cannot be evaluated by the method compute().

x <- rand_uniform(-1, 1)
cf <- closed_form(~1 + x)

In this case, the gen() method generates n realizations from the random variables and then evaluates the expression. In addition, if rhs_val = TRUE, the realizations generated from the random variables are returned as well.

cf$gen(n = 5, rhs_val = TRUE)
#> $lhs
#> [1] 1.5457301 0.2943149 0.3015830 0.9379620 1.2245546
#> 
#> $rhs
#> $rhs$x
#> [1]  0.54573012 -0.70568513 -0.69841700 -0.06203798  0.22455463
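The idea of replacing a random variable with its realizations before evaluation can be mirrored in base R with eval(). The sketch below is only an analogy and not how CLOSED_FORM is implemented; the names f, x_val and lhs are illustrative.

```r
# Base-R sketch of "replace random variables with realizations, then evaluate".
# NOT the visage implementation; it only mirrors the idea with eval().
set.seed(4)
f <- ~1 + x                             # one-sided formula, as passed to closed_form()
x_val <- runif(5, min = -1, max = 1)    # realizations of x
lhs <- eval(f[[2]], list(x = x_val))    # f[[2]] is the unevaluated expression 1 + x
list(lhs = lhs, rhs = list(x = x_val))
```

The returned list has the same shape as the gen() output above: the evaluated expression under lhs and the realizations that produced it under rhs.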

3.2. Nested closed form expression

A nested closed form expression is useful when a hierarchical structure is present in the expression.

Consider the following statistical model:

y = x + e, where x is a uniform random variable ranging from -1 to 1, and e = 1 + x^2 * z, where z is a standard normal random variable.

We can define this model using the following code:

z <- rand_normal(mu = 0, sigma = 1)
x <- rand_uniform(-1, 1)
e <- closed_form(~1 + x^2 * z)
y <- closed_form(~x + e)
y
#> 
#> ── <CLOSED_FORM object>
#> EXPR = x + e
#>  - x: <RAND_UNIFORM object>
#>    [a: -1, b: 1]
#>  - e: <CLOSED_FORM object>
#>    EXPR = 1 + x^2 * z
#>     - x: <RAND_UNIFORM object>
#>       [a: -1, b: 1]
#>     - z: <RAND_NORMAL object>
#>       [mu: 0, sigma: 1]

Note that y depends on e and e itself is a closed form expression. Both y and e depend on x. The method gen() will return values of x, z and e if rhs_val = TRUE.

y$gen(n = 5, rhs_val = TRUE)
#> $lhs
#> [1] 1.1474987 1.4635396 0.7027436 1.1855472 0.8148636
#> 
#> $rhs
#> $rhs$x
#> [1]  0.1612635  0.4645366 -0.2404316  0.1408382 -0.2830936
#> 
#> $rhs$z
#> [1] -0.529294784 -0.004620252 -0.983001235  2.254000284  1.222295999
#> 
#> $rhs$e
#> [1] 0.9862352 0.9990030 0.9431753 1.0447090 1.0979572
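The output above is consistent with a single draw of x being shared by y and e: for the first realization, 1 + 0.1612635^2 * (-0.529294784) ≈ 0.9862352, and 0.1612635 + 0.9862352 = 1.1474987. A base-R sketch of the same hierarchy, written by hand rather than with CLOSED_FORM objects, looks like this:

```r
# Base-R sketch of the nested model y = x + e with e = 1 + x^2 * z.
# NOT the visage implementation; each variable is drawn once and reused.
set.seed(5)
n <- 5
x <- runif(n, -1, 1)    # shared by e and y
z <- rnorm(n)           # standard normal
e <- 1 + x^2 * z        # inner expression
y <- x + e              # outer expression reuses the same x
list(lhs = y, rhs = list(x = x, z = z, e = e))
```

Drawing x once and reusing it is what makes the structure hierarchical; drawing a fresh x for e would define a different model.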

3.3. Turn gen() result into dataframe via method as_dataframe()

The result of gen() is either a vector or a list, which can be converted into a data frame via the method as_dataframe(). The lhs argument names the column holding the left-hand side result of the closed form expression.

y$as_dataframe(y$gen(n = 5, rhs_val = TRUE), lhs = "y")
#>            y          x          z         e
#> 1  0.6161044 -0.2916790 -1.0839234 0.9077834
#> 2  1.2954322  0.2510535  0.7041138 1.0443788
#> 3  1.2268052 -0.8636601  1.4619284 2.0904654
#> 4 -0.4685280 -0.8773207 -0.7681094 0.4087927
#> 5  1.2053441  0.1708320  1.1825894 1.0345122
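The conversion amounts to flattening the nested list into columns, with the left-hand side first. The sketch below shows one way to do this in base R; to_df() is a hypothetical helper and not the as_dataframe() method itself.

```r
# Base-R sketch of flattening a gen()-style result into a data frame.
# NOT the visage implementation; to_df() is a hypothetical helper.
to_df <- function(res, lhs = "lhs") {
  # A bare vector is treated as a result with no right-hand side values.
  if (!is.list(res)) res <- list(lhs = res, rhs = list())
  out <- c(list(res$lhs), res$rhs)      # left-hand side first, then each variable
  names(out)[1] <- lhs
  as.data.frame(out)
}

# Toy values that satisfy e = 1 + x^2 * z and y = x + e.
x <- c(0.5, -0.5)
z <- c(1, 2)
e <- 1 + x^2 * z
res <- list(lhs = x + e, rhs = list(x = x, z = z, e = e))
df <- to_df(res, lhs = "y")
df
```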

3.4. Current limitation of CLOSED_FORM

For closed_form(), any simple expression can be provided, as long as all the symbols exist in the current environment. Note that this function tries to evaluate ALL the atomic symbols in the expression during initialization and stores the values in the object. Hence, a call like a$b is decomposed into $, a and b, and b is interpreted as a variable b that must exist in the current environment. Therefore, use ~a[["b"]] instead of ~a$b. Similarly, define functions beforehand, for example myfun = function() 1, and then use them in the expression as ~myfun().
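The decomposition can be inspected with base R's all.names(), which lists every symbol in an expression. Whether closed_form() uses all.names() internally is not stated here, so this is only an illustration of how R parses the two forms.

```r
# How R decomposes the two ways of accessing an element.
# Illustration only; closed_form() may collect symbols differently.
dollar_syms <- all.names(quote(a$b))
dollar_syms
#> [1] "$" "a" "b"

bracket_syms <- all.names(quote(a[["b"]]))
bracket_syms
#> [1] "[[" "a"

# A function call contributes the function name as a symbol,
# which is why myfun must be defined before the expression is created.
call_syms <- all.names(quote(myfun()))
call_syms
#> [1] "myfun"
```

In a$b the element name b is a symbol, whereas in a[["b"]] it is a character constant, so only the first form asks for a variable called b.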
