Results Visualization

punkcpp edited this page Sep 1, 2016 · 12 revisions

plot_utils: source files for generating plots (R with ggplot2 is required).

Explanation of Result Files Generated

  1. Only AllResults.txt and curve_c(50).txt are used for plotting.

  2. AllResults.txt:
    The first line, "Hyper-graph time: 1396278", means that building the hyper-graph took 1396278 ms.

    B=10
    IM: 45023.54 6252.5048907138 1403641.2 0.611411976377969
    UC: 46404.13 8963.0942677794 1571590.8
    CD: 49944.56 7517.24278751192 1585022.4

    The lines above list the results of each algorithm for budget B=10. On each line, the first and second numbers are the influence spread and its standard deviation, both estimated by Monte Carlo simulation. The third number is the running time of the algorithm (in ms). For IM, the fourth number is the approximation lower bound.

  3. B=10.txt:
    "IM 1403641.2 10" means that the influence maximization algorithm ran for 1403641.2 ms and returned 10 users who should be assigned non-zero discounts. Each of the following 10 lines has the form "25084 1 1", which means that user 25084 should be assigned a discount of 100% (the second number) and that the probability of user 25084 becoming a seed is 1 (the third number).

  4. curve_c(50).txt:
    "0.05 55516.4029581446" means when the unified discount is 0.05, the best influence spread returned by the Unified Discount algorithm is 55516.4029581446.
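
For readers who want to inspect AllResults.txt outside of R, the layout described above can be read with a few lines of Python. This is only a minimal sketch and not part of plot_utils: the function name parse_all_results is hypothetical, and it assumes the file repeats one "B=<budget>" header per integer budget, each followed by lines shaped like the sample shown in item 2.

```python
def parse_all_results(text):
    """Parse the text of AllResults.txt into a nested dict.

    Assumed layout (inferred from the sample in this page, not from the
    program source): a "Hyper-graph time: <ms>" line, then one
    "B=<budget>" header per budget, each followed by lines of the form
    "<ALG>: <spread> <std> <time_ms> [<lower_bound>]".
    """
    results = {"hypergraph_ms": None, "budgets": {}}
    budget = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("Hyper-graph time:"):
            results["hypergraph_ms"] = float(line.split(":", 1)[1])
        elif line.startswith("B="):
            budget = int(line[2:])  # assumption: budgets are integers
            results["budgets"][budget] = {}
        elif ":" in line and budget is not None:
            name, values = line.split(":", 1)
            nums = [float(v) for v in values.split()]
            if len(nums) < 3:
                continue  # not an algorithm result line
            results["budgets"][budget][name.strip()] = {
                "spread": nums[0],
                "std": nums[1],
                "time_ms": nums[2],
                # Only IM reports an approximation lower bound.
                "lower_bound": nums[3] if len(nums) > 3 else None,
            }
    return results


# The sample lines quoted in item 2 above.
SAMPLE = """Hyper-graph time: 1396278
B=10
IM: 45023.54 6252.5048907138 1403641.2 0.611411976377969
UC: 46404.13 8963.0942677794 1571590.8
CD: 49944.56 7517.24278751192 1585022.4
"""

parsed = parse_all_results(SAMPLE)
print(parsed["budgets"][10]["IM"])
```

To read a real file, pass its contents instead of SAMPLE, for example open("AllResults.txt").read().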

How to Plot

The name of each .R file indicates which of the plots used in the paper it generates.

Basically,

  1. Set the location variable in the .R file to the location of the result data.

  2. Run the .R file (RStudio is recommended), and follow the prompts in the console.

  3. Sometimes you need to adjust the y-axis scale to get a better-rendered plot:

    scale_y_continuous(limits=c(lower_limit, upper_limit))
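
One way to pick lower_limit and upper_limit is to derive them from the data, so that every mean plus or minus one standard deviation stays inside the plot. The sketch below is only an illustration in Python and is not part of plot_utils: the helper y_limits and its 5% padding are assumptions, and it presumes you want the full spread-plus-deviation range to be visible.

```python
def y_limits(means, stds, pad=0.05):
    """Return (lower_limit, upper_limit) covering mean +/- std with padding.

    `pad` is the fraction of the data range added on each side; the lower
    limit is clamped at 0 because influence spread cannot be negative.
    """
    low = min(m - s for m, s in zip(means, stds))
    high = max(m + s for m, s in zip(means, stds))
    margin = (high - low) * pad
    return max(0.0, low - margin), high + margin


# Influence spread and standard deviation of IM, UC and CD for B=10,
# taken from the AllResults.txt sample earlier on this page.
means = [45023.54, 46404.13, 49944.56]
stds = [6252.5048907138, 8963.0942677794, 7517.24278751192]

lower_limit, upper_limit = y_limits(means, stds)
# Print the line to paste into the .R file.
print(f"scale_y_continuous(limits=c({lower_limit:.0f}, {upper_limit:.0f}))")
```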

Data for Figures and Tables

  1. Figure 3: Influence Spread
    Data can be found in AllResults.txt.

  2. Figure 4: Approximation Lower Bound
    Data can be found in AllResults.txt.

  3. Figure 5: Influence Spread w.r.t. unified discount c
    Data can be found in curve_c(50).txt.

  4. Table 3: Effect of the parameter search step in calculating the best unified discount c.
    To generate the data in this table, run the program twice on each dataset, setting the parameter #DStep in the config file to 0.01 and 0.05, respectively. The data in the columns "1% Search Step" and "5% Search Step" can then be found in AllResults.txt (influence spread of UC).

  5. Figure 6: Running time
    Data can be found in AllResults.txt.

  6. Table 4: Sensitivity to user purchase probability curves.
    To generate the data in this table, run the program three times on each dataset, each time with a different func_assign file (parameter #function assign in the config file). The func_assign files we used in the paper can be found here. FunctionAssign(65,20,15) means that in every func_assign file in that directory, 65% of the users are discount sensitive, 20% are benchmark users, and 15% are discount insensitive.
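
For Table 3, the two runs have to be lined up by budget. The following Python sketch shows one possible way to do that; it is not the authors' tooling. The function names are hypothetical, and it assumes each AllResults.txt has the "B=<budget>" header and "UC: <spread> ..." lines described earlier.

```python
def uc_spread_by_budget(text):
    """Map each budget label (e.g. "10") to the UC influence spread."""
    spreads = {}
    budget = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("B="):
            budget = line[2:]
        elif line.startswith("UC:") and budget is not None:
            # The first number after "UC:" is the influence spread.
            spreads[budget] = float(line.split()[1])
    return spreads


def table3_rows(text_step_1pct, text_step_5pct):
    """Return (budget, spread at 1% step, spread at 5% step) tuples.

    Each argument is the content of the AllResults.txt produced by the
    run with #DStep set to 0.01 and 0.05, respectively. Budgets missing
    from either run are skipped.
    """
    one = uc_spread_by_budget(text_step_1pct)
    five = uc_spread_by_budget(text_step_5pct)
    return [(b, one[b], five[b]) for b in one if b in five]
```

Calling table3_rows with the contents of the two result files yields one row per budget, which can then be copied into the table.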