DEFINITION 1(REVERSE REACHABLE SET).Let v be a node in G, and g be a graph obtained by removing each edge e in G with 1 - p(e) probability. The reverse reachable (RR) set for v in g is the set of nodes in g that can reach v.(That is, for each node u in the RR set, there is a directed path from u to v in g.)
RIS(Reverse Influence Sampling) algorithm runs in two steps:
1.Generate a certain number of random RR sets from G.
2.Consider the maximum coverage problem of selecting k nodes to cover the maximum number of RR sets generaed.Using the standard greedy algorithm to derive a (1-1/e)-approximate solution Sk for the problem. Return Sk as the final result.
The rationale of RIS is as follows:If a node u appears in a large number of RR sets, then it should have a high probability to activate many nodes under the IC model; in that case, u's expected spread should be large. By the same reasoning, if a size-k node set Sk covers most RR sets, then Sk is likely to have the maximum expected spread among alll size-k node sets in G.In that case,Sk should be a good solution to influence maximization.
Compared with Greedy, RIS can be more efficient as it avoids estimating the expecting spreads of a large number of node sets. That said, we need to carefully control the number of random RR sets generated in Step 1 of RIS, so as to strike a balance between efficiency and accuracy. Towards this end, Brogs et al. propose a threshold-based approach: they allow RIS to keep generating RR sets, until the total number of nodes and edges examined during the generation process reaches a pre-defined threshold τ.They show that when τ is set to Θ(k(m+n)log n / ε^3), RIS runs in time linear to τ, and it returns a (1 - 1/e - ε)-approximate solution to the influence maximization problem with at least a constant probability.They then provide an algorithm that amplifies the success probability to at lease 1 - 1/n^l, by increasing τ by a factor of l,and repeating RIS for Ω(l log n) times.