*corresponding author, alphabetical author-list

- TIMES: Temporal Information Maximally Extracted from Structures

Abram Magner (co-primary author), Jithin K. Sreedharan (co-primary author), Ananth Grama, and Wojciech Szpankowski

*Accepted in ACM International Conference on World Wide Web (WWW) 2018, (acceptance rate: 14.8%) *

[abstract]
[paper]
Inferring the node arrival sequence from a snapshot of a dynamic network is an important problem, with applications ranging from identifying sources of contagion to flow of capital in financial transaction networks. Variants of this problem have received significant recent research attention, including results on infeasibility of solution for prior formulations. We present a new formulation of the problem that admits probabilistic solutions for broad classes of dynamic network models. Instantiating our framework for a preferential attachment model, we present effectively computable and practically tight bounds on the tradeoff curve between optimal achievable precision and density/recall. We also present efficient algorithms for partial recovery of node arrival orders and derive theoretical and empirical performance bounds on the precision and density/recall of our methods in comparison to the best possible. We validate our methods through experiments on both synthetic and real networks to show that their performance is robust to model changes, and that they yield excellent results in practice. We also demonstrate their utility in the context of a novel application in analysis of the human brain connectome to draw new insights into the functional and structural organization and evolution of the human brain.

- Revisiting Random Walk based Sampling in Networks: Evasion of Burn-in Period and Frequent Regenerations,

Konstantin Avrachenkov, Vivek S. Borkar, Arun Kadavankandy and Jithin K. Sreedharan*

*Accepted in Computational Social Networks, Springer, 2018 *

[abstract]
In the framework of network sampling, random walk (RW) based estimation techniques provide many pragmatic solutions while uncovering the unknown network as little as possible. Despite several theoretical advances in this area, RW based sampling techniques usually make a strong assumption that the samples are in stationary regime, and hence are impelled to leave out the samples collected before the burn-in period. This work proposes two sampling schemes without burn-in constraint to estimate the average of an arbitrary function defined on the network nodes, for e.g. the average age of users in a social network. The central idea of the algorithms lies in exploiting regeneration of RWs at revisits to an aggregated super-node or to a set of nodes and in strategies to enhance the frequency of such regenerations either by contracting the graph or by making the hitting set larger. Our first algorithm, which is based on Reinforcement Learning (RL), takes advantage of the regeneration of RWs, and it uses stochastic approximation to derive an estimator. This method can be seen as intermediate between purely stochastic Markov Chain Monte Carlo iterations and deterministic relative value iterations. We study this method via simulations on real networks and observe that its trajectories are much more stable than those of standard random walk based estimation procedures, and its error performance is comparable to that of respondent driven sampling (RDS) which has a smaller asymptotic variance than many other estimators. The second algorithm, which we call the RT estimator, is a modified form of RDS that accommodates the idea of regeneration. Simulation studies show that the mean squared error of RT estimator decays much faster than that of RDS with time.

- Hamiltonian System Approach to Eigenvalue-Eigenvector Problem in Networks

Konstantin Avrachenkov, Philippe Jacquet and Jithin K. Sreedharan*

*IEEE International Workshop on Multidimensional (nD) Systems (nDS) 2017 *

[abstract]
[paper]
Because of the significant increase in size and complexity of the networks, the distributed computation of eigenvalues and eigenvectors of graph matrices has become very challenging and yet it remains as important as before. In this paper we develop efficient distributed algorithms to detect, with higher resolution, closely situated eigenvalues and corresponding eigenvectors of symmetric graph matrices. We model the system of graph spectral computation as physical systems with Lagrangian and Hamiltonian dynamics. The spectrum of Laplacian matrix, in particular, is framed as a classical spring-mass system with Lagrangian dynamics. The spectrum of any general symmetric graph matrix turns out to have a simple connection with quantum systems and it can be thus formulated as a solution to a Schr\"odinger-type differential equation. Taking into account the higher resolution requirement in the spectrum computation and the related stability issues in the numerical solution of the underlying differential equation, we propose the application of symplectic integrators to the calculation of eigenspectrum. The effectiveness of the proposed techniques is demonstrated with numerical simulations on real-world networks of different sizes and complexities.

- Recovery of Vertex Orderings in Dynamic Graphs

Abram Magner, Ananth Grama, Jithin K. Sreedharan, and Wojciech Szpankowski

*IEEE International Symposium on Information Theory (ISIT) 2017 *

[abstract]
[paper]
Many networks in the real world are dynamic in nature: nodes enter, exit, and make and break connections with
one another as time passes. Several random graph models of these networks are such that nodes have well-defined arrival times. It is natural to ask if, for a given random graph model, we can recover the arrival order of nodes, given information about the structure of the graph. In this work, we give a rigorous formulation of the problem in a statistical learning framework and tie its feasibility, for a broad class of models, to several sets of permutations
associated with the symmetries of the random graph model and graphs generated by it. Moreover, we show how the same
quantities are fundamental to the study of the information content of graph structures. We then apply our general results to the special cases of the Erd˝os-R´enyi and preferential attachment models to derive strong inapproximability results.

- Inference in OSNs via Lightweight Partial Crawls

Konstantin Avrachenkov, Bruno Ribeiro and Jithin K. Sreedharan*

*ACM SIGMETRICS / PERFORMANCE 2016, (acceptance rate: 13.5%) *

[abstract]
[paper]
[presentation]
[code]
Are Online Social Network (OSN) A users more likely to form friendships with those with similar attributes? Do users at an OSN B score content more favorably than OSN C users? Such questions frequently arise in the context of Social Network Analysis (SNA) but often crawling an OSN network via its Application Programming Interface (API) is the only way to gather data from a third party. To date, these partial API crawls are the majority of public datasets and the synonym of lack of statistical guarantees in incomplete-data comparisons, severely limiting SNA research progress. Using regenerative properties of the random walks, we propose estimation techniques based on short crawls that have proven statistical guarantees. Moreover, our short crawls can be implemented in massively distributed algorithms. We also provide an adaptive crawler that makes our method parameter-free, significantly improving our statistical guarantees. We then derive the Bayesian approximation of the posterior of the estimates, and in addition, obtain an estimator for the expected value of node and edge statistics in an equivalent configuration model or Chung-Lu random graph model of the given network (where nodes are connected randomly) and use it as a basis for testing null hypotheses. The theoretical results are supported with simulations on a variety of real-world networks.

- Distributed Spectral Decomposition in Networks by Complex Diffusion and Quantum Random Walk

Konstantin Avrachenkov, Philippe Jacquet and Jithin K. Sreedharan*

*IEEE International Conference on Computer Communication (INFOCOM) 2016, (acceptance rate: 18.25%) *

[abstract]
[paper]
[presentation]
[poster (Bell Labs Future days, Paris)]
In this paper we address the problem of finding top $k$ eigenvalues and corresponding eigenvectors of symmetric graph matrices in networks in a distributed way. We propose a novel idea called complex power iterations in order to decompose the eigenvalues and eigenvectors at node level, analogous to time-frequency analysis in signal processing. At each node, eigenvalues correspond to the frequencies of spectral peaks and respective eigenvector components are the amplitudes at those points. Based on complex power iterations and motivated from fluid diffusion processes in networks, we devise distributed algorithms with different orders of approximation. We also introduce a Monte Carlo technique with gossiping which substantially reduces the computational overhead. An equivalent parallel random walk algorithm is also presented. We validate the algorithms with simulations on real-world networks. Our formulation of the spectral decomposition can be easily adapted to a simple algorithm based on quantum random walks. With the advent of quantum computing, the proposed quantum algorithm will be extremely useful.

- Comparison of random walk based techniques for estimating network averages

Konstantin Avrachenkov, Vivek S. Borkar, Arun Kadavankandy and and Jithin K. Sreedharan*

*5th International Conference on Computational Social Networks (CSoNet) 2016 *

[abstract]
[paper]
[presentation]
[code]
Function estimation on Online Social Networks (OSN) is an important field of study in complex network analysis. An efficient way to do function estimation on large networks is to use random walks. We can then defer to the extensive theory of Markov chains to do error analysis of these estimators. In this work we compare two existing techniques, Metropolis-Hastings MCMC and Respondent-Driven Sampling, that use random walks to do function estimation and compare them with a new reinforcement learning based technique. We provide both theoretical and empirical analyses for the estimators we consider.

- Distribution and Dependence of Extremes in Network Sampling Processes

Konstantin Avrachenkov, Natalia M. Markovich and Jithin K. Sreedharan*

[abstract]
[paper]
[presentation]
*Computational Social Networks, Springer, 2015*
*Third International IEEE Workshop on Complex Networks and their Applications, Nov 2014*

We explore the dependence structure in the sampled sequence of complex networks. We consider randomized algorithms to sample the nodes and study extremal properties in any associated stationary sequence of characteristics of interest like node degrees, number of followers or income of the nodes in Online Social Networks etc, which satisfy two mixing conditions. Several useful extremes of the sampled sequence like $k$th largest value, clusters of exceedances over a threshold, first hitting time of a large value etc are investigated. We abstract the dependence and the statistics of extremes into a single parameter that appears in Extreme Value Theory, called Extremal Index (EI). In this work, we derive this parameter analytically and also estimate it empirically. We propose the use of EI as a parameter to compare different sampling procedures. As a specific example, degree correlations between neighboring nodes are studied in detail with three prominent random walks as sampling techniques.

- Spectrum sensing using distributed sequential detection via noisy reporting MAC

Jithin K. Sreedharan and Vinod Sharma

*Signal Processing (Elsevier), Jan 2015 *

[abstract]
[paper]
This paper considers cooperative spectrum sensing algorithms for Cognitive Radios which focus on reducing the number of samples to make a reliable detection. We propose algorithms based on decentralized sequential hypothesis testing in which the Cognitive Radios sequentially collect the observations, make local decisions and send them to the fusion center for further processing to make a final decision on spectrum usage. The reporting channel between the Cognitive Radios and the fusion center is assumed more realistically as a Multiple Access Channel (MAC) with receiver noise. Furthermore the communication for reporting is limited, thereby reducing the communication cost. We start with an algorithm where the fusion center uses an SPRT-like (Sequential Probability Ratio Test) procedure and theoretically analyze its performance. Asymptotically, its performance is close to the optimal centralized test without fusion center noise. We further modify this algorithm to improve its performance at practical operating points. Later we generalize these algorithms to handle uncertainties in SNR and fading.

- Nonparametric distributed sequential detection via universal source coding

Jithin K. Sreedharan and Vinod Sharma

*Information Theory and Applications Workshop (ITA), Feb 2013 *

[abstract]
[paper]
We consider nonparametric or universal sequential
hypothesis testing when the distribution under the null hypothesis
is fully known but the alternate hypothesis corresponds to some
other unknown distribution. These algorithms are primarily
motivated from spectrum sensing in Cognitive Radios and
intruder detection in wireless sensor networks. We use easily
implementable universal lossless source codes to propose simple
algorithms for such a setup. The algorithms are first proposed for
discrete alphabet. Their performance and asymptotic properties
are studied theoretically. Later these are extended to continuous
alphabets. Their performance with two well known universal
source codes, Lempel-Ziv code and KT-estimator with Arithmetic
Encoder are compared. These algorithms are also compared with
the tests using various other nonparametric estimators. Finally a
decentralized version utilizing spatial diversity is also proposed
and analysed.

- Spectrum Sensing via Universal Source Coding

Jithin K. Sreedharan and Vinod Sharma

*IEEE Global Communications Conference (GLOBECOM), Dec 2012*

[abstract]
[paper]
We consider nonparametric sequential hypothesis
testing when the distribution under null hypothesis is fully
known and the alternate hypothesis corresponds to some other
unknown distribution. We use easily implementable universal
lossless source codes to propose simple algorithms for such a
setup. These algorithms are motivated from spectrum sensing
application in Cognitive Radios. Universal sequential hypothesis
testing using Lempel Ziv codes and Krichevsky-Trofimov estimator
with Arithmetic Encoder are considered and compared for
different distributions. Cooperative spectrum sensing with multiple
Cognitive Radios using universal codes is also considered.

- Novel algorithms for distributed sequential hypothesis testing

K. S. Jithin and Vinod Sharma

*49th Annual Allerton Conference on Communication, Control and Computing, Sep 2011 *

[abstract]
[paper]
This paper considers sequential hypothesis testing
in a decentralized framework. We start with two simple decentralized
sequential hypothesis testing algorithms. One of which is
later proved to be asymptotically Bayes optimal. We also consider
composite versions of decentralized sequential hypothesis testing.
A novel nonparametric version for decentralized sequential
hypothesis testing using universal source coding theory is developed.
Finally we design a simple decentralized multihypothesis
sequential detection algorithm.

- A novel algorithm for cooperative distributed sequential spectrum sensing in Cognitive Radio

Jithin K. Sreedharan and Vinod Sharma

* IEEE Wireless Communications and Networking Conference (WCNC), Mar 2011.*

[abstract]
[paper]
This paper considers cooperative spectrum sensing
in Cognitive Radios. In our previous work we have developed
DualSPRT, a distributed algorithm for cooperative spectrum
sensing using Sequential Probability Ratio Test (SPRT) at the
Cognitive Radios as well as at the fusion center. This algorithm
works well, but is not optimal. In this paper we propose an
improved algorithm- SPRT-CSPRT, which is motivated from Cumulative
Sum Procedures (CUSUM). We analyse it theoretically.
We also modify this algorithm to handle uncertainties in SNR’s
and fading.

- Cooperative distributed sequential spectrum sensing

K. S. Jithin, Vinod Sharma, and Raghav Gopalarathnam

*IEEE National Conference on Communication (NCC), Jan 2011*

[abstract]
[paper]
We consider cooperative spectrum sensing for cognitive
radios. We develop an energy efficient detector with low
detection delay using sequential hypothesis testing. Sequential
Probability Ratio Test (SPRT) is used at both the local nodes
and the fusion center. We also analyse the performance of
this algorithm and compare with the simulations. Modelling
uncertainties in the distribution parameters are considered. Slow
fading with and without perfect channel state information at the
cognitive radios is taken into account.