Recent Publications

All Journal Publications

Graphs and networks are prevalent in modeling relational datasets from many fields of research. By using iterative solvers to approximate graph measures (specifically Katz Centrality), we can obtain a ranking vector consisting of a number for each vertex in the graph identifying its relative importance. We use the residual to accurately estimate how much of the ranking from an approximate solution matches the ranking given by the exact solution. Using probabilistic matrix norms and applying numerical analysis to the computation of Katz Centrality, we obtain bounds on the accuracy of the approximation compared to the exact solution with respect to the highly ranked nodes. This relates the numerical accuracy of the linear solver to the data analysis accuracy of finding the correct ranking. In particular, we answer the question of which pairwise rankings are reliable given an approximate solution to the linear system. Experiments on many real-world networks up to several million vertices and several hundred million edges validate our theory and show that we are able to accurately estimate large portions of the approximation. By analyzing convergence error, we develop confidence in the ranking schemes of data mining.
ISC-2017

Increasing volumes of data and the desire for real-time query capability make the development of efficient streaming algorithms for data analytics valuable. Streaming graph algorithms that avoid unnecessary recomputation through clever application of data dependency analysis are often more complex to derive than their static counterparts. This paper discusses a method to derive algorithms for streaming graph analysis from static formulations Combining tuned graph algorithms building blocks with an appropriate functional language, a graph query planner should be able to correctly implement most static and streaming versions of an algorithm from a single mathematical formulation. We provide a detailed analysis for the case of updating triangle counts in a streaming graph using linear algebra and an experimental evaluation in Julia.
IPDPS-2017

Many common methods for data analysis rely on linear algebra. We provide new results connecting data analysis error to numerical accuracy in the context of spectral graph partitioning. We provide pointwise convergence guarantees so that spectral blends (linear combinations of eigenvectors) can be employed to solve data analysis problems with confidence in their accuracy. We apply this theory to an accessible model problem, the ring of cliques, by deriving the relevant eigenpairs and finding necessary and sufficient solver tolerances. Analysis of the ring of cliques provides an upper bound on eigensolver tolerances for graph partitioning problems. These results bridge the gap between linear algebra based data analysis methods and the convergence theory of iterative approximation methods. These results explain how the combinatorial structure of a problem can be recovered much faster than numerically accurate solutions to the associated linear algebra problem.
Compl. Networks

Recent & Upcoming Talks

All Presentations

Recent Posts

This is an example of applying Non-negative Matrix Factorization a corpus of emails and to extract topic structure. We use the email currently stored on my desktop and find patterns of email. It turns out that most of my email is sent by machines!

CONTINUE READING

Research

Software and Data

Julia Graphs

JuliaGraphs is the primary organization dedicated to the advancement of graph theory and algorithms in the Julia Programming language. The flagship project is LightGraphs.jl the premier graph library in the Julia Ecosystem.

Stinger Graph Analysis

Dynamic graphs are all around us. Social networks containing interpersonal relationships and communication patterns. Information on the Internet, Wikipedia, and other datasources. Disease spread networks and bioinformatics problems. Business intelligence and consumer behavior. The right software can help to understand the structure and membership of these networks and many others as they change at speeds of thousands to millions of updates per second.

Contact

  • https://www.google.com/recaptcha/mailhide/d?k=01Pam2gQXRzPV0FfCqnFgrjw==&c=Vpo3xDQlWVcGfG4zf2BluSZ3z10xitKMiLF_NLmw9gs=
  • 75 5th Street NW, Atlanta, GA
  • By appt only.