Pattern Mining in Dynamic Graphs Karel Vaculík • Introduction • Frequent patterns • Anomalous patters • Future directions Outline • Introduction • Frequent patterns • Anomalous patters • Future directions Outline Introduction Introduction An example of a rule: Introduction • Citation / bibliographic networks • Collaboration patterns • Evolution of publication behaviour • Social networks • Leaving community / change of attributes after specific interactions (behaviour patterns) • Communication networks • Utilization of common communication patterns for productivity increase Applications • Introduction • Frequent patterns • Anomalous patters • Future directions Outline • Frequent pattern mining • Anomaly detection and explanation • Considers various types of changes: • Vertex addition / deletion • Edge addition / deletion • Change of vertex / edge labels • Undirected as well as directed edges; multiedges • Single dynamic graph or multiple dynamic graphs on input DGRMiner DGRMiner: Union Graph Representation +X ... addition -X ... deletion Y=>X ... Change from Y to X • Support of the freq. pattern: 3/5 (or 3 as absolute) • Confidence of the freq. pattern: 3/4 DGRMiner: Support and Confidence Based on gSpan algorithm: 1. Mine frequent change vertices 2. Mine frequent patterns built from change edges • in a depth-first-search manner • avoids duplicate patterns DGRMiner: Frequent Pattern Mining *gSpan (Yan & Han @ ICDM'02); modified image of the tree from the same paper • Introduction • Frequent patterns • Anomalous patters • Future directions Outline • Frequent pattern • Anomalous pattern = deviation from the frequent pattern Outlierness = 1 - confidence DGRMiner: Anomalies • How to compute support of these anomalies? Single-vertex Anomalies Frequent pattern Possible anomalies -A A, A=>C (where C ≠ A) A=>B A, -A, A=>C (where C ≠ B) +B !B • Frequent patterns without “additions”: simple enumeration of antecedents Non-trivial Anomalies • Frequent patterns with “additions”: Non-trivial Anomalies • Frequent patterns with “additions”: • Solution: Maximal common subgraphs of the freq. pattern and the input union graph Non-trivial Anomalies How to compute support of non-trivial anomalies? • Simple if the corresponding frequent pattern contains “non-additions” • Otherwise: Non-trivial Anomalies Enron Experiments Enron Experiments Enron (unique vertex labels) Experiments Resolution proofs Experiments • Introduction • Frequent patterns • Anomalous patters • Future directions Outline • DGRMiner at this moment: count at most one occurrence in each union graph • Counting as many occurrences as possible: Anti-monotonicity is broken! Support Definition *image taken from Mining Graph Data (2006) => solve the Maximum Independent Set problem on the graph of embeddings Support Definition *image taken from Mining Graph Data (2006) Patterns with deviating “additions”