When subgraph isomorphism is really hard, and why this matters for graph databases

Mccreesh, C., Prosser, P. and Trimble, J. (2018) When subgraph isomorphism is really hard, and why this matters for graph databases. Journal of Artificial Intelligence Research, 61, pp. 723-759. (doi: 10.1613/jair.5768)

Abstract

The subgraph isomorphism problem involves deciding whether a copy of a pattern graph occurs inside a larger target graph. The non-induced version allows extra edges in the target, whilst the induced version does not. Although both variants are NP-complete, algorithms inspired by constraint programming can operate comfortably on many real-world problem instances with thousands of vertices. However, they cannot handle arbitrary instances of this size. We show how to generate "really hard" random instances for subgraph isomorphism problems, which are computationally challenging with a couple of hundred vertices in the target, and only twenty pattern vertices. For the non-induced version of the problem, these instances lie on a satisfiable / unsatisfiable phase transition, whose location we can predict; for the induced variant, much richer behaviour is observed, and constrainedness gives a better measure of difficulty than does proximity to a phase transition. These results have practical consequences: we explain why the widely researched "filter / verify" indexing technique used in graph databases is founded upon a misunderstanding of the empirical hardness of NP-complete problems, and cannot be beneficial when paired with any reasonable subgraph isomorphism algorithm.
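The difference between the two variants described in the abstract can be illustrated with a minimal brute-force sketch. This is not the constraint programming approach the paper studies; the function name `has_subgraph` and the edge-list representation are assumptions made purely for illustration, and the exhaustive search is only practical for very small graphs.

```python
from itertools import permutations


def has_subgraph(pattern_n, pattern_edges, target_n, target_edges, induced=False):
    """Brute-force test: does the pattern occur inside the target?

    Vertices are numbered 0..n-1 and edges are undirected pairs.
    Every injective assignment of pattern vertices to target vertices
    is tried, so this is exponential and meant only as an illustration.
    """
    p_adj = {frozenset(e) for e in pattern_edges}
    t_adj = {frozenset(e) for e in target_edges}
    for image in permutations(range(target_n), pattern_n):
        ok = True
        for u in range(pattern_n):
            for v in range(u + 1, pattern_n):
                in_p = frozenset((u, v)) in p_adj
                in_t = frozenset((image[u], image[v])) in t_adj
                # Pattern edges must always map to target edges. In the
                # induced variant, pattern non-edges must also map to
                # target non-edges (no extra edges allowed).
                if (in_p and not in_t) or (induced and not in_p and in_t):
                    ok = False
                    break
            if not ok:
                break
        if ok:
            return True
    return False


# A three-vertex path as the pattern, a triangle as the target.
path = [(0, 1), (1, 2)]
triangle = [(0, 1), (1, 2), (0, 2)]
print(has_subgraph(3, path, 3, triangle))                # non-induced
print(has_subgraph(3, path, 3, triangle, induced=True))  # induced
```

In this small case the path occurs in the triangle under the non-induced definition, because the extra edge is tolerated, but not under the induced definition, because every choice of three triangle vertices carries an edge the pattern lacks.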

Item Type: Articles
Status: Published
Refereed: Yes
Glasgow Author(s) Enlighten ID: Trimble, Mr James and Mccreesh, Dr Ciaran and Prosser, Dr Patrick
Authors: Mccreesh, C., Prosser, P., and Trimble, J.
College/School: College of Science and Engineering > School of Computing Science
Journal Name: Journal of Artificial Intelligence Research
Publisher: AI Access Foundation
ISSN: 1076-9757
ISSN (Online): 1943-5037
Copyright Holders: Copyright © 2018 AI Access Foundation
First Published: First published in Journal of Artificial Intelligence Research 61:723-759
Publisher Policy: Reproduced in accordance with the copyright policy of the publisher

Project Code / Award No | Project Name | Principal Investigator | Funder's Name | Funder Ref | Lead Dept
608951 | Engineering and Physical Sciences Doctoral Training Grant 2012-16 | Mary Beth Kneafsey | Engineering and Physical Sciences Research Council (EPSRC) | EP/K503058/1 | VPO VICE PRINCIPAL RESEARCH & ENTERPRISE
701101 | EPSRC 2015 DTP | Mary Beth Kneafsey | Engineering and Physical Sciences Research Council (EPSRC) | EP/M508056/1 | RSI - RESEARCH STRATEGY & INNOVATION
3005250 | Modelling and Optimisation with Graphs | Patrick Prosser | Engineering and Physical Sciences Research Council (EPSRC) | EP/P026842/1 | Computing Science