author: Simon de Vries
title: Growing Bonsai Trees
keywords:
topics: Algorithms and Data Structures
committee: Alfons Laarman ,
Tom van Dijk
end: June 2014

Description

Background

Our verification tools produce and store large integer vectors containing the values of variables of the program under verification. Because the number of these state vectors is typically very large, the system memory often is the bottleneck for verification tasks. The states exhibit much similarity, because typically each transition in the program under verification only touches few variables, thus changing only small parts in a large vector. This property can be exploited by storing the state vectors in tree structures. The tree records combinatorial information of sub-vectors within the state vector and thereby leverages sharing amongst these sub-vectors.

Problem Description

We implemented a binary tree structure and demonstrated very good compression for most inputs (approaching the optimum of a small constant size per vector regardless of its original length1). Yet some cases are below the optimum due to the variable ordering inside the state vectors which shifts the combinatorial complexity to the top of the tree.

A framework for the reordering of state variables allows us to experiment with different heuristics. Early results show that the "bad cases" can be optimized to also feature optimum compression ratios. We could not, however, establish a heuristic that works good in all cases; the current heuristic sometimes decreases performance in many other cases.

We want to find out the properties that influence the compression ratio and come up with a heuristic that performs good on average, to implement it within the framework and benchmark its performance. The scope of the project can be scaled to the schedule because we have several directions for this work to go in.

Tasks

  • Define a model which describes relevant parameters and their influence on the compression ratio.
  • Invent a suitable heuristic and implement this inside the framework.
  • Perform the necessary experiments to show that your heuristic indeed performs well for the average case.

Optional:

  • Identify dynamic parameters and implement feedback to optimize the heuristic at runtime.
  • Look at different tree configurations that might affect compression ratios.
  • Compare the compression and runtime performance of trees with that of binary decision diagrams (BDDs [3]).

Relevant literature

  1. Alfons Laarman, Jaco van de Pol and Michael Weber - Parallel Recursive State Compression for Free. Not published, available on demand.
  2. Stefan Blom, Bert Lisser, Jaco Pol van de, and Michael Weber. A database approach to distributed state space generation. In Sixth International Workshop on Parallel and Distributed Methods in verification, PDMC, pages 17-32, Enschede, July 2007. CTIT.
  3. Randal E. Bryant. Symbolic boolean manipulation with ordered binary-decision diagrams. ACM Comput. Surv., 24(3):293-318, 1992.

  1. The precise size of this constant is dependent on the logarithm of the number of states stored in the tree structure, resulting in a exact compression ratio of 2log^2(n)/(nk), where n is the number of vectors and k the length of the vectors.

Additional Resources

  1. The paper