The Second DIKU-IST Joint Workshop on Foundations of Software : Abstracts

The Flow of Data and the Complexity of Algorithms

Neil D. Jones (DIKU)

The purpose of this work is automatically to recognise imperative programs that run in time polynomially bounded by their input values. The effect of a command C is approximated by a matrix M where M_ij describes the way the value of variable X_j after executing C depends on the value that X_i had before execution. If-then-else is modeled by matrix sum, sequential composition is modeled by matrix product, and iteration is modeled by transitive closure plus a careful analysis of variable self-dependence. Compositional inference rules are given such that if |- C:M is provable, then C executes in ptime; and a form of completeness is shown. The work is joint with Lars Kristiansen (Oslo), and is related to earlier work by C.C. Fredriksen (now at IST) and K.-H. Niggl. A preliminary version appeared in CIE 2005, LNCS 3526.

A Unifying Framework for Several Path-Summarizing Analyses

James Avery (DIKU)

This talk will describe efforts in finding a unified framework that generalizes several conceptually similar program analyses, e.g. all known variants of size-change termination, and other analyses based on interpreting computation path-effects in finite abstract domains.

Is it possible to isolate shared underlying principles to these analyses? The current work aims to identify common algebraic structure of which these several essentially similar analyses arise as special cases. It is the author's hope that this will be beneficial in the following:

A common domain may allow us to compare the analyses in a rigid fashion.
A common basis for the analyses can function as a template for constructing new program analyses that build on abstract summaries of computation path effects.
Because the arising algebraic structures are well-known, we may be able to apply classical algebraic results to give insight into properties of the analyses.
New results that apply to the general domain may be shared by all of the analyses.

Abstraction of programs manipulating pointers using modal logics

Yoshinori Tanabe (Hagiya Lab, U.Tokyo & AIST)

Predicate abstraction is a powerful framework in software model checking. We propose a method to verify specifications of programs which manipulates pointers based on predicate abstraction. It uses formulas in a modal logic, two-way CTL with nominals (2CTLN) in the case of this talk, for predicates. The "points-to" relations are regarded as modalities and program variables as nominals. Two important steps in predicate abstraction are to find the weakest precondition of a predicate for a statement of a program and to decide whether a formula is satisfiable. The logic 2CTLN is suitable because it is closed under taking the weakest precondition and its satisfiability problem is decidable. We will also report an experimental implementation and discuss a few ideas for improvements.

Joint work with Yoshifumi Yuasa, Toshifusa Sekizawa and Koichi Takahashi (AIST).

Inferring bidirectionality of programs

Johan Gade (DIKU/IST)

There are many occasions where the need of two-way or bidirectional transformations arise: synchronizing data between data sources, undoing and redoing operations in an editor and transforming data between different representations (e.g., views in a database or compression and decompression of data), just to mention a few.

In this talk we will describe the progress made towards making a system that answers the following question: Given a program, is it always the case that the results obtained from evaluating it can be modified in such a way that new inputs, corresponding (in a meaningful way) to the modified results, can be automatically obtained?

Joint work in progress with Zhenjiang Hu and Keisuke Nakano.

Non-Equilibrium Green's Function Calculation for Electron Transport

Stig Skelboe (DIKU)

In mid 2005 a research project including Department of Computer Science, University of Copenhagen, Informatics and Mathematical Modelling, Technical University of Denmark and Atomistix was initiated. The aim of the project is to develop and implement scalable parallel algorithms for nano science simulations. We take the Atomistix Tool Kit (ATK) as a starting point.

This presentation outlines the basic principles of ATK, emphasizes the computationally most demandig parts and discusses the possibilities for parallelization and vectorization.

In the current approach, the main computational cost is in the evaluation of the discrete Hamiltonian, in the solution of the Poisson equation as part of the computation of the effective potential and finally in the inversion of the matrices required to compute the relevant part of Green's function.

There are other demanding computations involved, and in order to assure scalability, as little as possible should be left sequential. With the research project being in a very early phase, most ideas are preliminary and untried.

Statistical State Estimation

Søren Hauberg (DIKU)

When a system has to interact with the real world in a safe manner it often needs to determine the state of the world. If the system depends on inexact or noisy data to determine the state, explicit methods will often fail due to noise sensitivity. Using the vagueness of statistics these difficulties can however be overcome. Traditionally the Kalman filter has been used even though it only provides little generalisation compared to explicitly computing the system state. In the last 10 years the general and very efficient particle filter has increased in popularity. The method will be described and a parallel between the method and a classic heuristic from the field of optimisation will be drawn. Examples from such diverse fields as robotics and Human Computer Interaction in computer games will be given.

Robust routing algorithm in unstable mobile network

Satoshi Kataoka (IST)

The fundamental technologies of network have recently emerged as a premier research topic. They have great long term economic potential, ability to transform our lives, and pose many new system-building challenges. Some, such as tracking, location and routing are fundamental issues, in that many applications rely on them for needed information. In this paper, we address one of the fundamental problems, namely network routing coverage problem. In the mobile network the calculation power and the transmission speed of the device put on each node are weak. Therefore, a network system must be efficient at communication cost and calculation cost. Moreover, since the mobile network is very unstable, a robust communication algorithm is required of a network system. By combining computational geometry and graph theoretic techniques, specifically the Voronoi diagram and graph search algorithm, we establish the main highlight of the paper - robust mobile network routing.

Logiweb

Klaus Grue (DIKU)

Logiweb is a system for construction, checking, and distribution of mathematical definitions, lemmas, and proofs as well as computer programs. Logiweb offers the following facilities:

Availability. Logiweb is embedded in the World Wide Web. Read access to Logiweb merely requires access to a web browser which can display PDF. Logiweb tools are available under the GNU GPL. A good place to see Logiweb is at http://yoa.dk/.
Typography. Logiweb supports TeX for rendering of submitted papers. Logiweb is not tied to any particular language or any particular alphabet.
E-publishing. Individual users can publish papers on Logiweb. Logiweb also allows to establish electronic journals, but individual users may still publish independently of any journal just like anybody can put anything on the World Wide Web.
Verification. Logiweb verifies the formal correctness of submitted papers. Anyone can repeat the verification any number of times.
Referencing. Logiweb supports formal references from one Logiweb paper to another. A formal reference makes all formal definitions in a referenced paper available in the referencing paper. As an example, this allows a proof in one paper to reference a lemma proved on a another paper.
Immutability. Once published papers are immutable. Even the author of a paper cannot change as much as a comma in a published paper. If an author wants e.g. to improve a definition, the author may publish a new version of a paper, but the old version remains for the sake of other authors who may have produced results based on the old definition.
Durability. Any site that runs a Logiweb server can mirror any Logiweb document. This makes the continued existence of a paper independent of one, particular site. Each Logiweb document gets a unique Logiweb reference which is independent of the physical location of the document. Logiweb servers automatically organize themselves in a world wide hash table which can translate Logiweb references to http urls and which keeps track of copying, deletion, and moving of Logiweb documents.
Notational freedom. Logiweb supports notational freedom. This allows authors that rely on referenced Logiweb documents to use their own notation for the concepts defined by others. This is important e.g. when one author references two Logiweb papers which use distinct notation. Furthermore, notational freedom allows papers to be written in any language using any alphabet.
Foundational freedom. Logiweb itself favours no particular proof system. Logiweb comes with a proof system (Logiweb sequent calculus). That proof system is itself published on a Logiweb page, and any user is free to program his or her own proof system.
Interoperability. Logiweb can interface to tools that can produce or understand Lisp S-expressions or similar.

Towards Consistent Agent Interactions

Jose Ghislain Quenum (IST)

Agent interactions which are inspired from generic protocols require a special approach for modelling, designing and execution. They raise up issues related to their reusability, configuration and instantiation. Agent-oriented design methodologies do not separate protocols from the private part of agent architecture. Thus none of these parts (public and private) are reusable. Furthermore, these methodologies do not provide any guidelines to configure interactions from generic protocols. Hence, inconsistencies are hard to anticipate and solve. In this paper, we propose a method to address these issues by automatically deriving agents coordination mechanisms from generic protocols. This method decouples the protocols from the remainder of agent architecture. It also dramatically reduces inconsistencies during interactions.

A Novel Approach for Applying Model Checking on Networked Applications

Cyrille Artho (IST)

Software model checkers can be applied directly to single-process programs, which typically are multi-threaded. Multi-process applications cannot be model checked directly. While multiple processes can be merged manually into a single one, this process is very labor-intensive and a major obstacle towards model checking of client-server applications.

Previous work has automated the merging of multiple applications but mostly omitted network communication. Remote procedure calls were simply inlined, creating similar results for simple cases while removing much of the inherent complexities involved. Our goal is a fully transparent replacement of network communication. Other language features were also modeled more precisely than in previous work, resulting in a program that is much closer to the original. This makes our approach suitable for testing, debugging, and software model checking. Due to the increased faithfulness of our approach, we can treat a much larger range of applications than before.

Interprocedural Program Analysis for Java based on Weighted Pushdown Model Checking

Mizuhito Ogawa (JAIST)

Based on the observation that program analysis is abstraction plus model checking, this paper investigates pushdown model checking based approach on interprocedural program analyses for mono-thread Java. The running example is an interprocedual dead code detection under PER (partial equivalence relation) based abstraction. The prototype implementation combines SOOT as preprocessing to convert Java to Jimple and the Weighted PDS (pushdown system) library as the back-end model checking engine. With these existing tools, we developed an interprocedural dead code analyzer for mono-thread Java with around 1500 lines of Java codes. This analysis framework enables us a rapid prototyping for an interprocedual analysis design.

Functional programming of sorting multiset discriminators

Fritz Henglein (DIKU)

Multiset discrimination is a collection of algorithmic techniques for efficiently partitioning data into equivalence classes according to a given equivalence relation. Previously we identified the notion of discriminator as the key generalization of partitioning that enables their inductive defintion on a language for specifying equivalence relations.

By employing discriminators for base types such characters and integer segments that sort their inputs, it can be shown that the inductive construction yields discriminators that both partition and sort their input in linear time for a wide range of total preorder. We show how such sorting discriminators can be coded up elegantly by employing Generalized Algebraic Data Types (GADTs) and list comprehensions. We illustrate this coding using Haskell (which has both GADTs and list comprehensions) and gives examples of applications of the use of sorting discriminators.

Joint work in progress with Ralf Hinze.

Hash-free binary decision diagrams

Morten Ib Nielsen (DIKU)

Reduced ordered binary decision diagrams (ROBDD or simply BDD) are canonical representations of Boolean functions f (x1, ..., xn) = E[x1, ..., xn] where E is a Boolean expression constructed from true (top), false (bottom), and, or and not. They are canonical in the sense that each Boolean function is represented by a unique diagram once the order of the input variables x1, ..., xn is fixed, and thus whether two Boolean expressions denote the same function can be efficiently decided from their BDDs. BDDs may be exponentially larger than the Boolean expressions from which they constructed, but they often turn out be much smaller than conjunctive or disjunctive normal forms of Boolean expressions.

The key to BDD construction is identification of equivalent (sub-)BDDs and memoizing calls of apply_f (n1, n2), where f is a binary Boolean function for constructing a new BDD-node and n1, n2 represent BDD-nodes. For this purpose all known BDD-packages appear to employ value numbering (hashed consing) and memoization using two auxiliary hash tables; one is to ensure that each node in a BDD is constructed only once and the other to catch repeated calls to apply with the same arguments. The use of hashing requires representing nodes as indices of an array, which contains the contents (variable and children) of the node. This is an efficient representation choice, but requires low-level programming; in particular, a garbage collector for removing nodes that are no longer in use in an application using the BDD-package must be hand-coded as part of the BDD package to free memory space.

We demonstrate that BDD-construction can be based on multiset discrimination (MSD) techniques using a priority queue based strategy for scheduling calls to the apply function. No auxiliary hash tables for value numbering are required, and garbage collection can be faciltated by the garbage collector of the implementation language; that is, without application program code. We report on preliminary experience with implementing MSD-based (hash-free) BDDs in comparision to hashing-based BDDs, the purpose of which is to identify trade-offs between MSD and hashing and to gain insight into employing one or the other or both in related problems such as equivalence testing for other algebras, e.g. Heyting algebras, or for circular (noncombinatorial) circuits. (We do not expect to MSD-based BDDs to be outright competitive with hashing-based BDDs.)

Joint work in progress with Fritz Henglein.

Aspect Mining using Structural Program Properties

Jan Hannemann and Hidehiko Masuhara (Univ. of Tokyo)

Certain software features, such as security, memory management, and logging, are inherently difficult to modularize using traditional program decompositions (such as classes and methods). Implementations of such /concerns/ tend to crosscut the module structure of software systems, and pose a problem for developers who want to understand or maintain them, as they result in scattered and tangled code. Aspect-oriented programming (AOP) is a relatively new programming paradigm that offers explicit program structures, called /aspects/, for the modularization of such crosscutting concerns (CCCs). Developers who would like to adopt AOP to improve the modularity of their legacy OO systems, face the challenge of identifying non-modularized CCCs in their code (/aspect mining/) before they can refactor them into aspects.

Our work focuses on using a structural concern model to identify non-modularized CCCs in software systems. In this aspect mining approach, the structural properties of the modeled concerns are compared to those of the target program’s abstract syntax tree. Potential matches can then be used to prompt the suggestion of appropriate refactorings into aspects. In particular, we plan to utilize the mining results in our previously developed semi-automated refactoring tool.

A Method for Inversion of a Higher Order Programming Language

Jesper Andersen (DIKU)

Program inversion is a whole-program transformation transforming a source program ps into an inverse program pi such that ps (x) = y iff pi (y) = x.

In [1] and [2] a method for program inversion of a first order functional language is developed. In this work, we present an extension of this work to a higher order functional language similar to ML.

The method is based on defunctionalizing the source program to obtain an equivalent rst order program, which can then be inverted using known techniques.

In the context of inverting higher order programs, some of the restrictions on the source programs outlined in [1] and [2] have to be relaxed. In particular, the requirement that no value may be discarded, renders the high order map-function impossible to define unless we are able to decide equality of closures at run-time. Instead we choose to allow discarding of values under certain conditions.

We present work in progress on the development of a method for program inversion of a higher order language with explicit duplication and discarding of values.

In the talk we will invert a seemingly innocent program using the higher- order map function. This example illustrates some of the issues that one has to deal with when going from inversion of first order programs to inversion of higher order programs.

[1] Robert Glück and Masahiko Kawabe. A program inverter for a functional language with equality and constructors. In Atsushi Ohori, editor, Programming Languages and Systems. Proceedings, volume 2895 of Lecture Notes in Computer Science, pages 246 264. Springer-Verlag, 2003.

[2] Robert Glück and Masahiko Kawabe. Derivation of Deterministic Inverse programs Based on LR Parsing. In Functional and Logic Programming. Proceedings, volume 2998 of Lecture Notes in Computer Science, pages 291 306. Springer-Verlag, 2004.

A Parallelization Tool for Tree Reductions

Kimonori Matsuzaki and Noriyuki Ohkawa (IST)

In this talk we demonstrate a parallelization tool for deriving efficient parallel tree programs. There have been several studies and tools for deriving parallel programs for lists and arrays, but these methods were not applicable to the tree programs. We have implemented a parallelization tool for a set of tree programs called tree reductions. In theory, our system is based on the properties of commutative semirings, and thus can deal with several tree reductions including tree dynamic programmings. In practice, our system has an optimization mechanism that reduces the cost of parallel computations, and thus generates efficient parallel tree programs.

Weak Inversion for Parallelization

Akimasa Morihata (IST)

The third homomorphism theorem says that if two sequential programs are given in terms of foldl and foldr respectively, then there must exist a parallel program in terms of list homomorphism. While the existence of a parallel program has been proved, it is yet known how it is constructed. In this talk, we introduce the idea of weak inversion, and show how it can be applied to systematically derive parallel programs from sequential ones.

Joint work with Zhenjiang Hu.

Writing practical memory management code with a strictly typed assembly language

Toshiyuki Maeda and Akinori Yonezawa (IST)

Memory management (e.g., malloc/free) cannot be implemented in traditional strictly typed programming languages because they do not allow programmers to reuse memory regions, in order to preserve memory safety. Therefore, they depend on external memory management facilities, such as garbage collection. Thus, many important programs that require explicit memory management (e.g., operating systems) have been written in weakly typed programming languages (e.g., C). To address the problem, we designed a new strictly and statically typed assembly language which is flexible and expressive enough to implement memory management. The key idea of our typed assembly language is that it supports variable-length arrays (the arrays whose size is not known until runtime) as language primitives, maintains integer constraints between variables, and keeps track of pointer aliases. Based on the idea, we implemented a prototype implementation of the language. We also implemented a small operating system kernel which provides a memory management facility with the language.

Static Types for Scripting

Torben Mogensen (DIKU)

Most scripting languages are dynamically typed, both to save the programmer from writing verbose typing declaratings and to allow heterogeneous collections and other constructions that are troublesome for typical static type systems.

But by being dynamically typed, these languages lose the advantages of static typing, most importantly static type-error detection.

Type inference, like dynamic types, can save the programmer from writing typing declarations, and still retain the advantages of static typing, but typically restricts the type system so heterogeneous collections (without explicit sum types) and reflection aren't supported.

We will look at a static type system that addresses some of these issues: We allow heterogeneous collections by introducing implicit untagged sum types and we allow record/object-like structures through unordered product types.

Waitomo: Web-Programming with Objects and Interfaces

Peter Thiemann (Universität Freiburg)

Waitomo is an experimental Java-based programming language for exploring new, robust paradigms for programming the web. Robustness results mainly from type-safety which is imposed on all, internal and string-based, interfaces. Waitomo achieves type-safety by relying on a new interpretation of Java interfaces on top of generics. Additionally, this new style of interfaces offers new approaches to object-oriented design problems.

Toward Deterministic Top-Down Parsing of LR(k) Languages

Morten Fjord-Larsen (DIKU)

LR parsers, introduced by Knuth in [2], are in theory able to efficiently parse a very large class of context-free grammars. However, the number of states in an LR(k) parser produced by Knuth’s method is often rather large. For this reason, LR parsing did not become widely used until the invention DeRemer’s LALR [1] method. LALR parsers are, however, not able to deal with all LR grammars.

LL grammars can be deterministically parsed top-down from left to right. LL parsers are widely appreciated for both their efficiency and simplicity. Im- plementing an LL(1) parser by hand is a fairy simple task. Although, most programming languages can be described by LL(k) grammars, there exists lan- guages that are LR, but not LL(k) for any k.

We present a simple extension of context-free grammars which we call pair matching grammars. We generalize the notion of LL(k) grammars to pair match- ing grammars and examine transformations of LR(k) grammars into pair match- ing LL(k) grammars. We then explore how top-down parsing methods of context-free grammars can be extended to pair matching grammars. Our work- ing hypothesis is that parsers produced in this way will be competitive with LALR parsers in terms of size and speed while retaining the power of LR parsers.

[1] DeRemer , F. L. Practical translators for LR(k) languages. PhD thesis, Dep. Electrical Engineering, Massachusetts Institute of Technology, Cambridge, 1969.

[2] Knuth, D. E. On the translation of languages from left to right. Information and Control 8, 6 (Dec. 1965), 607–639.

The density of words in formal languages

Jakob Grue Simonsen (DIKU)

Given a language L over some alphabet, how large a fraction of the total number of available words of a given length n is contained in L? And can we say anything about the fraction as n tends to infinity?

The answers to these questions have surprising connections to many areas of computer science, e.g. the "average" fraction above corresponds to the optimal compression rate obtainable for sufficiently well-behaved languages.

In this talk, we will discuss some old and new ways to find such answers; we will show when they can definitely be found fast, when they definitely cannot be found in reasonable time, and when they definitely cannot be found at all.

Shuffle Expressions for Static XML Type-checking

Tadahiro Suda (IST)

In some XML schema languages (Relax-NG, W3C-XML-Schema, and so on), we can use shuffle expressions, which allow elements to occur in any order, added to regular expressions. By shuffle expressions, schemas of bibtex-like data can be expressed simple. The main theme of our study is to handle shuffle expressions in static type-checking for XML processing. Unfortunately, since equivalent transformation of shuffle expression to regular expression easily blows up, it is hard to take them straightforward in existing XML type-checkings based on regular expressions. Thus we propose a new technique using Ternary Decision Diagrams (TDDs), extension of known BDDs, in which each nodes have three decision way. In our technique, TDDs stand for shuffle expressions and, by operations of TDDs, we realize containment-check and type-inference of shuffle expressions. Also, for checking containment between shuffle expressions and regular expressions, we first compute the least shuffle expression including the regular expression and then reduce containment-check between shuffle expressions, that is, operations of TDDs.

XML Transformation Language Based on Monadic Second Order Logic

Kazuhiro Inaba (IST)

Although monadic second-order logic (MSO) has been a foundation of XML queries, little work has attempted to take MSO formulae themselves as a programming construct. Indeed, MSO formulae are capable of expressing (1) all regular queries, (2) deep matching without explicit recursion, (3) queries in a don't-care semantics for unmentioned nodes and (4) n-ary queries for locating n-tuples of nodes. While previous frameworks for subtree extraction each had some of these properties, none has satisfied all of them.

We have designed and implemented a practical XML transformation language called MTran that fully exploits the expressiveness of MSO. For the implementation of the MTran language, we also have developed, as the core part, an efficient evaluation strategy for n-ary MSO queries by using partially lazy evaluation of set operations.