Proving Termination with Non-monotone Interpretations
Polynomial interpretations are among the most powerful methods for proving termination of term rewrite systems. So far, this method has been considered sound only if the interpretations are monotone functions. In this talk, I explain how this monotonicity restriction can be dropped.
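As a concrete illustration of the classical (monotone) method, consider the standard textbook rule double(s(x)) → s(s(double(x))) (this example is illustrative and not taken from the talk). Interpreting s as x + 1 and double as 3x, the left-hand side evaluates to 3x + 3 and the right-hand side to 3x + 2, a strict decrease for every natural x, which proves termination of the rule. A minimal Python check of the decrease:

```python
# Monotone polynomial interpretation for double(s(x)) -> s(s(double(x))).
# [s](x) = x + 1, [double](x) = 3*x; both are monotone over the naturals.

def s(x):        # interpretation of the successor symbol s
    return x + 1

def double(x):   # interpretation of the symbol 'double'
    return 3 * x

def lhs(x):      # value of the left-hand side double(s(x)): 3x + 3
    return double(s(x))

def rhs(x):      # value of the right-hand side s(s(double(x))): 3x + 2
    return s(s(double(x)))

# Strict decrease holds for all naturals (3x + 3 > 3x + 2); sample check:
assert all(lhs(x) > rhs(x) for x in range(100))
```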
A language-based approach to optimal path problems
Optimal path problems ask for a path of lowest weight among all paths satisfying a given requirement. They are important in practice, with many applications in transportation networks, scheduling, QoS guarantees, and extracting data from databases. Although many algorithms for their variants have been proposed, these problems remain difficult for non-specialists, who can hardly follow the existing results.
In this talk, I will present an ongoing effort to construct a language-based system for solving optimal path problems. Once a user gives a specification of an optimal path problem, namely the requirement on feasible paths and the weight of a path, the system should generate an efficient algorithm to find the optimal path. I will show that we can automatically derive such algorithms under some assumptions; moreover, an optimization enables us to obtain known efficient algorithms for several known classes of problems.
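To make the shape of such a specification concrete, here is a naive Python sketch in which the specification itself (a requirement predicate plus a weight function) is run directly as a brute-force search over simple paths. All names are illustrative; the system described in the talk would instead derive an efficient algorithm from such a specification.

```python
# Naive executable semantics of an optimal path specification:
# enumerate simple paths, filter by the requirement, minimize the weight.

def simple_paths(graph, src, dst, path=None):
    """Yield all simple (cycle-free) paths from src to dst."""
    path = (path or []) + [src]
    if src == dst:
        yield path
        return
    for nxt in graph.get(src, []):
        if nxt not in path:            # keep paths simple
            yield from simple_paths(graph, nxt, dst, path)

def optimal_path(graph, src, dst, requirement, weight):
    """The specification: the feasible path of minimum weight."""
    feasible = (p for p in simple_paths(graph, src, dst) if requirement(p))
    return min(feasible, key=weight, default=None)

# Example: fewest-hops path from 'a' to 'd' that avoids node 'c'.
g = {'a': ['b', 'c'], 'b': ['c', 'd'], 'c': ['d'], 'd': []}
best = optimal_path(g, 'a', 'd',
                    requirement=lambda p: 'c' not in p,
                    weight=len)
# best == ['a', 'b', 'd']
```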
Fusion Transformation for Non-strict Functions
Some program transformations assume that data types are initial algebras. Although data types are initial algebras in the category of CPOs and strict continuous functions, in general they are not initial algebras in the category of CPOs and arbitrary continuous functions. We therefore aim at transformations applicable to all continuous functions, not just strict ones. We show a condition which guarantees that data types are indeed initial algebras, and we apply the condition to Haskell's data types.
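A typical transformation in this family is the fold fusion law: h(foldr(g, e, xs)) = foldr(k, h(e), xs) whenever h(g(x, y)) = k(x, h(y)). In a non-strict language such as Haskell, the law additionally requires h to be strict, and it is precisely side conditions of this kind that an initial-algebra argument for arbitrary continuous functions would remove. The following Python sketch checks the law on finite lists (where the strictness subtlety is invisible); it is an illustration of the law, not of the talk's condition.

```python
from functools import reduce

# Fold fusion law: h(foldr(g, e, xs)) == foldr(k, h(e), xs)
# provided the fusion condition h(g(x, y)) == k(x, h(y)) holds.
# (For non-strict data, h must also be strict; finite lists hide this.)

def foldr(g, e, xs):
    """Right fold: foldr g e [x1,...,xn] = g x1 (g x2 (... (g xn e)))."""
    return reduce(lambda acc, x: g(x, acc), reversed(xs), e)

h = lambda v: 2 * v            # the function being fused into the fold
g = lambda x, y: x + y         # original fold operator
k = lambda x, y: 2 * x + y     # fused operator: h(g(x,y)) == k(x, h(y))

xs = [1, 2, 3, 4]
assert h(foldr(g, 0, xs)) == foldr(k, h(0), xs)   # both sides equal 20
```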
Reliable Parallel Computing with BSML
As parallel architectures become the norm in computer design, language designers are still looking for the best way to express the corresponding programs. The old standards of high-performance computing no longer match the reliability and abstraction needs of this wider range of applications. Bulk Synchronous Parallel ML is set at a lower abstraction level than algorithmic skeletons and thus remains a general-purpose language, yet it retains the safety and ease of use of modern functional programming languages, together with a strong type system.
In this talk, we will discuss BSML and its applications, and go into detail on some important points of the language, such as exception handling and the parallel type system.
Parallel algorithms and applications: the declarative programming perspective
We will describe a research trend that is as old as Computer Science itself, namely the combination of parallel algorithms to improve performance with high-level programming to improve safety and expressive power. Complexity theory teaches us that only a subset (called NC) of feasible computational problems (called P) may have exponential parallel speedup. Some useful problems like syntactic unification and circuit evaluation are as difficult to make "instantaneous" as NP-hard problems are to make tractable.
That fundamental limitation undermined the 5th generation project in its quest for unlimited computational power. Nevertheless, the NC family of problems covers a vast family of applications for which speedups are limited only by internal network performance. This performance is modeled by the Bulk-Synchronous Parallelism (BSP) model of Valiant and McColl. BSP algorithms explain potentially unlimited speedups by a balance between the number of processors involved and the bandwidth and latency of internal communication. We will survey typical situations in which the BSP analysis summarizes the key tradeoffs of practical parallel computing.
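In the standard BSP cost model, a superstep with maximum local work w and an h-relation (at most h words sent or received per processor) costs w + h·g + l, where g is the communication gap (inverse bandwidth) and l the barrier-synchronization latency; a program's cost is the sum over its supersteps. A minimal sketch of this accounting, with parameter names taken from the usual BSP literature:

```python
# BSP cost model: superstep cost = w + h*g + l, summed over supersteps.
#   w : maximum local computation in the superstep
#   h : maximum words sent/received by any processor (the h-relation)
#   g : communication gap, i.e. inverse bandwidth (time per word)
#   l : latency of the barrier synchronization

def superstep_cost(w, h, g, l):
    return w + h * g + l

def bsp_cost(supersteps, g, l):
    """Total cost of a program given as a list of (w, h) supersteps."""
    return sum(superstep_cost(w, h, g, l) for (w, h) in supersteps)

# Example machine (g = 4, l = 100, in units of one flop-time) and a
# two-superstep algorithm: compute-heavy, then communication-heavy.
cost = bsp_cost([(1000, 10), (50, 500)], g=4, l=100)
# cost == (1000 + 40 + 100) + (50 + 2000 + 100) == 3290
```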
But even concrete parallel algorithms well tuned to the application and target architecture have one major drawback: they have an explicit "memory hierarchy", just like out-of-core algorithms. The very reason for their existence, efficiency, thus makes them less abstract and less portable than classical "in-core" algorithms based on the uniform memory access hypothesis. As a result, parallel programming is much more difficult to teach, learn, modularize, etc.
Parallel software should become widespread and take advantage of the new dozens, hundreds and eventually thousands of processors/cores in a single computer. For this to happen, developers will need more abstraction than explicit message passing, yet must stay in control of essential features like the number of processors and bandwidth/latency. The BSP model provides one useful element of abstraction: synchronization barriers, which regulate the asynchrony of processes and greatly simplify the semantics of parallel programs compared with arbitrary communicating processes (determinism, absence of deadlock). More ambitious research over the last 20 years has been building further elements of abstraction for the same parallel, portable and scalable algorithms. We will summarize some of these realizations and the ongoing efforts to provide declarative, composable, abstract programs that maintain just enough performance-related information to ensure parallel scalability.
Parallel programming on Cell Broadband Engine
Cell Broadband Engine is a heterogeneous multi-core processor made by Sony, Toshiba, and IBM. While it is well known as the engine of the Sony PlayStation 3, the processor design aims at much broader applications, including media processing, high-performance computing, and distributed parallel computing. The processor consists of one general-purpose processor core and eight SIMD processor cores with their own dedicated memory spaces, connected with each other via a high-bandwidth bus. Due to this multi-core design, parallel programming is crucial for obtaining the best performance from the processor. This talk will introduce the Cell Broadband Engine architecture and show the design points specific to the processor, especially focusing on parallelism.
Domain-Specific Optimization for Skeleton Programs Involving Neighbor Elements
Skeletal parallel programming enables us to develop parallel programs easily by composing ready-made components called skeletons. However, a simply-composed skeleton program often lacks efficiency due to the overheads of intermediate data structures and communications. Many studies have addressed optimizations that fuse successive skeletons, but the existing fusion transformations are too general to achieve adequate efficiency for some specific classes of problems. Thus, fusion optimizations specific to the target problems are needed.
In this presentation, we take up problems involving neighbor elements, which are often seen in scientific computation, and propose an optimization of skeleton programs specific to these target problems. We start with a normal form that abstracts the programs of interest. Then, we develop fusion rules that transform a skeleton program into the normal form. Finally, we give an efficient parallel implementation of the normal form. This specific optimization makes the skeleton programs considerably more efficient.
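A one-dimensional instance of a neighbor-element computation is a three-point stencil: each output element depends on its left and right neighbors. The sketch below contrasts a skeleton-style composition (shifts, a zip, a map, each allocating an intermediate list) with a fused single-pass version that computes the same result. This is only an illustration of the kind of overhead fusion removes; the normal form and fusion rules of the talk are more general.

```python
# Three-point stencil over a list, with a boundary value at both ends.

def stencil_composed(xs, f, border):
    """Skeleton-style: shift, shift, zip, map -- intermediate lists."""
    left  = [border] + xs[:-1]          # left neighbor of each element
    right = xs[1:] + [border]           # right neighbor of each element
    return [f(l, c, r) for l, c, r in zip(left, xs, right)]

def stencil_fused(xs, f, border):
    """Fused: one pass, no intermediate lists."""
    n = len(xs)
    return [f(xs[i - 1] if i > 0 else border,
              xs[i],
              xs[i + 1] if i < n - 1 else border)
            for i in range(n)]

avg3 = lambda l, c, r: (l + c + r) / 3.0      # three-point average
xs = [1.0, 2.0, 3.0, 4.0]
assert stencil_composed(xs, avg3, 0.0) == stencil_fused(xs, avg3, 0.0)
```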
MSO Logic and Tree Transducers with Decidable Equivalence (joint work with Joost Engelfriet and Helmut Seidl)
This talk shows how old results from automata and formal language theory can be used to solve problems on XML transformations that are of current interest.
Tree transducers are formal models describing translations on labeled, ordered trees. They originate from the early days of compiler theory, but have gained new applicability through the recent interest in semi-structured data on the web, and XML in particular. Even though tree transducers are fairly old and mature models, not much is known about the decidability of their equivalence problem, i.e., the question of whether two given transducers realize the same (tree) translation. The problem is undecidable for non-deterministic transducers, even in the case of word transducers. In this talk I want to discuss two recent advances on the equivalence problem for tree transducers.
(1) Equivalence is decidable for deterministic MSO-definable tree transducers. Using MSO logic to describe tree translations gives rise to a particularly natural and robust class of translations with many good properties (such as, for instance, closure under sequential composition). The decidability proof is based on the fact that images of context-free graph languages under MSO translations are Parikh (i.e., their Parikh vectors form a semi-linear set), and hence constitutes a nice application of old formal language theory results to problems that are of current interest in semistructured/XML databases.
(2) Equivalence of deterministic top-down tree transducers can be decided in polynomial time if the transducers are in "earliest normal form". Many useful XML transformations can be formulated as deterministic top-down tree (DT) transducers. It has been known for a long time (Zoltan Esik, 1981) that equivalence of DT transducers is decidable; nothing, however, was known about the complexity of this problem. The "earliest" property constitutes a canonical form for DT transducers in which each transducer emits every output symbol as early in a translation as possible. If a DT transducer is total, then an equivalent earliest transducer can be obtained in polynomial time, too. We expect that the concept of earliest transducers has more applications, for instance, to the problem of streaming XML transformations.
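To give a feel for what a DT transducer does, here is a toy Python encoding (illustrative only; see the cited papers for the formal model). Trees are (label, children) pairs, and a rule maps a (state, input label) pair to an output template in which a three-tuple ('CALL', q, i) stands for running state q on the i-th input subtree.

```python
# A toy deterministic top-down tree transducer.
# Input/output trees: (label, [subtrees]).  Rule right-hand sides are
# templates where ('CALL', q, i) means "translate subtree i in state q".

def run(rules, state, tree):
    label, subtrees = tree
    def build(t):
        if len(t) == 3:                 # ('CALL', q, i): recursive call
            _, q, i = t
            return run(rules, q, subtrees[i])
        lab, kids = t                   # plain output node: rebuild children
        return (lab, [build(k) for k in kids])
    return build(rules[(state, label)])

# Example: a transducer that mirrors a binary tree -- it swaps the two
# subtrees of every 'f' node and copies leaves 'a' unchanged.
rules = {
    ('q', 'f'): ('f', [('CALL', 'q', 1), ('CALL', 'q', 0)]),
    ('q', 'a'): ('a', []),
}
t = ('f', [('a', []), ('f', [('a', []), ('a', [])])])
mirrored = run(rules, 'q', t)
# mirrored == ('f', [('f', [('a', []), ('a', [])]), ('a', [])])
```

The transducer is deterministic because each (state, label) pair has exactly one rule; the equivalence results above ask whether two such rule sets realize the same translation.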
References:
Engelfriet, J. and Maneth, S. The Equivalence Problem for Deterministic MSO Tree Transducers is Decidable, Inform. Proc. Letters 100:206-212 (2006).
Maneth, S. and Seidl, H. Deciding Equivalence of Top-Down XML Transformations in Polynomial Time, To appear in Proc. PLAN-X 2007.