Within the Software Science Department, we hold regular talks, which are listed here. If you are interested in receiving announcements, which include the online meeting URL, subscribe to the icis-sws-seminar mailing list.
We also organise some SWS social events.
Nils Jansen (Radboud University, NL)
Time: 12:15
Date: 28.02.2023
Room: HFML0220
Title: Intelligent and Dependable Decision-Making Under Uncertainty
Abstract: This talk highlights our vision of foundational and application-driven research toward safety and dependability in artificial intelligence (AI). We take a broad stance on AI that combines formal methods, machine learning, and control theory. As part of this research line, we study problems inspired by autonomous systems, planning in robotics, and industrial applications. We consider reinforcement learning (RL) as a specific machine learning technique for decision-making under uncertainty. RL generally learns to behave optimally via trial and error. Consequently, and despite its massive success in recent years, RL lacks mechanisms to ensure safe and correct behavior. Formal methods, in particular formal verification, provide formal guarantees of a system's correctness and safety based on rigorous methods and precise specifications. Yet, fundamental challenges have obstructed the effective application of verification to reinforcement learning. Our main objective is to devise novel, data-driven verification methods that tightly integrate with RL. In particular, we develop techniques that address real-world challenges to the safety of AI systems in general: scalability, expressiveness, and robustness against the uncertainty that occurs when operating in the real world. The overall goal is to advance the real-world deployment of reinforcement learning.
Dennis Groß (Radboud University, NL)
Time: 12:15
Date: 14.02.2023
Room: HFML0220
Title: Before things go wrong
Abstract: The inadequate safety and security of reinforcement learning (RL) highlight the need for a tool that confirms the reliability of trained RL agents. COOL-MC is such a tool and combines RL with model checking. In this SWS talk, I will give an overview of COOL-MC extensions that became the subject of several publications. The focus of this talk will be on the verification of multi-agent reinforcement learning systems and the assessment of security risks through adversarial attackers.
Marnix Suilen (Radboud University, NL)
Time: 12:15
Date: 31.01.2023
Room: HFML0220
Title: Safe Policy Improvement for POMDPs via Finite-State Controllers
Abstract: We study safe policy improvement (SPI) for partially observable Markov decision processes (POMDPs). SPI is an offline reinforcement learning (RL) problem that assumes access to (1) historical data about an environment, and (2) the so-called behavior policy that previously generated this data by interacting with the environment. SPI methods neither require access to a model nor the environment itself, and aim to reliably improve the behavior policy in an offline manner. Existing methods make the strong assumption that the environment is fully observable. In our novel approach to the SPI problem for POMDPs, we assume that a finite-state controller (FSC) represents the behavior policy and that finite memory is sufficient to derive optimal policies. This assumption allows us to map the POMDP to a finite-state fully observable MDP, the history MDP. We estimate this MDP by combining the historical data and the memory of the FSC, and compute an improved policy using an off-the-shelf SPI algorithm. The underlying SPI method constrains the policy-space according to the available data, such that the newly computed policy only differs from the behavior policy when sufficient data was available. We show that this new policy, converted into a new FSC for the (unknown) POMDP, outperforms the behavior policy with high probability. Experimental results on several well-established benchmarks show the applicability of the approach, even in cases where finite memory is not sufficient.
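For intuition, the guarantee that SPI methods of this kind target can be stated generically as follows (a sketch of the usual SPI criterion, not this paper's exact theorem): given an admissible performance loss ζ ≥ 0 and a confidence parameter δ, the improved policy π_I must satisfy

    \Pr\big( V(\pi_I) \ge V(\pi_b) - \zeta \big) \ge 1 - \delta,

where π_b is the behavior policy and V denotes the value (expected return) in the true, unknown environment.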
Ruben Turkenburg (Radboud University, NL)
Time: 12:15
Date: 17.01.2023
Room: HG02.052
Title: Preservation and Reflection of Bisimilarity via Invertible Steps
Abstract: In the theory of coalgebras, distributive laws give a general perspective on determinisation and other automata constructions. This perspective has recently been extended to include so-called weak distributive laws, covering several constructions on state-based systems that are not captured by regular distributive laws, such as the construction of a belief-state transformer from a probabilistic automaton, and ultrafilter extensions of Kripke frames. In this talk, we focus on the effect of such constructions on the behaviour of the involved systems, which we study here using bisimilarity on coalgebras. We start by recalling the constructions above, before showing how these (and other) examples fit into the setting of what we call an invertible step, a more general notion than weak distributive laws. Our main result is then that part of the construction induced by such an invertible step preserves and reflects bisimilarity. This covers results previously shown by hand for the instances of ultrafilter extensions and belief-state transformers.
Niels van der Weide and Deivid Vale (Radboud University, NL)
Time: 12:15
Date: 20.12.2022
Room: HFML0220
Title: Certifying Higher-Order Polynomial Interpretations
Abstract: Higher-order rewriting is a framework in which one can write down higher-order programs and study properties about them. Among those properties is termination, which says that no matter how you execute your program, it always gives a result. Several tools have been developed to check whether higher-order rewriting systems are actually terminating. However, developing such tools is difficult and can be error-prone. In this paper, we present a way of certifying termination proofs of higher-order rewriting systems. We formalize a specific method, namely the polynomial method, that is used to prove termination. In addition, we give a program that turns the output of Wanda, a tool to prove the termination of higher-order rewriting systems, into a Coq script, so that we can check whether the output is a valid proof of termination.
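For intuition, here is the classic first-order instance of the polynomial method (the paper handles the higher-order case); consider the rewrite rules for addition:

    \mathsf{add}(0, y) \to y, \qquad \mathsf{add}(\mathsf{s}(x), y) \to \mathsf{s}(\mathsf{add}(x, y)).

Interpreting [\![0]\!] = 0, [\![\mathsf{s}]\!](x) = x + 1 and [\![\mathsf{add}]\!](x, y) = 2x + y + 1 over the natural numbers makes every left-hand side strictly larger than the corresponding right-hand side (e.g. 2x + y + 3 > 2x + y + 2 for the second rule), so every rewrite step strictly decreases a natural number and the system terminates. Certifying such a proof amounts to checking these inequalities in Coq.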
Ike Mulder (Radboud University, NL)
Time: 12:15
Date: 13.12.2022
Room: HFML0220
Title: Diaframe: Proof Automation for the Verification of Concurrent Programs
Abstract: Concurrent programs play an increasingly important role in modern systems, yet are notoriously hard to verify. Existing verification tools either provide machine-checked proofs, or good automation, but not both. We present Diaframe, a proof automation library for the Iris concurrent separation logic framework in the Coq proof assistant. A benchmark of 24 examples shows that Diaframe's automation is competitive with existing automated tools, yet provides stronger guarantees. In this talk, I will start with a short introduction to separation logic. I will try to motivate its use for program verification, and show the challenges that arise in the verification of concurrent programs. Finally, I will give some intuition for how Diaframe overcomes these problems.
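For readers new to the logic: the central connective of separation logic is the separating conjunction P * Q (the heap splits into disjoint parts satisfying P and Q), and locality of reasoning comes from the frame rule

    \dfrac{\{P\}\; c\; \{Q\}}{\{P * R\}\; c\; \{Q * R\}}

so a verified specification of a command c remains valid when disjoint resources R, for instance heap cells owned by other threads, are around. This is a standard rule, not Diaframe-specific; automating proof search in this setting is exactly the problem Diaframe addresses.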
Marck van der Vegt (Radboud University, NL)
Time: 12:15
Date: 29.11.2022
Room: HFML0220
Title: Robust Almost-Sure Reachability in Multi-Environment MDPs
Abstract: This talk is about a paper I have been working on with Sebastian Junges and Nils Jansen. The paper's abstract is below: Multiple-environment MDPs (MEMDPs) capture finite sets of MDPs that share the states but differ in the transition dynamics. These models form a proper subclass of partially observable MDPs (POMDPs). We consider the synthesis of policies that robustly satisfy an almost-sure reachability property in MEMDPs, that is, *one* policy that satisfies a property *for all* environments. For POMDPs, deciding the existence of robust policies is an EXPTIME-complete problem. In this paper, we show that this problem is PSPACE-complete for MEMDPs, while the policies in general require exponential memory. We exploit the theoretical results to develop and implement an algorithm that shows promising results in synthesizing robust policies for a large set of benchmark environments.
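Spelled out in symbols (directly following the abstract's definitions): a MEMDP is a finite set of MDPs M_1, ..., M_k over the same state and action spaces that differ only in their transition functions, and the synthesis problem asks whether

    \exists \pi.\ \forall i \in \{1, \dots, k\}:\ \Pr_{M_i}^{\pi}(\lozenge T) = 1,

i.e., whether a single policy π reaches the target set T almost surely in every environment.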
Herman Geuvers (Radboud University, NL)
Time: 12:15
Date: 22.11.2022
Room: HFML0220
Title: "Apartness and Hennessy-Milner logic" (joint work with Anton Golov)
Abstract: Apartness is the opposite (dual) of bisimulation. Intuitively, two states in a system are apart if there is a positive way to distinguish them. Apartness is an inductive notion, so we have a deduction system for proving that two states are apart, and if we cannot prove they are apart, they are bisimilar. This works for various notions of bisimilarity, especially for those where the systems can be described as co-algebras. So apartness gives an inductive view on the co-inductive notion of bisimulation.
There is also a logical view on bisimulation: two states are bisimilar if-and-only-if they satisfy the same formulas of Hennessy-Milner logic (HML), where the precise syntax for HML formulas and the notion of satisfaction depends on the type of bisimulation one wants to talk about.
In the talk we will focus on this "if-and-only-if" in its dual form, using the inductive nature of apartness. We will prove (directly, without referring to bisimulation):
Two states are apart if-and-only-if there is a Hennessy-Milner logic formula that distinguishes them.
The "only if" is proved by constructing the HML formula by induction on the proof that two states are apart. The "if" part is proven by induction on the HML formula. We will discuss this for Labelled Transition Systems (LTS) with strong bisimulation, weak bisimulation and branching bisimulation. For branching bisimulation, the proof is remarkably tricky and we propose a slightly different variant of HML, PHMLU, (Positive Hennessy-Milner Logic with Until) for branching bisimulation that simplifies the proof. We believe that PHMLU has merit of its own, as being the "natural logic" for branching bisimulation/apartness.
Patrick van Bommel (Radboud University, NL)
Time: 12:15
Date: 15.11.2022
Room: HFML0220
Title: Software correctness levels in automatic grading of software exercises
Abstract: In this talk, we will discuss the initial thoughts behind our project on automatic grading of software exercises. As this project has recently started, we will focus on initial ideas. We discuss basic examples and some underlying questions. It is our intention to organize the automatic grading around the notion of correctness levels. Furthermore, we intend to have a language-specific part and a language-independent part. These concepts will be roughly explained, and the next steps in the project will be identified.
Marc Hermes (Radboud University, NL)
Time: 12:15
Date: 01.11.2022
Room: HFML0220
Title: Computability without Tears
Abstract: Computability Theory usually starts with a definition - say, of Turing machines - to precisely determine the notion of computable functions. Since procedures then need to be represented by explicit Turing machines, which tends to be cumbersome, many presentations resort to informal arguments when it comes to their constructions. While this is perfectly fine on paper, it turns into a challenge when attempting to mechanize and verify computability theory in a proof assistant. In this talk I want to present an approach to Computability Theory coined "Synthetic Computability", whose foundation was laid by Richman's 1983 paper "Church's Thesis without Tears". The approach allows for a model-free treatment of textbook computability results [1,2], and is well suited for mechanization and as a framework for studying undecidability results [3,4]. Lastly, I will speak about my MA thesis, which was situated in this line of research.
[1] Bauer 2006, First steps in synthetic computability theory
[2] Bauer 2017, On fixed-point theorems in synthetic computability theory
[3] Forster, Larchey-Wendling 2019, A constructive Coq library for the mechanization of undecidability
[4] Forster et al. 2020, A Coq Library of Undecidable Problems
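To convey the model-free flavour: in the synthetic setting, decidability of a predicate is defined without any machine model, using only functions of the ambient constructive theory, along the lines of

    p : X \to \mathrm{Prop} \text{ is decidable} \;:\Leftrightarrow\; \exists f : X \to \mathbb{B}.\ \forall x.\ (p\,x \leftrightarrow f\,x = \mathsf{true}).

Since in a constructive foundation all definable functions are computable, no explicit Turing machine has to be constructed; this is the standard definition used in the works cited above.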
Dennis Gross (Radboud University, NL)
Time: 12:15
Date: 25.10.2022
Room: HG02.052
Title: COOL-MC: A Comprehensive Tool for Reinforcement Learning and Model Checking
Abstract: This paper presents COOL-MC, a tool that integrates state-of-the-art reinforcement learning (RL) and model checking. Specifically, the tool builds upon the OpenAI Gym and the probabilistic model checker Storm. COOL-MC provides the following features: (1) a simulator to train RL policies in the OpenAI Gym for Markov decision processes (MDPs) that are defined as input for Storm, (2) a new model builder for Storm, which uses callback functions to verify (neural network) RL policies, (3) formal abstractions that relate models and policies specified in OpenAI Gym or Storm, and (4) algorithms to obtain bounds on the performance of so-called permissive policies. We describe the components and architecture of COOL-MC and demonstrate its features on multiple benchmark environments.
Jules Jacobs (Radboud University, NL)
Time: 12:15
Date: 25.10.2022
Room: HG02.052
Title: Fast Coalgebraic Bisimilarity Minimization
Abstract: (This is joint work with Thorsten Wissmann) Coalgebraic partition refinement generalizes classical automaton minimization to a general class of system types whose transition structure is specified by a functor, subsuming strong, weighted, and probabilistic bisimilarity. There are two existing algorithms for this problem: one algorithm is asymptotically efficient but applies to a restricted class of functors, and the other algorithm applies to all functors but is asymptotically inefficient. We present an algorithm that applies to all functors, and is asymptotically between the two existing algorithms. We have implemented our algorithm and show that it uses less time and memory than the existing algorithms in practice.
Carsten Trinitis (TU Munich, DE)
Time: 12:15
Date: 21.10.2022
Room: TBA
Title: HPC related activities at TU Munich (Heilbronn) - an Overview
Abstract: The talk will give an overview of High Performance Computing (HPC) related activities at TU Munich. Special emphasis will be put on activities at TUM's new Information Engineering site in Heilbronn, where the first pilot projects have been initiated. Topics will span Embedded Systems and Computer Architecture, HW/SW co-design, Quantum Computing, as well as compute-intensive applications. For the latter, a project on electrostatic field simulation will briefly be introduced. This project is carried out as a pilot project at TUM Heilbronn and aims at enabling sophisticated three-dimensional CAD design of high-voltage components. Using entirely non-commercial open-source software, it can also be used as a practical compute-intensive application in classes.
Ahmadreza Marandi (Eindhoven University of Technology, NL)
Time: 12:15
Date: 20.09.2022
Room: HG02.052
Title: Robust location-transportation problems with integer-valued demand
Abstract: A Location-Transportation (LT) problem concerns designing a company’s distribution network consisting of one central warehouse with ample stock and multiple local warehouses for a long but finite time horizon. The network is designed to satisfy the demands of geographically dispersed customers for multiple items within given delivery time targets. The company needs to decide on the locations of local warehouses and their basestock levels while considering the optimal shipment policies from central or local warehouses to customers. In this talk, we deal with integer uncertain demands in LT problems to design a robust distribution network. We prove two main characteristics of our LT problems, namely convexity and nondecreasingness of the optimal shipment cost function. Using these characteristics, we show for two commonly used uncertainty sets (box and budget uncertainty sets) that the optimal decisions on the location and the basestock levels of local warehouses can be made by solving a polynomial number of deterministic problems. For a general uncertainty set, we propose a new method, called Simplex-type method, to find a locally robust solution. The numerical experiments show the superiority of our method over using the integer-valued affine decision rules, which is the only available method for this class of problems.
This talk is based on this paper: Marandi, A., & van Houtum, G. J. (working paper). Robust location-transportation problems with integer-valued demand. Optimization Online.
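Schematically, problems of this class are two-stage robust programs (a generic sketch, not the paper's exact model): the location and basestock decisions x are taken before the demand is revealed, and the shipments afterwards, giving

    \min_{x \in X}\ c^{\top} x + \max_{d \in D}\ Q(x, d),

where D is the (box or budget) uncertainty set of integer demand vectors and Q(x, d) is the optimal shipment cost; the convexity and nondecreasingness results of the talk concern this inner function Q.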
Patrick Wienhöft (TU Dresden, DE)
Time: 12:15
Date: 20.09.2022
Room: HG02.052
Title: Improving sample efficiency in Statistical Model Checking
Abstract: We consider the model checking problem for partially unknown Markov Decision Processes and Stochastic Games with probably approximately correct guarantees on the result. Specifically, we focus on gray-box systems for which the topology of the underlying graph is known, but the prior knowledge about the transition probabilities may be incomplete. While recently developed statistical model checking algorithms provide the first tractable approaches to tackle this problem, they still leave room for improvement. We focus on (i) replacing conservative bounds obtained from Hoeffding's inequality by exact error probabilities, and (ii) distributing error tolerances non-uniformly rather than assigning the same error tolerance to every transition. We compare our methods to existing approaches and find that our method performs best in all examples.
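For reference, the conservative bound mentioned in (i) is Hoeffding's inequality: after observing a transition N times, the empirical estimate p̂ of its true probability p satisfies

    \Pr\big(|\hat{p} - p| \ge \varepsilon\big) \le 2\, e^{-2 N \varepsilon^{2}},

and replacing this generic bound by exact error probabilities lets the same confidence be reached with fewer samples.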
Dario Stein (iHub)
Time: 12:15
Date: 30.08.2022
Room: HG02.052
Title: Tutorial on probabilistic programming
Abstract: Probabilistic programming is a powerful new paradigm which ranges from highly optimized statistics software like Stan to general-purpose, Turing-complete languages like Gen. In either case, the idea is to unify generative modelling and Bayesian inference as first-class primitives in the same language. This lets probabilistic programs express complex statistical models in a highly modular and accessible way.
My goal is to give a hands-on introduction to probabilistic programming using the language WebPPL, which will enable YOU to express and solve probabilistic questions. For this I assume no prior knowledge of statistics. A more technical discussion on implementation and inference algorithms may follow.
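To give a flavour of the paradigm, here is a minimal sketch of the same idea in Python (WebPPL itself is a JavaScript dialect, and its built-in Infer is far more sophisticated than the rejection sampler below):

    import random
    from collections import Counter

    def model():
        # generative part: two fair coin flips
        a = random.random() < 0.5
        b = random.random() < 0.5
        # conditioning: we observed that at least one flip was heads
        if not (a or b):
            return None  # reject samples inconsistent with the observation
        return a         # query: the posterior of the first flip

    def infer(model, samples=100_000):
        # rejection sampling: keep only runs that satisfy the condition
        kept = [r for r in (model() for _ in range(samples)) if r is not None]
        return {v: c / len(kept) for v, c in Counter(kept).items()}

    print(infer(model))  # approximately {True: 2/3, False: 1/3}

The point is the separation of concerns: the model only says what is generated and what is observed; the inference procedure is generic and reusable.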
Mairieli Wessel (Radboud University, NL)
Time: 12:15
Date: 23.08.2022
Room: HG02.052
Title: Bots in software engineering: The good, the bad, the promising
Abstract: Bots are applications that react to external stimuli such as events triggered by tools or messages posted by users and run automated tasks in response, working as an interface between users and services. Bots often include conversational capabilities to interact with end-users through textual messages (in chatbots) or speech (in voicebots) in the same communication channels as their human counterparts. We are witnessing a massive adoption of bots in various domains, including e-commerce, customer service, and education. Software development is no exception. Given the essential complexity of software projects and the large community of people around them (stakeholders, designers, developers and, let's not forget, end-users), there are plenty of opportunities for bots to jump in and tame this complexity by (semi)automating repetitive tasks. This talk discusses the current role of bots in software engineering: benefits (the good), shortcomings (the bad), and the long road ahead (the promising).
Diego Damasceno (Radboud University, NL)
Time: 12:15
Date: 16.08.2022
Room: HG02.052
Title: Model-Driven Optimization: Generating Smart Mutation Operators for Multi-Objective Problems
Abstract: In search-based software engineering (SBSE), the choice of search operators can significantly impact the quality of the obtained solutions and the efficiency of the search. Recent work in the context of combining SBSE with model-driven engineering has investigated the idea of automatically generating smart search operators for the case at hand. While showing improvements, this previous work focused on single-objective optimization, a restriction that prohibits a broader use for many SBSE scenarios. Furthermore, since it did not allow users to customize the generation, it could miss out on useful domain knowledge that may further improve the quality of the generated operators. To address these issues, we propose a customizable framework for generating mutation operators for multi-objective problems. It generates mutation operators in the form of model transformations that can modify solutions represented as instances of the given problem meta-model. To this end, we extend an existing framework to support multi-objective problems as well as customization based on domain knowledge, including the capability to specify manual "baseline" operators that are refined during the operator generation. Our evaluation based on the Next Release Problem shows that the automated generation of mutation operators and user-provided domain knowledge can improve the performance of the search without sacrificing the overall result quality.
Preprint available at <https://damascenodiego.github.io/assets/pdf/seaa2022_nielsvanharten.pdf>
Thom Badings (Radboud University, NL)
Time: 12:15
Date: 04.08.2022
Room: HG02.052
Title: Sampling-Based Verification of CTMCs with Uncertain Rates
Abstract: We employ uncertain parametric CTMCs with parametric transition rates and a prior on the parameter values. The prior encodes uncertainty about the actual transition rates, while the parameters allow dependencies between transition rates. Sampling the parameter values from the prior distribution then yields a standard CTMC, for which we may compute relevant reachability probabilities. We provide a principled solution, based on a technique called scenario optimization, to the following problem: From a finite set of parameter samples and a user-specified confidence level, compute prediction regions on the reachability probabilities. The prediction regions should (with high probability) contain the reachability probabilities of a CTMC induced by any additional sample. To boost the scalability of the approach, we employ standard abstraction techniques and adapt our methodology to support approximate reachability probabilities. Experiments with various well-known benchmarks show the applicability of the approach.
Pierre Goutagny and Malo Jaffré (UDL, FR)
Time: 12:15
Date: 13.07.2022
Room: HFML0220
Title: Internship presentations
Abstract: (See Invite)
Frans A. Oliehoek (TU Delft, NL)
Time: 12:15
Date: 28.06.2022
Room: HG02.028
Title: Sequential decision making from the perspective of influence-based abstraction
Abstract: Reinforcement learning (RL) and more generally sequential decision making deal with problems where the decision maker ('agent') needs to take actions over time. While impressive results have been achieved on challenging domains like Atari, Go, and StarCraft, most of this work relies on neural networks to form their own internal abstractions. However, in many cases, we may be able to exploit some knowledge about the domains to guide this process.
Therefore, in this talk, I will present a more analytical approach towards abstraction. Specifically, I will cover some of the results of the ERC starting grant project 'INFLUENCE', which has further developed the framework of 'influence-based abstraction'. The idea is to describe a complex system from the perspective of a local decision problem. For instance, when trying to optimize the traffic flow at an intersection in a large city, it makes intuitive sense that we might be able to perform some local optimization, as long as we take into account how this problem is influenced over time. In my talk, I will give a formal definition of such 'influence' and discuss different ways in which we have leveraged this perspective in recent years.
Frits Vaandrager (Radboud University, NL)
Time: 12:15
Date: 21.06.2022
Room: HG02.052
Title: Active Automata Learning Using L#
Abstract: We present L#, a new and simple approach to active automata learning. Instead of focusing on equivalence of observations, like the L∗ algorithm and its descendants, L# takes a different perspective: it tries to establish apartness, a constructive form of inequality. L# does not require auxiliary notions such as observation tables or discrimination trees, but operates directly on tree-shaped automata. L# has the same asymptotic query and symbol complexities as the best existing learning algorithms, but we show that adaptive distinguishing sequences can be naturally integrated to boost the performance of L# in practice. Experiments with a prototype implementation, written in Rust, suggest that L# is competitive with existing algorithms.
Cristian Daniele (DS) (Radboud University, NL)
Time: 12:15
Date: 14.06.2022
Room: HG02.028
Title: Stateful Fuzzing: Survey and Research Directions
Abstract: Fuzzing is a great way to find security bugs in applications. Unfortunately, stateful applications are a challenge for most fuzzers, as fuzzers tend not to reach deeper states. The scientific community has responded to this challenge and developed several fuzzers specifically aimed at stateful applications in recent years, but a clear understanding of the variety of underlying strategies these fuzzers use is missing.
Marnix Suilen (Radboud University, NL)
Time: 12:15
Date: 31.05.2022
Room: HG03.085
Title: Robust Anytime Learning of Markov Decision Processes
Abstract: Markov decision processes (MDPs) are formal models commonly used in sequential decision-making. MDPs capture the stochasticity that may arise, for instance, from imprecise actuators via probabilities in the transition function. However, in data-driven applications, deriving precise probabilities from (limited) data introduces statistical errors that may lead to unexpected or undesirable outcomes. Uncertain MDPs (uMDPs) do not require precise probabilities but instead use so-called uncertainty sets in the transitions, accounting for such limited data. Tools from the formal verification community efficiently compute robust policies that provably adhere to formal specifications, like safety constraints, under the worst-case instance in the uncertainty set. We continuously learn the transition probabilities of an MDP in a robust anytime-learning approach that combines a dedicated Bayesian inference scheme with the computation of robust policies. In particular, our method (1) approximates probabilities as intervals, (2) adapts to new data that may be inconsistent with an intermediate model, and (3) may be stopped at any time to compute a robust policy on the uMDP that faithfully captures the data so far. We show the effectiveness of our approach and compare it to robust policies computed on uMDPs learned by the UCRL2 reinforcement learning algorithm in an experimental evaluation on several benchmarks.
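In the interval case, the uncertainty sets and the robust objective look roughly as follows (a generic robust-MDP sketch, not the paper's exact formulation):

    \mathcal{P}(s, a, s') = \big[\underline{p}(s, a, s'),\ \overline{p}(s, a, s')\big], \qquad \max_{\pi} \min_{P \in \mathcal{P}}\ \mathbb{E}_{P}^{\pi}\Big[\sum\nolimits_{t} \gamma^{t} R(s_t, a_t)\Big],

i.e., a policy is evaluated against the worst-case transition function consistent with the learned intervals.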
Bas van den Heuvel (University of Groningen, NL)
Time: 12:15
Date: 24.05.2022
Room: HG02.028
Title: Deadlock-freedom for Asynchronous Communication in Cyclic Process Networks
Abstract: Establishing the deadlock-freedom property for message-passing processes is an important problem. In this talk, I present recent work on a static verification technique based on session types for concurrent processes. We address the challenging case of processes that communicate asynchronously and form cyclic process networks. I will present APCP, a typed process framework for specifying cyclic process networks that are guaranteed to be deadlock-free. I will discuss the main decisions involved in the design of APCP and its fundamental results. If time allows, I will briefly present two applications: an operationally correct translation of a concurrent λ-calculus with sessions to APCP, and an analysis of deadlock-freedom and protocol conformance for decentralized implementations of multiparty session types.
Thiago D. Simão (Radboud University, NL)
Time: 12:15
Date: 17.05.2022
Room: online
Title: Training and Transferring Safe Policies in Reinforcement Learning
Abstract: Safety is critical to broadening the application of reinforcement learning (RL). Often, RL agents are trained in a controlled environment, such as a laboratory, before being deployed in the real world. However, the target reward might be unknown prior to deployment. Reward-free RL addresses this problem by training an agent without the reward to adapt quickly once the reward is revealed. We consider the constrained reward-free setting, where an agent (the guide) learns to explore safely without the reward signal. This agent is trained in a controlled environment, which allows unsafe interactions and still provides the safety signal. After the target task is revealed, safety violations are not allowed anymore. Thus, the guide is leveraged to compose a safe sampling policy. Drawing from transfer learning, we also regularize a target policy (the student) towards the guide while the student is unreliable and gradually eliminate the influence from the guide as training progresses. The empirical analysis shows that this method can achieve safe transfer learning and helps the student solve the target task faster.
Thijs Heijligenberg (Radboud University, NL)
Time: 10:30
Date: 03.03.2022
Room: online
Title: Testing mobile networks based on state machines
Abstract: Mobile networks, especially the latest generation 5G, form incredibly complex systems with critical security implications. Testing mobile networks is hard in practice as there is no machine-readable reference or base implementation, while the system contains many moving parts making it hard to distinguish exact behaviour. We discuss existing approaches to this problem and how we plan to provide a state-machine based testing framework.
Jules Jacobs (Radboud University, NL)
Time: 10:30
Date: 13.01.2022
Room: online
Title: Connectivity Graphs: A Method for Proving Deadlock Freedom Based on Separation Logic
Abstract: We introduce the notion of a connectivity graph—an abstract representation of the topology of concurrently interacting entities, which allows us to encapsulate generic principles of reasoning about deadlock freedom. Connectivity graphs are parametric in their vertices (representing entities like threads and channels) and their edges (representing references between entities) with labels (representing interaction protocols). We prove deadlock and memory leak freedom in the style of progress and preservation and use separation logic as a meta theoretic tool to treat connectivity graph edges and labels substructurally. To prove preservation locally, we distill generic separation logic rules for local graph transformations that preserve acyclicity of the connectivity graph. To prove global progress locally, we introduce a waiting induction principle for acyclic connectivity graphs. We mechanize our results in Coq, and instantiate our method with a higher-order binary session-typed language to obtain the first mechanized proof of deadlock and leak freedom.
Alex Keizer (University of Amsterdam (ILLC), NL)
Time: 10:30
Date: 25.11.2021
Room: ONLINE
Title: Session Coalgebras: A Coalgebraic View on Session Types and Communication Protocols
Abstract: For concurrent and communicating systems, compositional techniques based on behavioural type systems have received much attention. By abstracting communication protocols as types, these type systems can statically check that programs interact with channels according to a certain protocol, e.g., that the intended messages are exchanged in the intended order. We will put on our coalgebraic spectacles to investigate session types, a widely studied class of behavioural type systems. We provide a syntax-free description of session-based concurrency as states of coalgebras. As a result, we rediscover type equivalence, duality, and subtyping relations in terms of canonical coinductive presentations. In turn, this coinductive presentation makes it possible to elegantly derive a decidable type system with subtyping for pi-calculus processes, in which the states of a coalgebra will serve as channel protocols.
Thorsten Wißmann (Radboud University, NL)
Time: 10:30
Date: 18.11.2021
Room: HG00.308
Title: Supported Sets – Where Register Automata Live
Abstract: This talk introduces the category of supported sets. They provide a uniform foundation for different kinds of nominal sets, such as those for equality, renaming, and order symmetry. We show that the differently flavoured categories of nominal sets are monadic over supported sets. Supported sets embody the basic principles of register automata and thus constitute a canonical environment for existing set-theoretic definitions of register automata.
Roman Andriushchenko (Brno University of Technology, CZE)
Time: 10:30
Date: 11.11.2021
Room: HG00.308
Title: PAYNT: a Tool for Inductive Synthesis of Probabilistic Programs
Abstract: This tutorial presents PAYNT, a tool for the automatic synthesis of probabilistic programs. The starting point is a program sketch: an incomplete description of a probabilistic program. Such a sketch represents a finite family of program candidates -- Markov chains with related but distinct topologies. The specification is formulated as a conjunction of temporal logic constraints and can include an optimizing objective. PAYNT employs state-of-the-art oracle-guided methods to effectively reason about all possible candidates and synthesize programs that meet the provided specification. In this tutorial, we will cover the basics of probabilistic sketching and demonstrate the usefulness of PAYNT using several case studies from different application domains. Finally, we will present modern oracle-guided synthesis methods at the heart of PAYNT.
Todd Schmid (University College London, UK)
Time: 10:30
Date: 04.11.2021
Room: HG00.086
Title: Coalgebraic Completeness Theorems
Abstract: In 1984, Robin Milner published a complete axiomatisation of bisimulation for a certain algebra of regular behaviours. His completeness proof is essentially an adaptation of Salomaa's proof of completeness for Kleene Algebra, but in the remark following the proof he expresses his surprise that such an adaptation could be made. I will give an alternative proof of completeness using a general approach, of which Salomaa's completeness proof is an instance, and instantiate the approach in a few other contexts.
Dusko Pavlovic (University of Hawaii, USA)
Time: 11:00
Date: 18.10.2021
Room: online
Title: Geometry of computation and string-diagram programming in monoidal computer
Abstract: A monoidal computer is a monoidal category with a distinguished type carrying the structure that makes it into a single-instruction programming language. The instruction would be written as "run", but it is usually drawn. Equivalently, the monoidal computer structure can be viewed as a typed lambda-calculus without the lambda abstraction operation, even implicit. In contrast with combinatory algebras and nonextensional lambda calculi, a monoidal computer may not contain any extensional retracts, and allows direct representations of the intensional models of computation, like Turing machines and partial recursive functions.
A monoidal computer can thus be added as yet another item to the Church-Turing list of models of computation. It differs from other models by its categoricity. While the other Church-Turing models can be programmed to simulate each other in many non-isomorphic ways, and each interprets even itself in many non-isomorphic ways, the structure of a monoidal computer is unique up to isomorphism. A monoidal category can be a monoidal computer in at most one way, just like it can be closed in at most one way, up to isomorphism. In other words, being a monoidal computer is a property, not a structure. This displays computability as a categorical property, like the completeness of an order. This opens an avenue towards an abstract treatment of parametrized complexity, one-way and trapdoor functions on the one hand, and of algorithmic learning on the other.
Marnix Suilen (Radboud University, NL)
Time: 10:30
Date: 14.10.2021
Room: online
Title: Anytime Learning and Verification of Uncertain Markov Decision Processes
Abstract: Markov decision processes (MDPs) are formal models commonly used in sequential decision-making. MDPs capture uncertainties that may arise, for instance, from sensor imprecision or unpredictable human behavior by precise probabilities in the transition function. However, precise probabilities may not accurately reflect uncertainties that are based on data. Consequently, uncertain MDPs (uMDPs) exhibit so-called uncertainty sets in the transitions. Robust verification for uMDPs refers to determining a decision-making policy that provably adheres to formal specifications, for instance safety constraints. While verification for uMDPs is well-studied, learning uncertainty sets is a more open problem. We propose an anytime learning approach that combines a dedicated Bayesian inference scheme with robust verification of uMDPs that (1) over-approximates probabilities as intervals, (2) adapts to new data that may be inconsistent with an intermediate model, and (3) may be stopped at any time to perform robust verification on the uMDP while faithfully capturing the data so far. We show the effectiveness of our approach and compare it with precise probability estimates in an experimental evaluation on several standard benchmarks.
Diego Nascimento Damasceno (Radboud University, NL)
Time: 10:30
Date: 07.10.2021
Room: online
Title: Quality Guidelines for Research Artifacts in Model-Driven Engineering
Abstract: Sharing research artifacts is known to help people to build upon existing knowledge, adopt novel contributions in practice, and increase the chances of papers receiving attention. In Model-Driven Engineering (MDE), openly providing research artifacts plays a key role, even more so as the community targets a broader use of AI techniques, which can only become feasible if large open datasets and confidence measures for their quality are available. However, the current lack of common discipline-specific guidelines for research data sharing opens the opportunity for misunderstandings about the true potential of research artifacts and subjective expectations regarding artifact quality. To address this issue, we introduce a set of guidelines for artifact sharing specifically tailored to MDE research. To design this guidelines set, we systematically analyzed general-purpose artifact sharing practices of major computer science venues and tailored them to the MDE domain. Subsequently, we conducted an online survey with 90 researchers and practitioners with expertise in MDE. We investigated our participants' experiences in developing and sharing artifacts in MDE research and the challenges encountered while doing so. We then asked them to prioritize each of our guidelines as essential, desirable, or unnecessary. Finally, we asked them to evaluate our guidelines with respect to clarity, completeness, and relevance. In each of these dimensions, our guidelines were assessed positively by more than 92% of the participants. To foster the reproducibility and reusability of our results, we make the full set of generated artifacts available in an open repository at https://mdeartifacts.github.io/.
Reinier Joosse (Radboud University, NL)
Time: 10:30
Date: 01.09.2021
Room: online
Title: Evaluating Adversarial Attack Detectors using Formal Verification Methods
Abstract: Neural networks are known to be vulnerable to so-called "adversarial attacks": inputs that are deliberately slightly modified to make the networks misclassify them. Several defenses against these attacks exist. It is hard, however, to know if these defenses are effective against unseen types of attacks. One can test whether an attack detector detects old attacks, but an attacker can always come up with attacks that the detector was not trained against. In my thesis, I proposed two new metrics that estimate the quality of attack detectors independently of which attack method is used. The new metrics were shown to be computable using SMT solving techniques. The usefulness of the new metrics was demonstrated in experiments where their results were compared to the results of traditional test metrics.
Guillermo A. Pérez (University of Antwerp, BE)
Time: 10:30
Date: 21.07.2021
Room: online
Title: Continuous One-Counter Automata
Abstract: We study the reachability problem for continuous one-counter automata, COCA for short. In such automata, transitions are guarded by upper and lower bound tests against the counter value. Additionally, the counter updates associated with taking transitions can be (non-deterministically) scaled down by a nonzero factor between zero and one. Our three main results are as follows: (1) We prove that the reachability problem for COCA with global upper and lower bound tests is in NC2; (2) that, in general, the problem is decidable in polynomial time; and (3) that it is decidable in the polynomial hierarchy for COCA with parametric counter updates and bound tests.
Henk Barendregt (Radboud University, NL)
Time: 16:15
Date: 16.06.2021
Room: online
Title: rePhD: motivation, 2020 hindsight, and the making of
Abstract: Exactly fifty years ago today, I earned my PhD on lambda calculus, with supervisors Dirk van Dalen and Georg Kreisel. A recent republication (December 2020) tells the story of the motivation, how it failed, and the new notions and results it nonetheless brought. In hindsight, some principal later results and applications are discussed. Finally, some of the couleur locale (time & space) around the development of the work will be presented. Some extensional term models of combinatory logics and lambda calculi, a 2020 republication: motivation, the making of & hindsight. Kindle Direct Publishing, December 2020.
Rick Erkens (Eindhoven University of Technology, NL)
Time: 10:30
Date: 02.06.2021
Room: online
Title: A Set Automaton to Locate All Pattern Matches in a Term
Abstract: Term pattern matching is the problem of finding all pattern matches in a subject term, given a set of patterns. It is a fundamental algorithmic problem in, for instance, automated theorem proving and term rewriting. We present a set automaton solution for the term pattern matching problem that is based on match set derivatives, where each function symbol in the subject term is visited exactly once in a top-down fashion. The algorithm allows for various traversal patterns over the subject term and is particularly suited to search the subject term in parallel using a large number of simultaneously running threads.
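For contrast, here is a naive reference implementation in Python (a hypothetical term representation: a term is a pair of a symbol and a list of arguments, variables in patterns are bare strings); unlike the set automaton, it re-examines every pattern at every subterm:

    def match(pattern, term, subst=None):
        # variables are bare strings; applications are (symbol, args) pairs
        subst = dict(subst or {})
        if isinstance(pattern, str):  # a variable: bind it consistently
            if pattern in subst and subst[pattern] != term:
                return None
            subst[pattern] = term
            return subst
        head, args = pattern
        if term[0] != head or len(term[1]) != len(args):
            return None
        for p, t in zip(args, term[1]):
            subst = match(p, t, subst)
            if subst is None:
                return None
        return subst

    def all_matches(patterns, term):
        # visit every subterm and try every pattern at it
        def walk(t, pos):
            for pat in patterns:
                s = match(pat, t)
                if s is not None:
                    yield pos, pat, s
            for i, child in enumerate(t[1]):
                yield from walk(child, pos + (i,))
        yield from walk(term, ())

    # all matches of g(x) and f(x, y) in the subject term f(g(a), a)
    term = ("f", [("g", [("a", [])]), ("a", [])])
    for hit in all_matches([("g", ["x"]), ("f", ["x", "y"])], term):
        print(hit)

This visits each subterm once per pattern; the set automaton instead precompiles the pattern set so that each function symbol of the subject is inspected exactly once.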
Thorsten Wissmann (Radboud University, NL)
Time: 10:30
Date: 26.05.2021
Room: online
Title: Generic and Efficient Partition Refinement
Abstract: I present a generic partition refinement algorithm that quotients coalgebraic systems by behavioural equivalence, an important task in system analysis and verification. Coalgebraic generality allows us to cover not only classical relational systems but also, e.g., various forms of weighted systems, and furthermore to flexibly combine existing system types. Under assumptions on the type functor that allow representing its coalgebras in terms of nodes and edges, the algorithm runs in time O(m log n) for input coalgebras with n states and m edges. The generic complexity result and the possibility of combining system types yield a toolbox for efficient partition refinement algorithms. Instances of our generic algorithm match the run-time of the best known algorithms for unlabelled transition systems, Markov chains, deterministic automata, Segala systems, and for color refinement. For weighted tree automata, the instance of the generic algorithm even improves on the best known run-time. The algorithm is implemented in a tool (CoPaR) in full generality and allows the composition of existing system types at run-time.
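As a point of reference, here is the classical relational instance in miniature: a naive partition refinement for plain DFAs in Python (roughly O(n·m) per pass; the talk's algorithm generalizes far beyond DFAs and achieves O(m log n)):

    def partition_refinement(states, alphabet, delta, accepting):
        # start from the two-block partition {accepting, non-accepting}
        block = {q: (q in accepting) for q in states}
        while True:
            # two states stay together iff they are in the same block and,
            # for every letter, their successors are in the same block
            sig = {q: (block[q],) + tuple(block[delta[q][a]] for a in alphabet)
                   for q in states}
            ids = {s: i for i, s in enumerate(sorted(set(sig.values())))}
            new_block = {q: ids[sig[q]] for q in states}
            if new_block == block:
                return block  # stable: blocks are the equivalence classes
            block = new_block

    # states 1 and 2 are bisimilar: both move to the accepting state 3 on 'a'
    delta = {0: {'a': 1}, 1: {'a': 3}, 2: {'a': 3}, 3: {'a': 3}}
    print(partition_refinement([0, 1, 2, 3], ['a'], delta, {3}))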
Sebastian Junges (UC Berkeley, USA)
Time: 16:30
Date: 12.05.2021
Room: online
Title: Runtime Monitoring for Markov Decision Processes
Abstract: An important concern in autonomous systems is to ensure their correctness at runtime. We investigate the problem of monitoring partially observable systems with nondeterministic and probabilistic dynamics. In such systems, every state may be associated with a risk, e.g., the probability of an imminent crash. During runtime, we obtain partial information about the system state in the form of observations. The monitor uses this information to estimate the risk of the (unobservable) current system state. Our results are threefold. First, we show that extensions of state estimation approaches for HMMs or Kripke structures do not scale due to the combination of nondeterminism and probabilities. While exploiting a geometric interpretation of the state estimates improves the practical runtime, this cannot prevent an exponential memory blowup. Second, we present a tractable algorithm based on model checking conditional reachability probabilities. Third, we provide prototypical implementations and manifest the applicability of our algorithms to a range of benchmarks. The results highlight the possibilities and boundaries of our novel algorithms. We employed these ideas in a simulation of aircraft landings that checks for runtime intrusions.
Dan Frumin (Groningen University, NL)
Time: 10:30
Date: 28.04.2021
Room: online
Title: On semantic proofs of cut elimination
Abstract: In this talk I will discuss a proof of cut elimination that goes via algebraic semantics. This proof is, arguably, more modular than a standard direct proof via proof tree inversion. We will see how this proof can be applied to intuitionistic logic and the logic of bunched implications.
Marc Schoolderman (Radboud University, NL)
Time: 10:30
Date: 21.04.2021
Room: online
Title: Efficient Verification of Optimized Code
Abstract: This talk presents the experiences of verifying a highly optimized library for performing public key cryptography on embedded devices. As a result of our effort, a somewhat peculiar bug was found and corrected.
More importantly, efforts like this often make painstaking use of dedicated tools that are inscrutable to non-experts. We think this is not really the case here: I will present how we did this work using off-the-shelf tools (the Why3 Verification Platform), relying purely on SMT solvers, re-using an earlier verification effort, and arriving at a tidy specification.
Mart Lubbers (Radboud University, NL)
Time: 10:30
Date: 07.04.2021
Room: online
Title: Functional Pearl: Deep Embedding with Class
Abstract: The two flavours of DSL embedding in a functional language are shallow and deep embedding. In shallow embedding, the language constructs are represented by functions and the semantics are embedded in these functions. Adding new semantics is therefore cumbersome, but adding constructs is a breeze. Upgrading the functions to typeclasses lifts this limitation to a certain extent. Deeply embedded languages represent their language constructs as data and the semantics over it as functions on this data. Seeing that the type of the construction is embedded in the semantics, adding new language constructs is laborious, but adding semantics is trouble free. This pearl shows that by abstracting the semantics functions in deep embedding to typeclasses, it is possible to easily add language constructs as well. So-called classy deep embedding results in DSLs that are extendible both in language constructs and in semantics while maintaining a concrete abstract syntax tree. Additionally, no type-level trickery or complicated boilerplating is required to achieve this.
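A rough illustration of the two flavours in Python (the pearl's actual mechanism is typeclasses in a typed functional language; this dynamic sketch only contrasts the embedding styles and needs Python 3.10+ for match):

    from dataclasses import dataclass

    # Shallow embedding: constructs are functions, the single semantics
    # (evaluation) is baked in -- new constructs easy, new semantics hard.
    def lit(n): return n
    def add(x, y): return x + y

    # Deep embedding: constructs are data, each semantics is a function
    # over that data -- new semantics easy, new constructs hard.
    @dataclass
    class Lit:
        n: int

    @dataclass
    class Add:
        x: object
        y: object

    def evaluate(e):
        match e:
            case Lit(n): return n
            case Add(x, y): return evaluate(x) + evaluate(y)

    def pretty(e):
        match e:
            case Lit(n): return str(n)
            case Add(x, y): return f"({pretty(x)} + {pretty(y)})"

    e = Add(Lit(1), Lit(2))
    print(add(lit(1), lit(2)), evaluate(e), pretty(e))  # 3 3 (1 + 2)

The pearl's contribution is getting the best of both within the deep style: by routing the semantics through typeclasses, new constructs can be added without touching existing code, which the sketch above cannot do.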
Martin Tappler (TU Graz, AT)
Time: 10:30
Date: 24.03.2021
Room: online
Title: Active Learning of Stochastic System Models
Abstract: Active automata learning enables model-based analyses for black-box systems that would otherwise be infeasible. While the field has progressed immensely in recent years, we still face various challenges especially in the presence of stochastic behaviour. In my talk, I will first discuss a technique for active automata learning from interactions with reactive, stochastic systems. The technique is based on Angluin's L* algorithm and samples system traces by testing the system under learning. It iteratively learns Markov decision processes that generalise the sampled traces.
After learning an MDP, we can finally analyse its properties with a probabilistic model-checker, but how useful are these analyses? We face the challenge that the learned model and consequently the derived model-checking results may be incorrect. This problem affects deterministic learning as well, but to a lesser extent, since correctness guarantees are easier to give for deterministic models. In the second part of my talk, I will discuss probabilistic black-box reachability checking. The technique applies automata learning and model checking to derive policies for bounded reachability properties of stochastic black-box systems. By evaluating the policies on the true system, the technique provides guarantees on reachability probabilities, although learned models may be inaccurate.
Bettina Könighofer (TU GRAZ, AT)
Time: 10:30
Date: 17.03.2021
Room: online
Title: Shields for Reinforcement Learning
Abstract: Reinforcement learning algorithms discover policies that maximize reward, but do not necessarily guarantee safety during learning or execution phases. In this talk, we discuss an approach to learn optimal policies while enforcing properties expressed in temporal logic. To this end, given a temporal logic specification, we synthesize a reactive system called a shield. The shield monitors the actions from the learner and corrects them only if the chosen action causes a violation of the specification. Besides safety issues, learned controllers have further limitations: (1) Learned controllers are monolithic, and it is not possible to add new features without retraining. (2) When facing unexpected behavior not seen during training, the performance of learned controllers might be very poor, or it can even result in complete system failure. We address these issues by formalizing deviations from optimal controller performance using quantitative run-time measurements and synthesize quantitative shields that ensure both optimal performance and safe behaviour.
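The core loop of shielding is simple to sketch (a minimal Python illustration; safe_actions stands in for the reactive system synthesized from the temporal-logic specification, and all names here are hypothetical):

    import random

    def shielded_step(state, learner_action, safe_actions):
        # pass the learner's action through unchanged whenever it is safe;
        # only on a would-be violation substitute a safe alternative
        allowed = safe_actions(state)
        if learner_action in allowed:
            return learner_action
        return random.choice(sorted(allowed))

    # toy example: on a corridor 0..4, cell 4 is a cliff, so moving
    # right in cell 3 would violate the safety specification
    def safe_actions(pos):
        return {"left", "right"} - ({"right"} if pos == 3 else set())

    print(shielded_step(3, "right", safe_actions))  # corrected to 'left'

By construction, every action that reaches the environment complies with the specification, while the shield interferes with the learner as little as possible.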
Thom Badings (Radboud University, NL)
Time: 10:30
Date: 10.03.2021
Room: online
Title: Balancing Wind and Batteries: Towards Predictive Verification of Smart Grids
Abstract: We study a smart grid with wind power and battery storage. Traditionally, day-ahead planning aims to balance demand and wind power, yet actual wind conditions often deviate from forecasts. Short-term flexibility in storage and generation fills potential gaps, planned on a time scale of minutes for 30-60 minute horizons. Finding the optimal flexibility deployment requires solving a semi-infinite non-convex stochastic program, which is generally intractable to do exactly. Previous approaches rely on sampling, yet such critical problems call for rigorous approaches with stronger guarantees. Our method employs probabilistic model checking techniques. First, we cast the problem as a continuous-space Markov decision process with discretized control, for which an optimal deployment strategy minimizes the expected grid frequency deviation. To mitigate state space explosion, we exploit specific structural properties of the model to implement an iterative exploration method that reuses pre-computed values as wind data is updated. Our experiments show the method's feasibility and versatility across grid configurations and time scales.
Thiago Dias Simao (TU Delft, NL)
Time: 10:30
Date: 17.02.2021
Room: online
Title: AlwaysSafe: Reinforcement Learning without Safety Constraint Violations during Training
Abstract: Deploying reinforcement learning (RL) involves major concerns around safety. Engineering a reward signal that allows the agent to maximize its performance while remaining safe is not trivial. Safe RL studies how to mitigate such problems. For instance, we can decouple the safety from the reward using constrained Markov decision processes (CMDPs), where an independent signal models the safety aspects. In this setting, an RL agent can autonomously find such tradeoffs between performance and safety. Unfortunately, most RL agents designed for CMDPs only guarantee safety after the learning phase, which might prevent their direct deployment. In this work, we investigate settings where a concise abstract model of the safety aspects is given, a reasonable assumption since a thorough understanding of safety-related matters is a prerequisite for deploying RL in typical applications. Factored CMDPs provide such compact models when a small subset of features describe the dynamics relevant for the safety constraints. We propose an RL algorithm that uses this abstract model to learn policies for CMDPs safely, that is, without violating the constraints. During the training process, this algorithm can seamlessly switch from a conservative policy to a greedy policy without violating the safety constraints. We prove that this algorithm is safe under the given assumptions. Empirically, we show that even if safety and reward signals are contradictory, this algorithm always operates safely and, when they are aligned, this approach also improves the agent's performance.
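The CMDP decoupling that the abstract refers to has the standard form: maximize reward subject to a budget d on expected cost,

    \max_{\pi}\ \mathbb{E}^{\pi}\Big[\sum\nolimits_{t} \gamma^{t} R(s_t, a_t)\Big] \quad \text{s.t.} \quad \mathbb{E}^{\pi}\Big[\sum\nolimits_{t} \gamma^{t} C(s_t, a_t)\Big] \le d,

where the cost signal C models the safety aspects independently of the reward R; the contribution here is keeping the constraint satisfied already during training.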
Diego Damasceno (Radboud University, NL)
Time: 10:30
Date: 10.02.2021
Room: online
Title: Learning by sampling: learning behavioral family models from software product lines
Abstract: Family-based behavioral analysis operates on a single specification artifact, referred to as family model, annotated with feature constraints to express behavioral variability in terms of conditional states and transitions. Family-based behavioral modeling paves the way for efficient model-based analysis of software product lines. Family-based behavioral model learning incorporates feature model analysis and model learning principles to efficiently unify product models into a family model and integrate new product-specific behavior into a behavioral family model. Albeit reasonably effective, the exhaustive analysis of product lines is often infeasible due to the potentially exponential number of valid configurations. In this paper, we first present a family-based behavioral model learning technique, called FFSM_Diff. Subsequently, we report on our experience of learning family models by employing product sampling. Using 105 products of six product lines expressed in terms of Mealy machines, we evaluate the precision of family models learned from products selected from different settings of the T-wise product sampling criterion. We show that product sampling can lead to models as precise as those learned by exhaustive analysis and hence reduce the costs of family model learning.
Jules Jacobs (Radboud University, NL)
Time: 10:30
Date: 03.02.2021
Room: online
Title: Proving deadlock freedom for session types with connectivity trees
Abstract: Concurrency constructs like locks and channels can result in deadlock, such that all threads are waiting for each other in order to make progress. Session types are a linear type system for channel protocols that ensures that participants use channels according to their protocol. Session types also ensure that deadlock cannot occur, even when channels themselves are sent as messages over another channel. We present progress toward a formal proof of this in Coq. The proof has three interesting features: (1) the language for which we prove deadlock freedom is a linear lambda calculus and the operational semantics is low-level and explicitly models buffers, (2) we use separation logic to reason about channel references at a high level, (3) the proof method is based on a generic notion of "connectivity tree", which abstracts away the details of channels and may hence generalise to a language with both channels and locks, and possibly other constructs.
Deivid Vale (Radboud University, NL)
Time: 10:30
Date: 27.01.2021
Room: online
Title: On Solving Nominal Equational Problems
Abstract: We define nominal equational problems as quantified formulae in conjunctive-normal-form built over equations and disequations. We give a general definition of a solution and a set of simplification rules to compute solutions in the nominal ground term-algebra. For the latter, we define notions of solved form from which solutions can be extracted and show that the simplification rules are sound, preserving, and complete. We also show that the simplification process terminates and provide an algorithm to solve nominal equational problems. These results generalize previous results obtained for first-order languages to languages with binding operators. In particular, we show that the problem of deciding the validity of a first-order equational formula in a language with binding operators, i.e., validity modulo alpha-equality, is decidable.
Kousar Aslam (Eindhoven University of Technology, NL)
Time: 10:30
Date: 20.01.2021
Room: online
Title: Conformance testing search strategies for active automata learning in industry: A case of Hybrid-ADS
Abstract: Active (automata) learning infers the behavioral models of software systems by iteratively formulating hypotheses for a system under learning, and testing whether the current hypothesis is equivalent to the system's behavior. Testing, usually implemented by a conformance testing technique (CTT), is computationally expensive, and known to be a bottleneck in the learning process. Considerable effort has been spent over the years to improve and develop new CTTs. Extending this research, we investigated the impact of using different search strategies for navigating the search space for a single CTT. An experimental evaluation of different search strategies for the state-of-the-art CTT, Hybrid-ADS developed by Smeenk et al., on 202 industrial software components is presented. Our results show that out of the eight strategies applied, the best strategy provided a 29% increase in the number of learned components as compared to the worst strategy. For several of the individual components, the difference between the best and worst strategy was even more pronounced, with the best strategy being able to learn the component, and the worst strategy failing to learn it, even after spending up to 1000 times more time. These results suggest that the strategy chosen for navigating through the search space is relevant for the performance of the active learning process.
Natalia Sidorova (Eindhoven University of Technology, NL)
Time: 10:30
Date: 13.01.2021
Room: online
Title: Mining models of weakly-structured processes
Abstract: Nowadays, state and event data are being logged in many processes of our life, but only a small portion of the data is being used to get insights into the processes that generated the data. One of the main difficulties in the analysis of this data is related to the nature of real-life processes: they are often utterly complex, involve multiple collaborating actors/resources, combine simple, clearly defined sub-processes with the “chaotic”, highly flexible ones. A natural way to address the complexity of a process is to decompose it. We focus on processes that can be modeled as communicating finite state machines. For each actor/resource involved in the process, we mine its process model as a state machine and then look into correlations between states of the mined state machines to highlight the interdependencies between them. The approach is illustrated with two case studies: a financial process and a process involving a smart product.
Herman Geuvers (Radboud University, NL)
Time: 10:30
Date: 06.01.2021
Room: online
Title: Relating Apartness and Bisimulation
Abstract: We have studied the dual of bisimulation: the notion of "apartness". Intuitively, two elements are apart if there is a positive way to distinguish them. Apartness is the dual of bisimilarity in a precise categorical sense: apartness is an initial algebra and gives rise to an induction principle. In the talk we will focus on the inductive nature of apartness, described by a least fixed point, which can therefore be defined via proof rules. In particular we look at weak forms of bisimulation on labelled transition systems, where silent (τ) steps are included. We define an apartness notion that corresponds to branching bisimulation. The proof rules for apartness can e.g. be used to show that two states of a labelled transition system are not branching bisimilar. To support the apartness view on labelled transition systems, we cast a number of well-known properties of branching bisimulation in terms of branching apartness and prove them. Finally, we indicate how a "distinguishing formula" can be derived from an apartness proof. (joint work with Bart Jacobs)
Dennis Groß (Radboud University, NL)
Time: 10:30
Date: 09.12.2020
Room: online
Title: A formal-methods toolchain to specify, learn, and verify RL agents
Abstract: Deep reinforcement learning is the combination of reinforcement learning (RL) and deep learning. This field of research has been able to solve a wide range of complex decision-making tasks that were previously out of reach for a machine. Most approaches toward reinforcement learning provide no guarantee about the safety of the learned controller or about the safety of actions taken during learning. The absence of safety guarantees becomes a crippling problem when reinforcement learning is applied to safety-critical environments where industry best practices demand evidence of safety, such as cars or planes. Storm is a probabilistic model checker for the analysis of systems involving random or probabilistic phenomena. Given an input model (e.g. through a PRISM file) and a quantitative specification, it can determine whether the input model conforms to the specification. It has been designed with performance and modularity in mind. In our research, we try to combine both worlds by extending the OpenAI Gym with an environment that uses Storm. Additionally, we encode the trained deep reinforcement learning agent into PRISM, which allows us to use well-known and effective probabilistic model checking techniques from Storm to verify the performance of the deep RL agent.
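As a rough illustration of the setup (our sketch, not the actual toolchain), a probabilistic model can be wrapped as a Gym-style environment; here a toy transition table stands in for a PRISM/Storm model.

```python
import random

# Toy stand-in for a PRISM/Storm model: a distribution over successor
# states for each (state, action) pair.
MODEL = {
    (0, "a"): [(1, 0.9), (2, 0.1)],
    (0, "b"): [(2, 1.0)],
    (1, "a"): [(1, 1.0)], (1, "b"): [(1, 1.0)],
    (2, "a"): [(2, 1.0)], (2, "b"): [(2, 1.0)],
}

class ModelBackedEnv:
    """Follows the classic OpenAI Gym interface: reset() and step()."""
    actions = ("a", "b")

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        successors = MODEL[(self.state, action)]
        states, probs = zip(*successors)
        self.state = random.choices(states, weights=probs)[0]
        reward = 1.0 if self.state == 1 else 0.0   # goal: reach state 1
        done = self.state == 2                     # state 2 is a trap
        return self.state, reward, done, {}
```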
Thom Badings (Radboud University, NL)
Time: 10:30
Date: 02.12.2020
Room: online
Title: Filter-based abstractions for planning in partially observable dynamical control systems
Abstract: We study planning problems for dynamical control systems with limited sensing and imperfect actuation. Due to the inherent state uncertainty, such systems are only partially observable. Planning for these systems generally involves formulating an optimization problem in the belief space, which is intractable to solve exactly in the continuous domain. One solution is to formulate the problem as a discrete-state partially observable Markov decision process (POMDP), but this still renders the problem computationally expensive. In this work, we propose an alternative filter-based stochastic abstraction for partially observable dynamical control systems. Our abstraction is based on state estimation using Kalman filtering. The resulting model is formulated as an uncertain Markov decision process (uMDP), for which mature tools are available to provide guarantees on the system’s behavior. In this talk, we demonstrate how the proposed abstraction can be used to compute the probability to satisfy reachability specifications for partially observable systems with measurement and process noise. Moreover, we discuss various extensions to the proposed abstraction, including the use of a so-called local information controller as a heuristic to eliminate the history-dependency of optimal policies that may arise in some cases.
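For reference, the abstraction builds on standard Kalman filtering; a textbook predict/update step (not the authors' code) for dynamics x' = Ax + Bu + w and measurements y = Cx + v, with noise covariances Q and R, looks as follows.

```python
import numpy as np

def kalman_step(mu, Sigma, u, y, A, B, C, Q, R):
    """One predict/update cycle of a Kalman filter (textbook form)."""
    # Predict: propagate the belief through the dynamics.
    mu_pred = A @ mu + B @ u
    Sigma_pred = A @ Sigma @ A.T + Q
    # Update: correct the prediction using the measurement y.
    K = Sigma_pred @ C.T @ np.linalg.inv(C @ Sigma_pred @ C.T + R)
    mu_new = mu_pred + K @ (y - C @ mu_pred)
    Sigma_new = (np.eye(len(mu_new)) - K @ C) @ Sigma_pred
    return mu_new, Sigma_new
```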
Benjamin Kaminski (University College London (UCL), UK)
Time: 10:30
Date: 25.11.2020
Room: online
Title: Quantitative Separation Logic
Abstract: We present quantitative separation logic (QSL). In contrast to classical separation logic, QSL employs quantities which evaluate to real numbers instead of predicates which evaluate to Boolean values. The connectives of classical separation logic, separating conjunction and separating implication, are lifted from predicates to quantities. This extension is conservative: Both connectives are backward compatible to their classical analogs and obey the same laws, e.g. modus ponens, adjointness, etc.
Furthermore, we present a weakest precondition calculus for quantitative reasoning about probabilistic pointer programs in QSL. This calculus is a conservative extension of both Ishtiaq's, O'Hearn's and Reynolds' separation logic for heap-manipulating programs and Kozen's / McIver and Morgan's weakest preexpectations for probabilistic programs. Our calculus preserves O'Hearn's frame rule, which enables local reasoning - a key principle of separation logic.
Finally, we briefly touch upon open questions regarding lower bounds on weakest preexpectations and intensional completeness of our calculus.
Jonas Kastberg Hinrichsen (University of Copenhagen, DK)
Time: 10:30
Date: 18.11.2020
Room: online
Title: Actris: session-type based reasoning in separation logic
Abstract: Message-passing is a principled approach to writing concurrent software. To verify message-passing behaviour in a scalable fashion, it is useful to have a first class approach to reasoning about it. Such an approach should integrate seamlessly with other low-level concurrency reasoning, as combining message passing with other concurrency mechanisms, such as locks, is commonplace.
In this talk I will present our work "Actris", a logical framework for thread-local reasoning about message passing, which combines separation logic with session types, originally presented at POPL'20. In doing so, I will cover the Actris protocol mechanism of "dependent separation protocols", showing how they can be used to verify a set of feature-rich examples, including the integration with lock-based concurrency. I will additionally present a recent extension of Actris to Actris 2.0, introducing the novel idea of "subprotocols", inspired by that of asynchronous session subtyping, to fully exploit the expressivity of asynchronous message passing. I will then show how dependent separation protocols have been used to prove type soundness of an expressive session type system, using the technique of semantic typing, to draw a formal connection between the dependent separation protocols and the session types that inspired them. Finally, I will provide insight into the model of Actris, based on step-indexing, and how we managed to fully mechanise it in Coq, by building it on top of the Iris logical framework.
Cynthia Kop (Radboud University, NL)
Time: 10:30
Date: 04.11.2020
Room: online
Title: Cons-free term rewriting
Abstract: Computational complexity is the study of resources (typically time and space) required to algorithmically solve a problem. Rather than analysing programs directly, the area of implicit complexity seeks to encode complexity queries into calculi or logics. One such approach is to use cons-free programming languages. By varying data order, this approach allows you to obtain a characterisation of a hierarchy of complexity classes, both in time and in space. In this talk, I will present the related approach of cons-free term rewriting systems. The addition of non-determinism turns out to have quite unexpected results.
Jules Jacobs (Radboud University, NL)
Time: 10:30
Date: 28.10.2020
Room: online
Title: Paradoxes of probabilistic programming, and how to condition on events of measure zero with infinitesimal probabilities
Abstract: Probabilistic programming languages allow programmers to write down conditional probability distributions that represent statistical and machine learning models as programs that use observe statements. These programs are run by accumulating likelihood at each observe statement, and using the likelihood to steer random choices and weigh results with inference algorithms such as importance sampling or MCMC. We argue that naive likelihood accumulation does not give desirable semantics and leads to paradoxes when an observe statement is used to condition on a measure-zero event, particularly when the observe statement is executed conditionally on random data. We show that the paradoxes disappear if we explicitly model measure-zero events as a limit of positive measure events, and that we can execute such probabilistic programs by accumulating infinitesimal probabilities rather than probability densities. Our extension improves probabilistic programming languages as an executable notation for probability distributions by making them better behaved and more expressive, allowing the programmer to be explicit about which limit is intended when conditioning on an event of measure zero.
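A toy version of the phenomenon (our sketch): in a likelihood-accumulating importance sampler, an observe statement that fires on only one branch mixes a density (for the measure-zero observation) with a weight of 1 on the other branch, which is where naive accumulation goes wrong.

```python
import random
from math import exp, pi, sqrt

def normal_pdf(x, mu, sigma):
    return exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))

def run_once():
    """One weighted run of a toy probabilistic program."""
    weight = 1.0
    h = random.random()                    # latent variable, uniform on [0, 1]
    if h < 0.5:                            # observe only on this branch:
        weight *= normal_pdf(0.6, h, 1.0)  # a density, not a probability!
    return h, weight

runs = [run_once() for _ in range(100_000)]
posterior_mean = sum(h * w for h, w in runs) / sum(w for _, w in runs)
```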
Bharat Garhewal (Radboud University, NL)
Time: 10:30
Date: 21.10.2020
Room: online
Title: Grey-Box Learning of Register Automata
Abstract: Model learning (a.k.a. active automata learning) is a highly effective technique for obtaining black-box finite state models of software components. Thus far, generalisation to infinite state systems with inputs/outputs that carry data parameters has been challenging. Existing model learning tools for infinite state systems face scalability problems and can only be applied to restricted classes of systems (register automata with equality/inequality). In this article, we show how we can boost the performance of model learning techniques by extracting the constraints on input and output parameters from a run, and making this grey-box information available to the learner. More specifically, we provide new implementations of the tree oracle and equivalence oracle from RALib, which use the derived constraints. We extract the constraints from runs of Python programs using an existing tainting library for Python, and compare our grey-box version of RALib with the existing black-box version on several benchmarks, including some data structures from Python’s standard library. Our proof-of-principle implementation results in almost two orders of magnitude improvement in terms of the number of inputs sent to the software system. Our approach, which can be generalised to richer model classes, also enables RALib to learn models that are out of reach of black-box techniques, such as combination locks.
Marnix Suilen (Radboud University, NL)
Time: 10:30
Date: 14.10.2020
Room: online
Title: Robust Policy Synthesis for Uncertain POMDPs via Convex Optimization
Abstract: We study the problem of policy synthesis for uncertain partially observable Markov decision processes (uPOMDPs). The transition probability function of uPOMDPs is only known to belong to a so-called uncertainty set, for instance in the form of probability intervals. Such a model arises when, for example, an agent operates under information limitation due to imperfect knowledge about the accuracy of its sensors. The goal is to compute a policy for the agent that is robust against all possible probability distributions within the uncertainty set. In particular, we are interested in a policy that robustly ensures the satisfaction of temporal logic and expected reward specifications. We state the underlying optimization problem as a semi-infinite quadratically-constrained quadratic program (QCQP), which has finitely many variables and infinitely many constraints. Since QCQPs are non-convex in general and practically infeasible to solve, we resort to the so-called convex-concave procedure to convexify the QCQP. Even though convex, the resulting optimization problem still has infinitely many constraints and is NP-hard. For uncertainty sets that form convex polytopes, we provide a transformation of the problem to a convex QCQP with finitely many constraints. We demonstrate the feasibility of our approach by means of several case studies.
Thom Badings (Radboud University, NL)
Time: 12:15
Date: 11.10.2020
Room: HG02.052
Title: Formal Controller Synthesis for Dynamical Systems with Aleatoric and Epistemic Uncertainty
Abstract: Capturing uncertainty in models of complex dynamical systems is crucial to designing safe controllers. Stochastic noise causes aleatoric uncertainty, whereas imprecise knowledge of model parameters and the presence of external disturbances lead to epistemic uncertainty. Several approaches use formal abstractions to synthesize policies that satisfy temporal specifications related to safety and reachability. However, these approaches typically make strong assumptions about the uncertainty in the underlying models (e.g., only aleatoric or epistemic uncertainty), making these approaches unrealistic in practice. In this presentation, I present our recent work on abstraction-based controller synthesis for dynamical systems under both aleatoric and epistemic uncertainty. First, I present results for the case with only aleatoric uncertainty, in which we assume that the distribution of the stochastic noise is unknown. After that, I discuss the more general case with both aleatoric uncertainty (stochastic noise) and epistemic uncertainty (due to uncertain parameters and external disturbances). By sampling techniques and robust analysis, we capture the aleatoric and/or epistemic uncertainty, with a user-specified confidence level, in the transition probability intervals of a so-called interval Markov decision process (iMDP). We then synthesize an optimal policy on this abstract iMDP, which translates (with the specified confidence level) to a feedback controller for the continuous model with the same performance guarantees. With various experimental benchmarks, we show that our methods lead to controllers that are more robust against both aleatoric and epistemic uncertainty.
Frits Vaandrager (Radboud University, NL)
Time: 10:30
Date: 07.10.2020
Room: online
Title: A Myhill-Nerode Theorem for Register Automata and Symbolic Trace Languages
Abstract: We propose a new symbolic trace semantics for register automata (extended finite state machines) which records both the sequence of input symbols that occur during a run as well as the constraints on input parameters that are imposed by this run. Our main result is a generalization of the classical Myhill-Nerode theorem to this symbolic setting. Our generalization requires the use of three relations to capture the additional structure of register automata. Location equivalence ≡l captures that symbolic traces end in the same location, transition equivalence ≡t captures that they share the same final transition, and a partial equivalence relation ≡r captures that symbolic values v and v′ are stored in the same register after symbolic traces w and w′, respectively. A symbolic language is defined to be regular if relations ≡l, ≡t and ≡r exist that satisfy certain conditions, in particular, they all have finite index. We show that the symbolic language associated to a register automaton is regular, and we construct, for each regular symbolic language, a register automaton that accepts this language. Our result provides a foundation for grey-box learning algorithms in settings where the constraints on data parameters can be extracted from code using e.g. tools for symbolic/concolic execution or tainting. We believe that moving to a grey-box setting is essential to overcome the scalability problems of state-of-the-art black-box learning algorithms. (Joint work with Abhisek Midya.)
Robbert Krebbers (Radboud University, NL)
Time: 10:30
Date: 30.09.2020
Room: online
Title: Program Verification using Iris
Abstract: Over the last 5 years, together with a large network of international collaborators, we have developed *Iris*---a program verification framework based on concurrent separation logic embedded in the Coq proof assistant. Iris is programming-language agnostic (it has been applied to, e.g., variants of ML, Rust, C, and Scala's core calculus DOT), and can be used to verify different properties (e.g., program correctness, data abstraction, security, program refinement, complexity) of both concrete programs and type systems. In this talk I will give an overview of the key ideas behind Iris and my journey as one of the lead developers of Iris.
See https://iris-project.org for more information.
Deivid Vale (Radboud University, NL)
Time: 10:30
Date: 23.09.2020
Room: online
Title: Tuple interpretations for Higher-Order Rewriting
Abstract: We present a style of algebra interpretations for many-sorted and higher-order term rewriting based on interpretations to tuples; intuitively, a term is mapped to a sequence of values identifying for instance its evaluation cost, size and perhaps other values. This can give a more fine-grained notion of the complexity of a term or TRS than notions such as runtime or derivational complexity.
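As a hypothetical illustration (our notation, not necessarily the talk's): interpreting list terms as pairs (cost, size), with subscripts c and s denoting the two components, lets the interpretation of append bound both the number of rewrite steps and the size of the result at once; the cost component strictly decreases with every step of the usual append rules.

```latex
\llbracket \mathsf{nil} \rrbracket = (0,\ 0)
\qquad
\llbracket \mathsf{cons}(x, q) \rrbracket = (q_c,\ q_s + 1)
\qquad
\llbracket \mathsf{app}(q, r) \rrbracket = (q_c + r_c + q_s + 1,\ q_s + r_s)
```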
Sven Peldszus (University of Koblenz, DE)
Time: 10:30
Date: 16.09.2020
Room: online
Title: Model-based Security and Software Quality Analysis in Variability-Rich Evolving Systems
Abstract: Today’s software systems tend to be used on a long-term basis, are highly interconnected, share many common parts and often process security-critical data, so that keeping up with ever-changing security precautions, attacks and mitigations is vital for preserving a system’s security. Model-based system development enables us to address security issues already in the early phases of the software design, e.g., in UML models. The continuous changes in the security assumptions and the design of software systems (for instance, due to structural decay) have to be reflected in both the system models (e.g., UML models) and the system’s implementation (including program models). Currently, developers have to detect manually which change is necessary where. The same applies to the interaction of changes with security-related properties. In this talk, I introduce a model-based approach for developing and maintaining secure long-living systems. The approach supports the continuous synchronization and mapping of models and code and allows developers to apply security checks on all representations utilizing the created mappings.
Dennis Groß (Radboud University, NL)
Time: 10:30
Date: 09.09.2020
Room: online
Title: Robustness Verification for Classifier Ensembles
Abstract: We give a formal verification procedure that decides whether a classifier ensemble is robust against arbitrary randomized attacks. Such attacks consist of a set of deterministic attacks and a distribution over this set. The robustness-checking problem consists of assessing, given a set of classifiers and a labelled data set, whether there exists a randomized attack that induces a certain expected loss against all classifiers. We show the NP-hardness of the problem and provide an upper bound on the number of attacks that is sufficient to form an optimal randomized attack. These results provide an effective way to reason about the robustness of a classifier ensemble. We provide SMT and MILP encodings to compute optimal randomized attacks or prove that there is no attack inducing a certain expected loss. In the latter case, the classifier ensemble is provably robust. Our prototype implementation verifies multiple neural-network ensembles trained for image-classification tasks. The experimental results using the MILP encoding are promising both in terms of scalability and the general applicability of our verification procedure.
Joshua Moerman (RWTH Aachen University, DE)
Time: 10:30
Date: 15.07.2020
Room: Virtual
Title: Residuality and Learning for Nondeterministic Register Automata
Abstract: In this research we consider the problem of inferring a register automaton from observations. This has been done before for deterministic RA, but is still open for nondeterministic RA. To see why nondeterminism is interesting, consider the well-known learning algorithms L* and NL* for respectively deterministic and nondeterministic automata. Although the representation is different, they operate on the same class of languages (i.e., regular languages). This is not the case for RA, where nondeterminism gives a strictly bigger class of languages than determinism. So not only does the representation change, but so does the class of languages.
This is joint work with Matteo Sammartino. Our contributions are as follows:
- We consider "residual automata" for data languages. We show that their languages form a proper subclass of all languages accepted by nondeterministic RA.
- We give a "machine-independent characterisation" of this class of languages. For this, we also develop some new results in nominal lattice theory.
- We show that for this class of languages, L*-style algorithms exist.
- Surprisingly, the natural generalisation of NL* does not always terminate. Fortunately, the algorithm can be fixed to always terminate.
Dan Frumin (Radboud University, NL)
Time: 10:30
Date: 08.07.2020
Room: Virtual
Title: Relational reasoning in Concurrent Separation Logic
Abstract: Relational reasoning plays an important role in programming languages, as many important properties of programs are relational. For example, contextual equivalence is used to show that optimized versions of data structures behave the same as their non-optimized counterparts and that concurrent data structures are linearizable. Non-interference is used to show that a program does not leak secret information.
In this talk I will describe recent developments in concurrent separation logic (specifically the Iris framework in the Coq proof assistant), that enable modular and mechanized relational reasoning in the context of higher-order programming languages with fine-grained concurrency.
Markus Klinik (Radboud University, NL)
Time: 10:30
Date: 01.07.2020
Room: Virtual
Title: no-bs: how to grade student assignments on Brightspace without using Brightspace
Abstract: The course Object Orientation has more than 400 students, who work in teams of two to hand in one assignment project every week.
In this talk we discuss the workflow and the tool no-bs that we have developed to streamline grading student assignments. no-bs is a command-line program that can interface with the Brightspace web API. It downloads submissions from Brightspace to your computer. You fill in a feedback template text file, and no-bs uploads grades and feedback to Brightspace.
The workflow we use has 17 teaching assistants and one coordinator, and is built around no-bs, the old bb-scripts, and exchanging data on a network share. Only the coordinator has to touch Brightspace, the teaching assistants just have to fill in the feedback template using their favourite text editor. no-bs has been developed to scratch our own itch, but it can be adapted to other workflows as well.
Jana Wagemaker (Radboud University, NL)
Time: 10:30
Date: 24.06.2020
Room: Virtual
Title: Partially Observable Concurrent Kleene Algebra
Abstract: In this talk I'll start with a general introduction to Kleene algebra and how it can be used to study simple programs. Hopefully, I'll be able to sketch the broader context in which to place the newest addition to the KA family: Partially Observable Concurrent Kleene Algebra (POCKA), a sound and complete algebraic framework to reason about concurrent programs with control structures, such as conditionals and loops. POCKA enables reasoning about programs that can access variables and values, which I illustrate with a concrete example. The example revolves around an important check for sequential consistency.
Sven-Bodo Scholz (Radboud University, NL)
Time: 10:30
Date: 17.06.2020
Room: Virtual
Title: 50 Shades of Laziness
Abstract: Lazy evaluation has very appealing properties: it terminates whenever possible and it evaluates as little as needed for the overall result. The price for these properties is a loss of concurrency and a potentially uncontrollable need for space. Consequently, many languages, in particular those aiming for high parallel performance, employ eager evaluation. Attempts to combine the benefits of the two approaches are traditionally based on argument-specific annotations, special data types, or on strictness analyses. In this talk, we present a new approach to combining the two. The key idea is to leverage the binding time analyses from offline partial evaluation within the definition of the language semantics. This allows us to keep the favourable runtime behaviour of eager evaluation and yet acquire some of the benefits of lazy evaluation. We present our ideas in the context of the high-performance array programming language SaC.
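The tradeoff in miniature (our example in Python, not SaC): a lazy pipeline evaluates only what the result demands, while the eager variant materialises every intermediate value.

```python
from itertools import count

# Lazy: generators compute only on demand, so an "infinite" input is fine.
squares = (n * n for n in count())               # nothing is computed yet
first_big = next(s for s in squares if s > 100)  # stops after 11 elements

# Eager: every intermediate value is materialised up front.
eager_squares = [n * n for n in range(10**6)]    # a million squares computed
first_big_eager = next(s for s in eager_squares if s > 100)
```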
Pieter Koopman (Radboud University, NL)
Time: 10:30
Date: 10.06.2020
Room: Virtual
Title: Dynamic Editors for Well-Typed Expressions
Abstract: Interactive systems can require complex input from their users. A grammar specifies the allowed expressions in such a Domain Specific Language, DSL. An algebraic DataType, ADT, is a direct representation of such a grammar. For most end-users a structured editor with pull-down menus is much easier to use than a free text editor. The iTask system can derive such structured editors based on an ADT using datatype generic programming. However, the input DSL often also has semantic constraints, like proper use of types and variables. A solution is to use a shallowly embedded DSL or a DSL based on a Generalized ADT to specify the input. However, such a specification cannot be handled by datatype generic programming. Hence, one cannot derive structured editors for such a DSL.
As a solution we introduce structured web-editors that are based on dynamic types. These dynamic types are more expressive; they can express the required DSL constraints. In the new dynamic editor library we need to specify just the dynamics relevant for the DSL. The library takes care of displaying the applicable instances to the user and calls itself recursively to create the arguments of the dynamic functions. In this paper we show how this can be used to enforce the required constraints on ADTs, to create structured web-editors for shallowly embedded DSLs, and to create those editors for GADT-based DSLs.
Niels van der Weide (Radboud University, NL)
Time: 10:30
Date: 03.06.2020
Room: Virtual
Title: Constructing Higher Inductive Types
Abstract: Higher inductive types (HITs) provide a way to define data types by describing how to construct inhabitants and when their inhabitants are equal. Examples of HITs include quotient types, finite sets, and finite bags. In this talk, we study HITs more closely in the setting of homotopy type theory, which is a version of type theory with proof relevant equality. More specifically, we show how to construct a certain class of higher inductive types as quotients.
Jurriaan Rot (Radboud University, NL)
Time: 10:30
Date: 20.05.2020
Room: Virtual
Title: Coalgebra Learning via Duality
Abstract: Automata learning is a popular technique for inferring minimal automata through membership and equivalence queries. We generalise learning from automata to a large class of state-based systems, using the theory of coalgebras. The approach relies on the use of logical formulas as tests, based on a dual adjunction between states and logical theories. This allows us to learn, e.g., labelled transition systems.
Joint work with Clemens Kupke and Simone Barlocco.
Daniel Strüber (Radboud University, NL)
Time: 10:30
Date: 13.05.2020
Room: Virtual
Title: Towards Model-Driven Search Engineering
Abstract: Optimization problems pervade every aspect of our lives, notably in domains such as healthcare, education, logistics, and finance, where limited resources must be distributed efficiently. For instance, a hospital needs to allocate staff, equipment, and beds in a way that ensures economic feasibility as well as patient dignity and privacy. A wealth of optimization techniques is available for addressing such problems. Choosing an optimization technique that matches the assumptions and requirements of a given problem, and applying it in a beneficial way, requires substantial knowledge on optimization techniques. We present our recent work on bridging the gap between high-level optimization requirements and low-level optimization techniques, based on model-driven engineering technology: First, on the automated generation of sound and complete search operators, in the application domain of software product line configuration. Second, on the automated generation of efficient search operators for arbitrary given problem domains. Third, on the automated analysis of search operators w.r.t. their impact on the consistency of solution candidates. Fourth, on first steps towards support for choosing from multiple back-end optimization technologies.
Gerco van Heerdt (University College London, UK)
Time: 10:30
Date: 29.04.2020
Room: Virtual
Title: Learning Weighted Automata over Principal Ideal Domains
Abstract: In this talk we discuss L*-based automata learning algorithms for weighted automata over a semiring. We provide a general adaptation of the algorithm and show that it works correctly when the semiring is a principal ideal domain. We also show that termination fails in general for an arbitrary semiring, in particular the natural numbers.
This is joint work with Clemens Kupke, Jurriaan Rot and Alexandra Silva.
Frits Vaandrager (Radboud University, NL)
Time: 10:30
Date: 22.04.2020
Room: Virtual
Title: State Identification for Labeled Transition Systems with Inputs and Outputs
Abstract: For Finite State Machines (FSMs), a rich testing theory has been developed to discover aspects of their behavior and ensure their correct functioning. Although this theory is widely used, e.g. to check conformance of protocol implementations, its applicability is limited by restrictions of the FSM framework: the fact that inputs and outputs alternate in an FSM, and outputs are fully determined by the previous input and state. Labeled Transition Systems with inputs and outputs (LTSs), as studied in ioco testing theory, provide a richer framework for testing component oriented systems, but lack the algorithms for test generation from FSM theory. In this article, we propose an algorithm for the fundamental problem of state identification during testing of LTSs. Our algorithm is a direct generalization of the well-known algorithm for computing adaptive distinguishing sequences for FSMs proposed by Lee & Yannakakis. Our algorithm has to deal with so-called compatible states, states that cannot be distinguished in case of an adversarial system-under-test. Analogous to the result of Lee & Yannakakis, we prove that if an (adaptive) test exists that distinguishes all pairs of incompatible states of an LTS, our algorithm will find one. In practice, such adaptive tests typically do not exist. However, in experiments with an implementation of our algorithm on a collection of (both academic and industrial) benchmarks, we find that the tests produced by our algorithm still distinguish more than 99% of the incompatible state pairs.
This is joint work with Petra van den Bos.
Freek Verbeek (Open University, NL)
Time: 10:00
Date: 08.04.2020
Room: Virtual
Title: Formal Proofs of Return Address Integrity
Abstract: We present a methodology for generating a characterization of the memory used by an assembly program, as well as a formal proof that the assembly is bounded to the generated memory regions. A formal proof of memory usage is required for compositional reasoning over assembly programs. Moreover, it can be used to prove low-level security properties, such as integrity of the return address of a function. Our verification method is based on interactive theorem proving, but provides automation by generating pre- and postconditions, invariants, control-flow, and assumptions on memory layout. As a case study, three binaries of the Xen hypervisor are disassembled. These binaries are the result of a complex build-chain compiling production code, and contain various complex and nested loops, large and compound data structures, and functions with over 100 basic blocks. The methodology has been successfully applied to 251 functions, covering 12,252 assembly instructions.
Peter Achten (Radboud University, NL)
Time: 10:30
Date: 01.04.2020
Room: Virtual
Title: Segments: a better Rainfall problem for Functional Programming
Abstract: Elliot Soloway's Rainfall problem (1986) is an extensively studied programming problem, intended to assess students' ability in constructing programs. Many of these studies have used imperative style programming languages. In 2014, Kathi Fisler investigated this problem for students in functional programming, and proposed an adapted, simplified, version of the Rainfall problem. In this talk I briefly recap Fisler's experiment and argue that it actually proves that this version of the Rainfall problem is too easy for students in functional programming and that it uncovers only a small variety of ways to construct a solution to the problem. Instead, I propose another problem, called Segments, that I have been using throughout my courses in Functional Programming (for AI) and I argue that this is a better suited problem for students in functional programming. These results are based on analyzing student assignment results of the past four iterations of the course.
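For reference, one common formulation of the Rainfall problem (our rendering; details vary between the studies): average the non-negative readings that precede a sentinel value.

```python
def rainfall(readings, sentinel=99999):
    """Average the non-negative numbers before the first sentinel."""
    total, count = 0.0, 0
    for r in readings:
        if r == sentinel:
            break                 # ignore everything after the sentinel
        if r >= 0:
            total += r
            count += 1
    return total / count if count else 0.0

assert rainfall([1, -2, 3, 99999, 7]) == 2.0   # skips -2 and the trailing 7
```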
David Parker (University of Birmingham, UK)
Time: 10:30
Date: 11.03.2020
Room: HG00.023
Title: Verification and Strategy Synthesis for Stochastic Games
Abstract: Stochastic multi-player games are a versatile modelling framework for systems that exhibit cooperative and competitive behaviour in the presence of adversarial or uncertain environments. Probabilistic model checking provides a powerful set of techniques to either verify the correct behaviour of such systems or to synthesise controllers that offer guarantees on safe, reliable or timely operation. This talk will provide an overview of work in this direction, implemented within the PRISM-games model checker. This includes recent extensions to concurrent stochastic games and equilibria-based properties, and applications to a variety of domains, from energy management to communication protocols to robot navigation.
In this presentation, I will present our active, coevolutionary and multi-objective approach to learning deterministic finite automata from examples, and its application to requirements engineering. Formal requirements specifications are usually produced, from vague user input, by requirements engineers. It is our aim to automate this process.
From user input in the form of positive and negative example behavior of a desired system, we infer automata that completely describe its behavior. The setting of requirements engineering yields specific challenges, like a small amount of input data and the selection of the right automata from the many automata that are consistent with the input data. Furthermore, the number of queries to the user should be as small as possible.
I will show how our approach deals with these challenges and discuss open questions and future work.
Practising my ICTSS presentation of the paper "n-Complete Test Suites for IOCO"
Omar is a PhD student at the TU/e, applying learning at various companies.
The SUMBAT project (SUperSizing Model-BAsed Testing) will organize a small workshop presenting first project results.
10:05 Model-Based Testing with TorXakis - Pierre van de Laar (TNO-ESI)
10:30 Model-Based Testing at Océ - Ramon Janssen (RU)
11:10 Model Learning at PANalytical - Jeroen Meijer (UT)
11:35 Model-Based Testing with Complete Test Suites - Petra van den Bos (RU)
Alexis will crash test his presentation for the ISoLA conference.
Tanja Vos will present her testing tool Test*.
Paul will wish you a very nice holiday, but before that, he will touch on the key aspects of the determinizer in Tomte: how it actually works in the tool, its limitations, and a possible move to a more symbolic framework.
Ramon will practice his talk at CAV about learning TCP with abstraction, and model checking with concretization of the abstract models.
In 1989 Rob van Glabbeek and Peter Weijland defined branching bisimulation as an alternative to weak bisimulation of Robin Milner. Frits Vaandrager and I formulated an algorithm with complexity O(mn), where m is the number of transitions of a labelled transition system and n is the number of states. This was quite a spectacular improvement over weak bisimulation, which is cubic due to the transitive tau-closure. At that time there was also the O(mn) algorithm for strong bisimulation, by Kanellakis and Smolka, which was spectacularly improved by Paige and Tarjan to O(m log n). To me it was open whether the algorithm for branching bisimulation could be improved, until a discussion with Anton Wijs about implementing the algorithms on a GPU brought me to investigate the paper of Paige and Tarjan in relationship to the branching bisimulation algorithm again. This led to the insight behind the current algorithm (except for a small and easily repairable flaw, pointed out by Jansen and Keiren). The algorithm is amazingly complex, but it outperforms existing algorithms by orders of magnitude, especially as systems become large.
A preprint of the paper can be found at http://arxiv.org/abs/1601.01478
Petra will practice her talk for iFM 2016
In this talk, I will extend the basic system relations of trace inclusion, trace equivalence, simulation, and bisimulation to a quantitative setting in which propositions are interpreted not as boolean values, but as real values in the interval [0,1]. Trace inclusion and equivalence give rise to asymmetrical and symmetrical linear distances, while simulation and bisimulation give rise to asymmetrical and symmetrical branching distances. I will present the relationships among these distances, and provide a full logical characterization of the distances in terms of quantitative versions of LTL and μ-calculus. I will show that, while trace inclusion (resp. equivalence) coincides with simulation (resp. bisimulation) for deterministic boolean transition systems, linear and branching distances do not coincide for deterministic quantitative transition systems. Finally, I will go into algorithms for computing the distances.
(This work dates back to when I was at UC Santa Cruz and is joint work with Luca de Alfaro and Marco Faella)
trial presentation ICT.OPEN
Rick will practice his talk for LATA 2016 about finding minimal separating sequences for all pairs of states in O(n log n).
David will talk about a paper by Jan Friso Groote and Anton Wijs: http://arxiv.org/abs/1601.01478
I will report on my research internship.
I will discuss the differences between the various learning algorithms for register automata. This includes the abstract learning algorithm from nominal automata, our tool Tomte and RALib from Uppsala (and maybe more).
As far as I know, there only exist random test methods to generate test cases from an LTS/IOTS model. I will discuss an adapted version of the FSM-based Lee and Yannakakis method to find adaptive distinguishing sequences. These sequences enable a smarter and more directed way of testing.
As part of the Cyber-Physical Systems project in partnership with Océ, my current goal is to use Machine Learning (and other) techniques in order to predict printer failures. I will present not only the current achievements and the Machine Learning techniques employed, but also the main challenges that remain. My presentation will also trigger a debate on the multidisciplinarity of the project.
I will give a brief overview of our CEGAR-based algorithm and detail how we have extended this algorithm to handle fresh output values. I will also talk about some of the optimizations we've made to the algorithm, and to the tool implementing it. The presentation is going to be a draft version of what I am going to present in Colombia next week.
A definitional interpreter defines the semantics of an object language in terms of the (well-known) semantics of a host language, enabling understanding and validation of the semantics through execution. Combining a definitional interpreter with a separate type system requires a separate type soundness proof. An alternative approach, at least for pure object languages, is to use a dependently-typed language to encode the object language type system in the definition of the abstract syntax. Using such intrinsically-typed abstract syntax definitions allows the host language type checker to verify automatically that the interpreter satisfies type soundness. Does this approach scale to larger and more realistic object languages, and in particular to languages with mutable state and objects?
In this paper, we describe and demonstrate techniques and libraries in Agda that successfully scale up intrinsically-typed definitional interpreters to handle rich object languages with non-trivial binding structures and mutable state. While the resulting interpreters are certainly more complex than the simply-typed lambda-calculus interpreter we start with, we claim that they still meet the goals of being concise, comprehensible, and executable, while guaranteeing type safety for more elaborate object languages. We make the following contributions: (1) A "dependent-passing style" technique for hiding the weakening of indexed values as they propagate through monadic code. (2) An Agda library for programming with "scope graphs" and "frames", which provides a uniform approach to dealing with name binding in intrinsically-typed interpreters. (3) Case studies of intrinsically-typed definitional interpreters for the simply-typed lambda-calculus with references (STLC+Ref) and for a large subset of Middleweight Java (MJ).
This is joint work with Casper Bach Poulsen, Andrew Tolmach, Robbert Krebbers, Eelco Visser.
Active automata learning is emerging as an effective bug finding technique, with applications in areas such as banking cards, network protocols and legacy software. Timing often plays a crucial role in these applications, but cannot be handled by existing learning algorithms. Even though there has been significant progress on algorithms for active learning of timed models, these are not yet broadly applicable due to limited expressiveness and/or high complexity. In order to address this problem, we introduce a new model of Mealy machines with timers (MMTs) that is able to model the timing behavior of a broad class of practical systems, and present an algorithm for active learning of MMTs that uses a number of queries that is polynomial in the size of the corresponding canonical MMT. This is achieved by using, besides the usual membership and equivalence queries, lookahead queries to obtain information about whether a timer is active and may expire eventually.
Term rewriting with logical constraints offers a natural way to model procedural programs. In this talk, we will discuss how this works, and how the resulting constrained term rewriting systems can be analysed for equivalence using rewriting induction.
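A tiny hypothetical example of the modelling style (our notation): a summation loop over an integer variable becomes rewrite rules that fire only when their logical constraint, in brackets, holds.

```latex
% Constrained rules modelling
%   int sum(int x) { int z = 0; while (x > 0) { z += x; x -= 1; } return z; }
\begin{align*}
\mathsf{sum}(x)     &\to \mathsf{loop}(x,\, 0) \\
\mathsf{loop}(x, z) &\to \mathsf{loop}(x - 1,\, z + x) & [x > 0] \\
\mathsf{loop}(x, z) &\to \mathsf{return}(z)            & [x \le 0]
\end{align*}
```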
The subject of this talk is motion planning problems where agents move inside environments that are subject to uncertainties and potentially not fully observable. The goal is to compute a strategy or a set of strategies for an agent that is guaranteed to satisfy certain safety or performance specifications. Such problems are naturally modeled by Markov decision processes (MDPs) or partially observable MDPs (POMDPs). We discuss several technical approaches, ranging from the computation of permissive strategies that guarantee safe reinforcement learning in unknown environments, via a game-based abstraction framework for POMDPs, to the utilization of parameter synthesis for Markov chains to compute randomized strategies for POMDPs. We also consider preliminary work on actively including humans into verification and synthesis processes, and discuss what challenges arise.
In model-based testing, a specification is always the starting point to derive test cases in an automated way. In ioco-theory, a specification is a labelled transition system with inputs, outputs and quiescence. Generation of test cases has been formally described, including the desired properties of soundness and completeness. However, an emerging paradigm in testing is to avoid making complete specifications and specify the desired behaviour directly as test cases. Deriving a formal specification from test cases thus inverts the usual test generation methods, and has not been explored in the ioco-setting. We establish a relation between specification models and test cases. We show that a test case is nothing more than a partial model, and that soundness and completeness of test cases can be viewed in terms of specification refinement. Furthermore, we consider test suites, which can only be expressed as a specification model with a special conjunction operator.
How do we ensure that self-driving cars, robots, power plants and Internet-of-things devices are safe and reliable? That is the topic of risk management, a multi-disciplinary area at the interplay of technical, social and organizational domains, with many interesting research challenges.
I will start out with a broad introduction on the what and why’s of risk management: what do companies do to reduce their risks? What are challenges in this field, and what role does (and: should) computer science play?
Second, I will go more technical in the topic of fault tree analysis, a very popular technique here, deployed by many institutions like NASA, ESA, Honeywell, Airbus, the FDA, Toyota, Shell etc.
I will elaborate how the deployment of stochastic model checking can boost the capabilities of fault tree analysis, making them more powerful, flexible and efficient, allowing one to analyze a richer variety of questions faster.
Further, I will discuss the role of maintenance, a crucial element in reliability engineering. Maintenance reduces the number of failures and extends a system's lifetime. At the same time, maintenance is expensive, as it requires specialized personnel and equipment. As such, maintenance is a relevant optimization problem, trading off (planned and unplanned) downtime against several cost parameters. Here, I will dive into the application of big data analytics in maintenance.
I will end the presentation with some research challenges.
The problem of learning a minimal consistent model from a set of labeled sequences of symbols is addressed from a satisfiability modulo theories perspective. We present two encodings for deterministic finite automata and extend one of these for Moore and Mealy machines. Our experimental results show that these encodings improve upon the state-of-the-art, and are useful in practice for learning small models.
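A minimal sketch of the general idea using Z3's Python API (our toy encoding, not the paper's): uninterpreted functions encode the transition function and acceptance, with one constraint per labelled word; wrapping this in a loop over increasing n yields a minimal consistent DFA.

```python
from z3 import Function, IntSort, BoolSort, IntVal, Not, Solver, sat

def learn_dfa(n, positive, negative):
    """Find an n-state DFA over alphabet {0,1} consistent with the samples."""
    delta = Function("delta", IntSort(), IntSort(), IntSort())
    acc = Function("acc", IntSort(), BoolSort())
    s = Solver()
    for q in range(n):                  # states are 0 .. n-1
        for a in (0, 1):
            s.add(delta(q, a) >= 0, delta(q, a) < n)

    def end_state(word):                # symbolic run from initial state 0
        state = IntVal(0)
        for a in word:
            state = delta(state, a)
        return state

    for w in positive:
        s.add(acc(end_state(w)))
    for w in negative:
        s.add(Not(acc(end_state(w))))
    return s.model() if s.check() == sat else None

model = learn_dfa(2, positive=[[0], [0, 0]], negative=[[], [1]])
```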
In consistency-based diagnosis (CBD), faults are identified based on deviations from a normal behavior specification. This is suitable e.g. for new artifacts such as printers, trains, etc. Probabilistic approaches have been added into CBD for aiding the selection of a diagnosis over a potentially intractable number of candidates, thus providing means to rank diagnoses. Under partial observability, however, the interpretation of diagnostic methods is non-trivial or even misleading (e.g. the conflict measure by Flesch and Lucas).
In this work, we propose a new diagnostic measure that is able to better handle the partial observability scenario. It is based on the intuition that the simplest diagnosis should be preferred, unless there is strong evidence in favor of more complex diagnoses (in terms of the number of abnormalities). Moreover, we suggest that rejecting a diagnosis is non-trivial, because the involved probabilities might be very small (e.g. when multiple components are observed), and also because distinguishing unlikely observations from rare observations is intuitively important. As a result, additional computational costs are introduced, which shall be discussed as well.
Joint work with Arjen Hommersom (RUN, Open University) and Peter Lucas (RUN, Leiden University).
The axioms of Kleene algebra are complete for language equivalence of regular expressions. In this talk I will discuss commutative Kleene algebra, where the multiplication (concatenation) operation is commutative. Combining various results from the literature, I will show that this is complete for regular expressions interpreted on languages over a singleton alphabet.
This is joint work with Damien Pous (ENS Lyon).
When writing a user interface (UI), the layout of its elements plays an important role. Programmers should be able to specify the layout of UIs in an intuitive way, while being able to separate the concern of laying out the UI from the rest of the software implementation. Ideally, the same layout language can be used in multiple application domains, so the programmer only has to learn one set of layout concepts. In this paper we introduce such a general-purpose layout language. We obtain this language by abstracting from a layout language we have introduced in previous work for declaratively defining Scalable Vector Graphics (SVG). We show that this abstract layout language can be instantiated for multiple domains: the SVG library by which the language is inspired, ncurses-based text-based user interfaces, and iTasks. In all of these cases, a separation of concerns is maintained.
This is joint work with Jurriën Stutterheim, Bas Lijnse, Rinus Plasmeijer; presented at IFL 2016.
In this talk, I will give a high-level overview over the content of my thesis, which is concerned with programming of and reasoning about inductive-coinductive systems.
The classical notion of algorithm assumes that inputs are finite and that computations eventually terminate with an output. For example, sorting algorithms take (finite) lists as input and output a sorted list after a finite number of computation steps. Due to their finite structure, lists are considered as so-called inductive data. Moreover, sorting algorithms usually operate by sorting parts of the list and combine these parts, so that they are defined by iteration on the input. Associated to inductive data is the principle of induction, which allows us to prove properties of programs by induction on their input. Both iteration and induction are well-studied and ubiquitous principles in mathematics and computer science.
On the other hand, many systems and computations are assumed to run indefinitely, while still reacting to their environment. Examples of this are control systems, deterministic finite automata that recognise regular languages, and computations on real numbers. Such systems can be described by coiteration, which usually arises from operational semantics. The associated proof principle is consequently called coinduction and allows us to show that two systems exhibit the same behaviour. Also here, coinduction is well-known in form of the bisimulation proof principle in, at least, computer science.
Finally, in many cases we need to combine both types of computation, which leads to inductive-coinductive structures and reasoning principles. For instance, so-called liveness properties of reactive systems, that is, properties stating that something eventually needs to happen, are mixed inductive-coinductive predicates. Here, we then need to combine both iteration and coiteration, and induction and coinduction, for programming and reasoning.
All of this raises the need for programming languages that allow us to manipulate finite and infinite data simultaneously, and for principles to reason about such programs. The programs will be given in a typed language, thereby allowing us to make basic guarantees about correctness and termination. On the other hand, the reasoning principles will enable us to prove properties of the programs' behaviour. Preferably, these reasoning principles lead to proofs that can be checked automatically by a computer, so to avoid mistakes in proofs.
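In programming terms (our illustration, not taken from the thesis): iteration consumes inductive data by structural recursion, while coiteration produces coinductive data from a seed, one observation at a time.

```python
from itertools import islice

def fold(xs, nil, cons):
    """Iteration/induction: the result follows the input's finite structure."""
    acc = nil
    for x in reversed(xs):
        acc = cons(x, acc)
    return acc

def unfold(seed, step):
    """Coiteration/coinduction: an infinite stream defined by its observations."""
    while True:
        out, seed = step(seed)
        yield out

total = fold([1, 2, 3], 0, lambda x, acc: x + acc)  # consumes a finite list
evens = unfold(0, lambda n: (n, n + 2))             # produces an infinite stream
assert total == 6 and list(islice(evens, 4)) == [0, 2, 4, 6]
```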
We apply model learning to three SSH implementations to infer state machine models, and then use model checking to verify that these models satisfy basic security properties and conform to the RFCs. Our analysis showed that all tested SSH server models satisfy the stated security properties. However, it also uncovered minor violations of the standard.
We want to compare and relate (infinite) sequences. A natural way to do so is by finite state transducers, that is, finite deterministic automata with output. If feeding a sequence A to a finite state transducer yields a sequence B as output, then we write A >= B. Roughly speaking, this means that A can be transformed to B by a program with finite memory. In this pre-order the smallest sequences are the ultimately periodic sequences. The structure of sequences with respect to this pre-order raises many questions, most of which are open. A sequence A is called atomic if for every sequence B with A >= B we either have B >= A or B is ultimately periodic. The sequence 1101001000100... in which the sizes of consecutive groups of zeros are consecutively 0,1,2,3,4,... is known to be atomic. Adding or removing any finite part yields an equivalent sequence with respect to this pre-order. If we want a more fine-grained pre-order, we may switch to permutation transducers, where for every state and symbol there is not only exactly one corresponding outgoing arrow, but also exactly one corresponding incoming arrow. Recent work with Wieb Bosma shows some remarkable results on the corresponding pre-order, also on two-sided infinite sequences.
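A concrete witness for the pre-order (our example): the two-state transducer below erases every second symbol, so every sequence A satisfies A >= drop_every_second(A).

```python
def drop_every_second(seq):
    """A two-state finite state transducer: copy a symbol, then skip one."""
    state = 0
    for symbol in seq:
        if state == 0:
            yield symbol     # output the symbol, move to the skipping state
            state = 1
        else:
            state = 0        # output nothing, move back to the copying state

assert "".join(drop_every_second("abcdef")) == "ace"
```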
We present a type and effect system for static analysis of programs written in a simplified version of iTasks. iTasks is a workflow specification language embedded in Clean, a general-purpose functional programming language. Given costs for basic tasks, our analysis calculates an upper bound of the total cost of a workflow. The analysis has to deal with the domain-specific features of iTasks, in particular parallel and sequential composition of tasks, as well as the general-purpose features of Clean, in particular let-polymorphism, higher-order functions, recursion and lazy evaluation. Costs are vectors of natural numbers where every element represents some resource, either consumable or reusable.
I will present an Angluin-style algorithm to learn nominal automata. These automata are acceptors of languages over infinite (structured) alphabets. Learning such automata is interesting from a theoretical perspective, but also from a more applied perspective. In fact, this research was motivated by the problem of learning register automata which are used in verification and testing of reactive systems. In my talk I will focus on nominal automata themselves and what it means to learn (nominal) automata.
This is joint work with Matteo Sammartino, Alexandra Silva, Bartek Klin and Michał Szynwelski.
We consider the task of constructing 2-3 trees. Given a sequence of elements we seek to build a 2-3 tree - in linear time - that contains the elements in symmetric order. We discuss three approaches: top-down, bottom-up, and incremental. The incremental approach is more flexible than the other two in that it allows us to interleave the construction work with other operations, for example, queries.
In discrete-time temporal processes, hidden Markov models (HMMs) are widely used to capture multiple distributions over a feature space via hidden variables. As HMMs are within the family of Bayesian network models, their graphical structure represents only conditional independence statements that must be valid for every instantiation of the variables involved. On the other hand, representing asymmetric independences explicitly can reveal additional characteristics of the nature of the underlying process, while producing more compact models. In this talk, we discuss a model developed to represent asymmetric independences in HMMs as a function of the hidden state. Experiments on a medical problem of patients under psychiatric treatment are shown, and some advantages of the method are highlighted and compared with other methods, namely simple first- and second-order tree-based HMMs.
Algorithms for active automata learning require an equivalence oracle. Such an oracle answers affirmatively if a hypothesis is correct, and answers with a counterexample if not. In practice we face the problem that no such oracles exist, so we have to use (model-based) testing to search for a counterexample. For a case study from Océ (a Mealy machine with more than 3000 states!) we need a pretty good test method which generates a small but complete (in some sense) test suite. I will give an elegant overview of the classical methods and describe a new algorithm we used for the Océ case study. This talk is (hopefully) interesting for people who like testing, automata theory or algorithms.
We describe an algorithm for finding minimal separating sequences for all pairs of inequivalent states of a Mealy machine. Our algorithm identifies the distinct equivalence classes of the machine by iteratively refining a partition of its states. In each iteration of the algorithm, a block of states in the partition is split based on the observation that for a certain input certain states in the block end up in a different block than others. We use this information to construct a sequence of minimal length for which states in one block produce different output than states in the other. Such minimal separating sequences are useful in testing an implementation of a machine with respect to its specification.
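A simplified sketch of the underlying partition refinement in Haskell (all names mine; the actual algorithm additionally records a minimal separating sequence at each split, which this sketch omits): states are grouped by their outputs and successor blocks until the partition stabilises.

```haskell
import Data.List (sortOn, groupBy)
import Data.Function (on)

type St = Int; type In = Char; type Out = Char

data Mealy = Mealy
  { states :: [St]
  , inputs :: [In]
  , delta  :: St -> In -> St   -- transition function
  , lambda :: St -> In -> Out  -- output function
  }

-- Signature of a state under the current partition: for each input, the
-- produced output and the block index of the successor state.
signature :: Mealy -> (St -> Int) -> St -> [(Out, Int)]
signature m blockOf s = [ (lambda m s i, blockOf (delta m s i)) | i <- inputs m ]

-- One refinement round: split each block by equal signatures.
refine :: Mealy -> [[St]] -> [[St]]
refine m part = concatMap split part
  where
    blockOf s = head [ b | (b, blk) <- zip [0 ..] part, s `elem` blk ]
    split blk = map (map fst)
              . groupBy ((==) `on` snd)
              . sortOn snd
              $ [ (s, signature m blockOf s) | s <- blk ]

-- Iterate to a fixed point; the final blocks are the equivalence classes.
classes :: Mealy -> [[St]]
classes m = go [states m]
  where go p = let p' = refine m p in if length p' == length p then p else go p'
```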
From the lambda-calculus it is known how to represent (recursive) data structures by ordinary lambda-terms. Based on this idea one can represent algebraic data types in a functional programming language by higher-order functions. In this talk we compare the famous Church encoding of data types with the less familiar Scott and Parigot encodings.
We show that one can use the encoding of data types by functions in a Hindley-Milner typed language by adding a single constructor for each data type. By collecting the basic operations of a data type in a type constructor class and providing instances for the various encodings, these encodings can coexist in a single program.
We show that in the Church encoding, selectors of constructors yielding the recursive type, like the tail of a list, have an undesirable strictness in the spine of the data structure. The evaluation of the recursive spine by the Church encoding makes the complexity of these destructors linear time. The same destructors in the Scott and the Parigot encoding require only constant time and do not hamper lazy evaluation in any way. Moreover, the Church encoding has problems with sharing reduction results of selectors.
The Parigot encoding is a combination of the Scott and Church encoding. Hence, we might expect that it combines the best of both worlds. In practice it does not offer any advantage over the Scott encoding.
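The complexity difference is easy to see in a Haskell rendering (my own sketch; the newtype wrappers play the role of the single constructor per data type mentioned above, here with rank-2 types rather than plain Hindley-Milner):

```haskell
{-# LANGUAGE RankNTypes #-}

newtype ChurchList a = CL (forall r. (a -> r -> r) -> r -> r)
newtype ScottList  a = SL (forall r. (a -> ScottList a -> r) -> r -> r)

cNil :: ChurchList a
cNil = CL (\_ n -> n)

cCons :: a -> ChurchList a -> ChurchList a
cCons x (CL xs) = CL (\c n -> c x (xs c n))

sNil :: ScottList a
sNil = SL (\_ n -> n)

sCons :: a -> ScottList a -> ScottList a
sCons x xs = SL (\c _ -> c x xs)

-- The Scott tail is a constant-time selector: it just hands back the
-- tail stored in the constructor.
sTail :: ScottList a -> ScottList a
sTail (SL xs) = xs (\_ t -> t) sNil

-- The Church tail must rebuild the list by folding over the whole spine
-- (the classic pair trick), which is why it is linear time and forces
-- the spine.
cTail :: ChurchList a -> ChurchList a
cTail (CL xs) = fst (xs step (cNil, cNil))
  where step x (_, t) = (t, cCons x t)
```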
How should one define interactive programs in a functional style? This question has been answered many times in many different ways. In this talk we demonstrate a new way of answering this question. Our approach is based on old insights, the most important of which is that a solution must be *purely compositional*; this has resulted in Graphics.Scalable, a compositional library for vector graphics based on SVG. The approach also builds on new insights, the most important of which is that behaviour should be expressed as tasks in the Task Oriented Programming paradigm.
The Manning and Automation Project is an effort by the Dutch navy to reduce manning on some ship classes while retaining their full functionality. Furthermore, considering political and economic developments, it is expected that future ships will be employed in increasingly complex and versatile missions. This calls for new Command and Control systems which support the reduced crew in the execution of their missions. In this setting, we are studying so-called resilient workflows. The goal is to build socio-technical systems, that is, teams of humans and machines, which get their work done even if subsystems break down, crew members become unavailable, or unexpected situations disrupt standard procedures. In particular, we present ongoing work on calculating overall resource requirements in situations where systems transitively depend on each other in the presence of redundancy.
We created a rating model for test code quality that employs metrics to quantify completeness, effectiveness, and maintainability of test code. The model was calibrated against a benchmark of open source systems. The model was validated by empirically assessing the relation between test code quality, as measured by the model, and issue handling performance, as determined by mining the issue tracking systems associated with 18 open source systems.
Joint work with Dimitrios Athanasiou, Ariadi Nugroho, and Andy Zaidman. http://dx.doi.org/10.1109/TSE.2014.2342227
Most recent AI models that handle uncertainty are based on probability theory. Exact probabilistic inference algorithms for such models are able to exploit their probabilistic and logical structure. To handle continuous distributions one usually uses an approximate algorithm. Approximate algorithms are less able to exploit the logical structure of a problem and work poorly for rare events. However, their main drawback is that no guarantees about the quality of their results can be given. In this paper we present a new approximate inference method, yielding approximations of discrete and continuous distributions with a guaranteed maximal error. A comparison is made with sampling methods, showing experimentally that our method is at least competitive and sometimes superior for hybrid models that possess logical structure and express rare events.
In the context of probabilistic automata, time-efficient algorithms for probabilistic simulations have been proposed lately. The space complexity thereof is quadratic in the size of the transition relation, thus space requirements often become the practical bottleneck. In this paper, we propose a space-efficient algorithm for computing probabilistic simulations based on partition refinement. Experimental evidence is given showing that not only the space efficiency is improved drastically: the experiments often require orders of magnitude less time. In practice, they are even faster than the time-optimal algorithm by Crafa/Ranzato.
We present a new algorithm for active learning of register automata. Our algorithm uses counterexample-guided abstraction refinement to automatically construct a component which maps (in a history dependent manner) the large set of actions of an implementation into a small set of actions that can be handled by a Mealy machine learner. The class of register automata that is handled by our algorithm extends previous definitions since it allows for the generation of fresh output values. This feature is crucial in many real-world systems (e.g. servers that generate identifiers, passwords or sequence numbers). We have implemented our new algorithm in a tool called Tomte.
This presentation will discuss the modelling-related research taking place in Luxembourg. In 2010 I moved from the Netherlands to Luxembourg to set up a research team at the Tudor research centre, essentially the Luxembourgish version of TNO/VTT/Fraunhofer. During the presentation I will briefly review the developments of the past years in the Luxembourgish context, as well as where we stand at the moment within the Luxembourgish landscape. In doing so, I will also position the existing collaborations with the Netherlands, including e.g. Radboud University and the HAN. I will also specifically highlight key research topics in which further collaboration between "Luxembourg" and MBSD could be initiated, as well as potential topics that might be addressed in future seminars.
About the speaker: Erik Proper is Adjunct Professor within the MBSD Department of iCIS. He chairs the Enterprise Engineering Team combining about 35 researchers from Luxembourg and the Netherlands. He is also Deputy Director of the IT for Innovative Services Department of the newly formed Luxembourg Institute for Science and Technology (LIST). The LIST was formed out of a merger of two public research centres: CRP Henri Tudor and CRP Gabriel Lippmann. The ITIS department currently involves about 110 researchers and engineers.
The theory of causal independence is frequently used to facilitate the assessment of the probabilistic parameters of probability distributions of Bayesian networks. Continuous time Bayesian networks are a related type of graphical probabilistic models that describe stochastic processes that evolve over continuous time. Similar to Bayesian networks, the number of parameters of continuous time Bayesian networks grows exponentially in the number of parents of nodes. In this talk, we look at causal independence in continuous time Bayesian networks. This new theory can be used to significantly reduce the number of parameters that need to be estimated for such models as well. We show that the noisy-OR model has a natural interpretation in the parameters of these models, and can be exploited during inference. Furthermore, we generalise this model to include synergistic and anti-synergistic effects, leading to noisy-OR synergy models.
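For reference, the classical noisy-OR model of discrete Bayesian networks, which the talk generalises to continuous time: each active parent X_i independently fails to cause Y with probability q_i, so

```latex
P(Y = 1 \mid x_1, \ldots, x_n) \;=\; 1 - \prod_{i \,:\, x_i = 1} q_i
```

which needs only one parameter per parent instead of a parameter per parent configuration.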
Probabilistic automata are used to model processes as sequences of events or actions. In contrast, temporal Bayesian networks model sequences of states. In the NWO CAREFUL project, we aim to reconcile these two approaches in order to unravel the care given to patients with particular illnesses. In this talk, ongoing work on the identification of states in probabilistic automata is discussed. An algorithm is proposed that makes use of the regular structure within these models. Furthermore, we show preliminary results where we apply this algorithm to data of patients who were treated for psychotic depression.
Dynamic systems are relevant to many fields, such as financial analysis, speech recognition, and medical diagnosis. Dynamic Bayesian networks have been developed to model dynamic systems by discretizing time with a fixed time interval. Discretizing at the finest time granularity leads to inflexibility when temporal processes in a dynamic system evolve at different rates. An alternative for modelling dynamic systems is continuous time Bayesian networks, where every random variable evolves at its own rate. Continuous time Bayesian networks are suitable for dealing with systems with multiple time granularities. However, the representation in continuous time Bayesian networks is less suitable for incorporating and validating probabilistic expert knowledge. In this paper, we present a novel formalism, hybrid time Bayesian networks, which combines discrete-time and continuous-time Bayesian networks and is suitable for dynamic systems with multiple time granularities. Also, we apply hybrid time Bayesian networks to predict myocardial infarction.
iTasks enables the rapid creation of multi-user web-applications. It enables rapid development by automatically generating form-based graphical user interfaces (GUIs) for any first-order type. In some situations, however, form-based GUIs are not sufficient and more complex user interfaces are needed instead. In this talk, we introduce a purely compositional library for creating interactive user interface components, based on Scalable Vector Graphics (SVG). Not only are all images purely compositional, interaction on them is specified by pure functions. The graphics library is integrated with iTasks in such a way that one can easily switch between the generic form-like GUIs and graphics-based user interfaces.
In recent work, we applied active learning techniques to learn fragments of the TCP communication protocol by observing its external behavior. We show that the different implementations of TCP tested (Windows 8 and Linux) induce different models, thus allowing for fingerprinting of these implementations. Moreover, the models show that the specification (RFC 793) is not fully met. We will also show how results from this case study reflect on current work on extending learning techniques so that learning can be done automatically.
We consider the problem of learning Bayesian network structures by exploiting both data as well as experts’ opinions about the graph. In practice, experts will have different individual probabilities of correctly labelling the inclusion or exclusion of edges in the structure of the network. Therefore, we propose in this paper to estimate the accuracy of experts and then exploit this accuracy during the learning of the structure. We use an expectation maximization (EM) algorithm to estimate the accuracies, considering the true structure as the hidden variable. As a second contribution, we develop a Bayesian score that considers the training data as well as the experts’ opinions to score different possible structures, and then a greedy search is done in the space of possible structures.
In Angluin’s L* Algorithm a learner constructs a sequence of hypotheses in order to learn a regular language. Each hypothesis is consistent with a larger set of observations and is described by a bigger model. From a behavioral perspective, however, a hypothesis is not always better than the previous one, because the length of a shortest counterexample that distinguishes a hypothesis from the target language may decrease. We present a simple modification of the L* algorithm that ensures that for subsequent hypotheses the length of a shortest counterexample never decreases, which implies that the distance to the target language never increases in a corresponding ultrametric. Preliminary experimental evidence suggests that our algorithm speeds up learning in practical applications by reducing the number of equivalence queries.
In clinical guidelines a sequence of actions can be devised to achieve a certain goal, for example, to determine how to treat a patient over a period of time. In practice, not all patients are handled according to guidelines, suggesting that uncertainty may exist to some degree. There exist datasets that track actual treatments, which can be used as a source for further construction of data-based models and then compared with protocol-based models. We propose a model constructed from actual care data that acts as a probabilistic view of protocols, in which patients belonging to some state of the model have a set of possible future states. In order to allow white-box insights about patients’ temporal evolution, we model states as arbitrary Bayesian networks over a set of given random variables. Thus, probabilistic relationships among variables of interest can be shown through time. In this presentation this problem will be described followed by a brief description of an initial proposal for the learning of the model from data.
High-tech systems increasingly depend on software: software controls, connects, and monitors almost every aspect of modern system operation. Therefore, overall system quality and reliability are more and more determined by the quality and reliability of the embedded software. This presentation will discuss some trends, issues, and challenges related to system quality and testing.
Some of the basic algorithms for learning the structure of Bayesian networks, such as the well-known K2 algorithm, require a prior ordering over the nodes as part of the input. It is well known that the accuracy of the K2 algorithm is highly sensitive to the initial ordering. In this presentation, we introduce the aggregation of ordering information provided by multiple experts to obtain a more robust node ordering. In order to reduce the effect of novice participants, the accuracy of each person is used in the aggregation process. The accuracies of participants, not known in advance, are estimated by the expectation maximization algorithm. Any contradictions occurring in the resulting aggregation are resolved by modelling the result as a directed graph and avoiding cycles in this graph. Finally, the topological order of this graph is used as the initial ordering in the K2 algorithm. The experimental results demonstrate the effectiveness of the proposed method in improving the structure learning process.
Simulink/Stateflow is commonly used in industry for designing, simulating, and analyzing control systems. It has good support from Matlab for computation; however, it lacks support for stochastic modeling. In this talk, we will first discuss how to bring stochastic choice and stochastic timing into Stateflow, and then discuss how statistical model checking techniques can be used for analyzing stochastic Simulink/Stateflow diagrams.
In this talk we compare the expressiveness of the Task Oriented Programming iTask approach of specifying interactive GUI applications with ObjectIO and Racket bigbang. ObjectIO is representative for the large class of traditional desktop widget based toolkits aiming to provide the programmer with full access to the underlying GUI toolkit in a functional style. In contrast, the Racket bigbang approach offers the much more restricted setting of a single window and canvas to which the programmer adds callback and image rendering functions in a pure functional style. We demonstrate that both the Racket bigbang and iTask approaches result in significantly smaller GUI specifications by means of a small case study of the game of tic-tac-toe.
Issues found in model-based testing of state-based systems are traces produced by the system under test that are not allowed by the model used as specification. It is usually easier to determine the error behind the reported issue when there is a short trace revealing the issue. Our model-based test system treats the system under test as a black box. Hence the test system cannot use internal information from the system under test to find short traces. This paper shows how the model can be used to systematically search for shorter traces producing an issue based on some trace revealing an issue.
Knowledge representation is an important topic in common-sense reasoning and Artificial Intelligence, and one of the earliest techniques to represent it is by means of knowledge bases encoded into logic clauses. Encoding knowledge, however, is prone to typos and other kinds of consistency mistakes, which may yield incorrect results or even contradictions with conflicting information from other parts of the same code. In order to overcome such situations, I propose a logic framework to debug knowledge bases that might serve as systems specifications. The approach rests on a strong theoretical framework for knowledge representation and reasoning, and comes with an on-line prototype where one can test logic programs. Such knowledge bases may have conflicting information, and the framework can be used to locate the possible source of conflict. The system can be employed to identify conflicts between the knowledge base itself and upcoming new information, and it can also be used to locate the source of conflict within a given inherently inconsistent static knowledge base. This talk is about an implementation of a declarative version of a future system that has been characterised to debug knowledge bases in a symbolic formalism, which can be generalised to other fields of research such as abductive reasoning, probabilistic reasoning, diagnosis and software development, amongst others.
Keywords: Answer Set Programming; Dynamic Knowledge Representation; theory transformation; non-monotonic reasoning; belief revision; reasoning; (weak) constraints; higher-order theories.
Parameterised Boolean Equation Systems (PBESs) are sequences of least and greatest fixpoint equations ranging over first-order logic formulae. Such equation systems are used to encode various verification problems, including model checking problems and equivalence checking problems. In this presentation, we focus on some recent developments in the theory for PBESs which have proven to be instrumental in developing techniques for simplifying PBESs prior to solving them. Among others, we describe a theory of abstraction for PBESs and compare this theory to existing abstraction theories for transition systems.
My talk is about possibilities for the integration of relational action and relational vision. The real potential of relational representations is that states can *share* information with actions (e.g. parameters, or more specifically *objects*). Possibilities exist to define novel languages for *interactive experimentation* with relational abstraction levels in the context of both complex visual input and complex behavioral output. This includes new types of interactions -- for example dealing with scarce human feedback, new types of experimentation -- for example incorporating visual feedback and physical manipulation, and new types of abstraction levels -- such as probabilistic programming languages. I will present first steps towards a tighter integration of relational vision and relational action for interactive learning settings. In addition I present several new problem domains.
With the widespread adoption of electronic health/medical records (EHRs/EMRs) and continuing improvements in data collection, electronic health data is growing at a staggering rate. An emerging area of research tries to use this data, in combination with domain knowledge, to build models (e.g., predictive) to help with tasks such as clinical decision support. My research revolves around designing artificial intelligence and machine learning tools that can help clinicians and medical researchers analyze, interpret, and exploit the burgeoning collection of data and knowledge. However, analyzing such data poses significant technical challenges for learning and reasoning due to their relational schemas, longitudinal nature, the presence of latent structure, and the fact that different patients may have dramatically different numbers of entries in any given table, such as diagnoses or vitals. Furthermore, it is important to model the uncertain, non-deterministic relationships between patients' clinical histories and current and future predictions about their health status.
In this talk, I will present some of my research on statistical relational learning, which is able to effectively address these challenges. In the first part of the talk, I will focus on using structured mammography reports to predict whether an abnormality is malignant. I will describe an algorithm for automatically constructing features from the data that leads to improved performance on a real-world dataset compared to both other machine learning algorithms and radiologists. In the second part of the talk, I will focus on predicting whether a patient will have an adverse reaction (ADR) to a medication. In particular, I will focus on the fact that latent information may be necessary to build accurate models. For example, a patient's clinical history records information about specific prescribed medications or specific disease diagnoses. It does not explicitly mention important connections between different medications or diagnoses, such as which other medications could have been prescribed to treat an illness, that may be relevant. I will describe an algorithm that, while learning a model, automatically discovers clusters of diseases or medicines that are informative for the specific prediction task. This leads to improved performance on three real-world ADR tasks.
The presentation describes experiences of applying model-based testing to an LTE Radio Base Station at Ericsson. LTE, Long Term Evolution, is the global standard for the fourth generation of mobile networks. The presentation describes what kinds of problems we faced during modelling, test generation and concretization. We provide a comparison between two model-based testing tools, based on our experience of modelling and test generation of LTE functionality.
The eccentricity of a node in a graph is defined as the length of a longest shortest path starting at that node. The eccentricity distribution over all nodes is a relevant descriptive property of the graph, and its extreme values allow the derivation of measures such as the radius, diameter, center and periphery of the graph. The diameter is defined as the length of the longest shortest path over all vertices in the graph, i.e. the largest eccentricity, and serves as a relevant property of all types of graphs that are nowadays frequently studied. Examples include social networks, webgraphs and routing networks. During this talk we propose a highly parallel algorithm based on eccentricity lower and upper bounds for computing the exact diameter of large real-world sparse graphs.
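A sketch of the core ingredients in Haskell (my own simplification; the talk's algorithm additionally selects BFS sources cleverly and runs them in parallel): one BFS from v yields ecc(v), and the triangle inequality then bounds every other node w by max(ecc(v) - d(v,w), d(v,w)) <= ecc(w) <= ecc(v) + d(v,w), so many nodes never need their own BFS.

```haskell
import qualified Data.Map.Strict as Map

type Graph = Map.Map Int [Int]  -- adjacency lists

-- Distances from a source by breadth-first search.
bfs :: Graph -> Int -> Map.Map Int Int
bfs g src = go (Map.singleton src 0) [src]
  where
    go dist []      = dist
    go dist (u : q) =
      let du    = dist Map.! u
          fresh = [ v | v <- Map.findWithDefault [] u g
                      , not (Map.member v dist) ]
          dist' = foldr (\v -> Map.insert v (du + 1)) dist fresh
      in go dist' (q ++ fresh)

-- Eccentricity of v: the largest BFS distance (assumes a connected graph).
ecc :: Graph -> Int -> Int
ecc g v = maximum (Map.elems (bfs g v))

-- Exact diameter as the maximum eccentricity.  The bounding algorithm
-- keeps per-node lower/upper eccentricity bounds derived from each BFS
-- and skips every node whose bounds can no longer affect the diameter;
-- this naive version performs one BFS per node instead.
diameter :: Graph -> Int
diameter g = maximum [ ecc g v | v <- Map.keys g ]
```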
The spectrum of branching-time relations for probabilistic systems has been investigated thoroughly by Baier, Hermanns, Katoen and Wolf (2003, 2005), including weak simulation for systems involving substochastic distributions. Weak simulation was proven to be sound w. r. t. the liveness fragment of the logic PCTL\X, and its completeness was conjectured. We revisit this result and show that soundness does not hold in general, but only for Markov chains without divergence. It is refuted for some systems with substochastic distributions. Moreover, we provide a counterexample to completeness. In this paper, we present a novel definition that is sound for live PCTL\X, and a variant that is both sound and complete.
This presentation is a practice talk for my QEST conference publication at the end of August. (Well, if Volkswagen doesn’t prohibit it.)
One of the main challenges when applying mathematical optimization is to construct a mathematical model describing the properties of a system. In this paper, we demonstrate how to learn such a model entirely from data using methods from machine learning. We illustrate our approach using the following optimization problem: ordering the resources in sequential auctions as to maximize the social welfare. This social welfare depends on combinatorial preferences and budget constraints of the bidders. Since these are unknown to the auctioneer, we propose to learn them from historical data. Given the learned model, we propose a novel white-box method to find an optimal ordering. Specifically, we take the outcome of a regression tree and use several constructions to map it into integer linear programs that can be used for optimization. We use an auction simulator and design several experiments to test the performance of the proposed method. Our experiments show that optimization based on historical data results in high social welfare. Furthermore, we compare our white-box approach with a black-box best-first search approach and show its advantages over the black-box method.
We present CoCaml, a functional programming language extending OCaml, which allows us to define functions on coinductive datatypes parameterized by an equation solver. We provide numerous examples that attest to the usefulness of the new programming constructs, including operations on infinite lists, infinitary lambda-terms and p-adic numbers.
(Joint work with Jean-Baptiste Jeannin and Dexter Kozen, Cornell University)
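A hedged analogy in Haskell (CoCaml itself extends OCaml, and this sketch is mine, not from the talk): CoCaml's coinductive data behave like Haskell's lazy, possibly infinite values, but CoCaml goes further where laziness alone cannot answer a question.

```haskell
-- Corecursive definitions of infinite lists are productive as long as
-- each step yields a constructor before recursing.
ones :: [Integer]
ones = 1 : ones

fibs :: [Integer]
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)

-- Laziness alone cannot answer questions like "the sum of ones": the
-- equation s = 1 + s has no solution in the natural numbers, so the
-- computation diverges.  CoCaml instead treats such a definition as an
-- equation and hands it to a programmer-supplied solver, which is how
-- it computes with infinite lists, infinitary lambda-terms and p-adics.
```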
Probabilistic logics combine the expressive power of logic with the ability to reason with uncertainty. Several probabilistic logic languages have been proposed in the past, each of them with their own features. In this paper, we propose a new probabilistic constraint logic programming language, which combines constraint logic programming with probabilistic reasoning. The language supports modeling of discrete as well as continuous probability distributions by expressing constraints on random variables. We introduce the declarative semantics of this language, present an exact inference algorithm to derive bounds on the joint probability distributions consistent with the specified constraints, and give experimental results. The results obtained are encouraging, indicating that inference in our language is feasible for solving challenging problems.
The Metis project aims at studying techniques supporting the development of large-scale dependable surveillance systems-of-systems for maritime safety and security. Surveillance systems, such as Metis, typically comprise a number of heterogeneous information sources and information aggregators. Among the main problems of their deployment is the scalability of such systems with respect to a potentially large number of monitored entities. One of the solutions to this problem is continuous and timely adaptation and reconfiguration of the system according to the changing environment it operates in. At any given timepoint, the system should use only a minimal set of information sources and aggregators needed to facilitate cost-effective early detection of indicators of interest.
Against the background of a description of the Metis prototype, I will introduce a theoretical framework for modelling scalable information-aggregation systems. We model such systems as networks of inter-dependent reasoning agents, each representing a mechanism for justification/refutation of a conclusion derived by the agent. The algorithm facilitating continuous reconfiguration is based on standard results from abstract argumentation and corresponds to the computation of a grounded extension of the argumentation framework associated with the system.
In my dissertation work I have had the opportunity to study the search and rescue operations of the Netherlands Coast Guard. Examples from this domain have largely influenced the evolution of the Task-Oriented Programming (TOP) paradigm which was the primary focus of my research. In my current post-doc project, the SENECA/Incidone project, we will continue our collaboration. The goal of this project is to develop and evaluate a small, yet fully operational, incident coordination tool based on our previous findings. This will allow us to test our TOP framework iTasks beyond the usual lab setting, and to develop it into a more robust basis for further TOP research. The talk will sketch the origins, the goals, the requirements and preliminary design of the Incidone tool we are working on.
Software systems are the engines of modern information society. Our ability to cope with the increasing complexity of software systems is limited by the programming languages we use to build them. Bridging the gap between domain concepts and the implementation of these concepts in a programming language is one of the core challenges of software engineering. Modern programming languages have considerably reduced this gap, but still require low-level programmatic encodings of domain concepts. Domain-specific software languages (DSLs) address this problem through _linguistic abstraction_ by providing notation, analysis, verification, and optimization that are specialized to an application domain. Language workbenches assist language designers with the _implementation_ of languages. However, DSL implementations rarely formally guarantee semantic correctness properties such as type soundness and behavior preservation of transformations, since current language workbenches provide no support for such verification, which leads to intricate errors in language designs that are discovered late.
Multimorbidity, i.e., the presence of multiple diseases within one person, is a significant health-care problem for western societies.
Diagnosis, prognosis and treatment in the presence of multimorbidity can be complex due to the various interactions between diseases.
To better understand the co-occurrence of diseases, we used several types of probabilistic graphical modeling.
In this talk we will elaborate on three aspects of our research.
1) Multilevel Bayesian networks were used to analyze co-occurrence of diseases in case of hierarchical health data.
This is illustrated in the cardiovascular area, showing synergistic effects between diseases, even when controlled for specific health risks.
2) A novel measure, based on the d-separation criteria in Bayesian networks, was proposed to identify the critical factors between two diseases.
This is illustrated in the oncological area for better understanding co-occurrences of malignant tumors.
3) The theory of qualitative probabilistic networks was extended to chain graphs to explore the dynamics of disease interactions with no natural direction.
In this talk I will explain what functional dependencies for multi-parameter type classes are and show some examples. This has been implemented in an experimental version of the Clean compiler. How this was added to a compiler that also supports uniqueness typing is explained. Termination problems of the context reduction algorithm with functional dependencies are discussed. A terminating algorithm that permits more type level computations like addition of Peano numbers is presented.
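For readers unfamiliar with the feature, here is the standard Haskell formulation of functional dependencies (the talk concerns Clean, whose syntax differs; the Peano example follows the abstract's mention of type-level addition):

```haskell
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies,
             FlexibleInstances, EmptyDataDecls, UndecidableInstances #-}

-- "c -> e": the collection type c uniquely determines its element type e.
class Collection c e | c -> e where
  empty  :: c
  insert :: e -> c -> c

instance Collection [a] a where
  empty  = []
  insert = (:)

-- Without the dependency, the use of "empty" below would be ambiguous,
-- since nothing would fix its element type; with it, c determines e.
singleton :: Collection c e => e -> c
singleton x = insert x empty

-- Type-level computation: addition of Peano numbers via a dependency.
data Z
data S n

class Add a b c | a b -> c
instance Add Z b b
instance Add a b c => Add (S a) b (S c)
```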
In this talk I will report on recent developments of UPPAAL and hint at current developments. In particular, I will present UPPAAL-SMC, which can be used for performance analysis of hybrid systems.
Since testing is an expensive process, test selection has been proposed as a way to reduce such expense. A good selection of tests can be done using specification coverage functions. Model-based ioco theory, however, uses test suites which are not suitable for computing coverage because of the interdependence of their test cases. We define a new test suite that overcomes such problems. Using such a test suite we also cast the test selection problem to a specification selection problem that aims at modifying the model to which a system under test must conform, in order to reduce the set of test cases. We also give a canonical representation for the newly defined test suite.
Weak simulation for probabilistic systems was defined by Baier, Hermanns, Katoen and Wolf (2003, 2005), including systems involving substochastic distributions. A goal was to establish a simulation relation that is sound and complete w. r. t. the liveness fragment of the logic PCTL\X. Unfortunately, the defined relation is only sound on finite Markov chains (without substochastic distributions) and is not complete. Together with Lijun Zhang and Lei Song, we found a new definition that hopefully is sound and complete.
TBA
In various domains, such as security and surveillance, a large amount of information from heterogeneous sources is continuously gathered to identify and prevent potential threats, but it is unknown in advance what the observed entity of interest should look like. The quality of the decisions made depends, of course, on the quality of the information they are based on. In this paper, we propose a novel method for assessing the quality of information taking uncertainty into account. Two properties of the information - soundness and completeness - are used to define the notion of information quality, and their expected values are defined using the output of a probabilistic model. Simulation experiments with data from a maritime scenario demonstrate the usage of the proposed method and its potential for decision support in complex tasks such as surveillance.
I show how to map the problem of supervised learning of deterministic finite state automata to a vertex coloring problem. Essentially, such a translation consists of two types of constraints: inequality and equality constraints. Inequality constraints ensure that positive and negative input examples cannot end in the same state (since a state cannot both accept and reject). Equality constraints ensure that strings that share their last symbol have to end in the same state if their prefixes end in the same state. Equality constraints greatly complicate the translation to vertex coloring. In my previous work, we therefore translated these constraints to satisfiability instead. We now provide the first translation that encodes these constraints in a pure vertex coloring instance, which has been an open problem for many years.
I will also give an intuitive explanation of automata learning, and discuss why the distinction between equality and inequality constraints is important.
This work was accepted at the conference on algorithmic learning theory (ALT) 2012.
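As a much-simplified sketch (all names and the sample format are mine), the two kinds of constraints over the prefix tree acceptor look as follows; the contribution described above is encoding the conditional equality constraints inside pure vertex coloring rather than SAT.

```haskell
import qualified Data.Map as Map

-- Labeled sample: (word, accepted?).
type Sample = [(String, Bool)]

-- States of the prefix tree acceptor (APTA): all prefixes in the sample.
aptaStates :: Sample -> [String]
aptaStates sample =
  Map.keys (Map.fromList [ (take k w, ()) | (w, _) <- sample
                                          , k <- [0 .. length w] ])

-- Inequality constraints: a state reached by a positive example and one
-- reached by a negative example may never receive the same color.  These
-- form the edges of the conflict graph.
conflicts :: Sample -> [(String, String)]
conflicts sample =
  [ (u, v) | (u, True) <- sample, (v, False) <- sample ]

-- Equality constraints are conditional: if u and v share a color, then
-- for every symbol a, the successors u ++ [a] and v ++ [a] must share a
-- color as well.  Expressing these within a pure vertex coloring
-- instance is the open problem the work above resolves.
```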
Have you ever wondered how a TomTom works? Or how a website suggests a travel route? In this presentation you will learn how a computer computes the shortest route, for example how to drive home from Arnhem as quickly as possible. And that is harder for a computer than you might think! But of course there are all kinds of tricks that make sure a TomTom can still point out the right way within a few seconds.
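For the curious, the textbook starting point is Dijkstra's algorithm; a compact Haskell sketch (names and the example map are mine; real navigation systems layer the mentioned tricks, such as heuristics and preprocessing, on top):

```haskell
import qualified Data.Map.Strict as Map
import Data.List (minimumBy)
import Data.Ord (comparing)

type Graph = Map.Map String [(String, Int)]  -- place -> (neighbour, minutes)

-- Dijkstra: shortest travel times from a source to all reachable places.
dijkstra :: Graph -> String -> Map.Map String Int
dijkstra g src = go (Map.singleton src 0) Map.empty
  where
    go frontier done
      | Map.null frontier = done
      | otherwise =
          let -- Pick the unfinished place with the smallest tentative time.
              (u, du) = minimumBy (comparing snd) (Map.toList frontier)
              done'   = Map.insert u du done
              -- Relax all edges out of u, keeping the smaller time.
              relax f (v, w)
                | Map.member v done' = f
                | otherwise          = Map.insertWith min v (du + w) f
              frontier' = foldl relax (Map.delete u frontier)
                                (Map.findWithDefault [] u g)
          in go frontier' done'

-- Example use: dijkstra roads "Arnhem" for a hypothetical map such as
-- roads = Map.fromList [("Arnhem", [("Nijmegen", 20)]), ...]
```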
Learning models of machines comes very naturally to humans. They construct a mental model of the behaviour of machines while trying out different options. For machines this is much harder. Algorithms have been developed to learn the behaviour of systems and describe them in state machine models. Once such a model has been learned it can be used by software engineers to improve their software. They can simulate and analyse the behaviour of the system, test newer versions of the system and get insight in legacy systems of which no documentation exists.
This master's thesis describes a case study done at Océ for the ITALIA project at the Radboud University Nijmegen. The goal of this research was to find out whether state-of-the-art techniques and tools for automata learning are powerful enough to learn models of industrial control software, specifically whether the LearnLib tool developed at the University of Dortmund was able to learn a model of the Engine Status Manager (ESM) developed at Océ. The ESM controls the transition from one status to another in a printer. It delegates status requests to the connected hardware components and coordinates their responses. Software like the ESM can be found in many embedded systems in one form or another. Although the ESM may seem simple, the many details and exceptions involved make it hard to learn.
A system is learned in two alternating phases: a learning phase in which a hypothesis model is created, and a testing phase in which the algorithm searches for a counterexample. This is implemented in LearnLib. In order to find counterexamples during the testing phase using fewer queries, novel techniques had to be developed. Although a correct model of the ESM could not be learned during this research, novel techniques were developed that greatly reduce the number of test queries needed.
In clinical practice one often has to deal with uncertainty in temporal processes. Probabilistic graphical models provide a means to model the uncertainty, but the time aspects require some further attention. Using a clinical case study (COPD patient monitoring), two topics will be touched upon: making predictions based on limited data and a first look at modelling temporal indeterminacy.
Arjen will present this paper at the ICML workshop on Machine Learning for Clinical Data Analysis.
Various machine learning techniques have been proposed for the development of prognostic models, including those based on Bayesian networks. An advantage of a Bayesian network compared to many other classifiers is that the model can provide insight by representing the temporal structure of the domain. While it has been shown that Bayesian networks can perform well in terms of classification accuracy, additional temporal domain knowledge can harm the classification performance in realistic applications. We propose to combine a naive parametric distribution with temporal domain knowledge, resulting in semi-causal Bayesian networks. We evaluate this approach in the development of a prognostic model for epithelial ovarian cancer, and argue that the model is understandable for domain experts and comparable to the performance of traditional prognostic models and tree-augmented naive classifiers.
Continuous Stochastic Logic (CSL) can be interpreted over continuous-time Markov decision processes (CTMDPs) to specify quantitative properties of stochastic systems that allow some external control. Model checking CSL formulae over CTMDPs then requires the computation of optimal control strategies to prove or disprove a formula. The paper presents a conservative extension of CSL over CTMDPs and exploits established results for CTMDPs for model checking CSL.
Joost Visser is Head of Research at the Software Improvement Group (SIG) and recently joined the department as part-time professor on Large Scale Software Systems. In this talk, Joost will give a short introduction of SIG's model-based methods for assessing and monitoring software systems. He will also provide an overview of some recent and ongoing research performed by SIG in collaboration with various research partners, and articulate some open research challenges.
As part of our ongoing research effort to better understand and support processes of (collaboratively) conceptualizing, expressing and communicating (about) models and specifications in a "model driven" system development context, we will present the current state of affairs. We will introduce the main theoretical concepts now underlying our approach, and also some more operational aspects: types of analysis underlying the development of our game setups, as well as some general aspects of the resulting game systems. Also we will present some experiments with Dialogue Game prototypes, and discuss some ongoing developments in industrial application of our ideas.
Formal models are abstract representations of reality. Therefore, the ability to form, comprehend and reason with abstractions is vitally important in modelling. However, it is also largely subject to individual differences. We hypothesize that shifts in abstraction levels aid the comprehension of domain knowledge, and that abstraction is facilitated by metacognitive components of executive control.
Humans can form abstractions on different levels, three of which have been identified in neuroscientific studies: concrete, medium abstract and highly abstract. Executive control involves behaviours such as inhibiting distractions, flexible shifting of attention, and monitoring progress both in oneself and in others.
We are currently studying modellers in real-life modelling situations. So far, we found that the degree to which participants have a mental representation of the domain affects abstraction. Formation of abstractions and shifts often concurred with explicit monitoring of task, goal, scope and other participants' behaviour. Reasoning within scope was also observed in stakeholders without any modelling background. Without a clear representation, few attempts at abstraction are made. There is no monitoring; instead, the behavioural pattern includes deviations from focus, difficulty understanding the scope of concepts and echoing peers. Apparently, shifts in abstraction levels aid comprehension only if a certain level of understanding is in place beforehand, and individual differences may override training.
Coordinating rescue operations for incidents at sea can be a complex task. In this talk we present an ongoing project that aims to develop an incident coordination tool to support it. This tool, Incidone, is based on the specification outlined by Lijnse et al. in “Capturing the Netherlands Coast Guard SAR Workflow with iTasks” and is therefore modeled after, but not necessarily limited to, the workflow of the Netherlands Coast Guard. The unique feature of Incidone is that it is the first tool of its kind developed using the Task-Oriented Programming paradigm. Therefore, we present the tool both from the perspective of its intended end-users as well as from the perspective of a software developer. The primary goal of the Incidone project is to provide an example of this method to developers of similar crisis management applications.
Model-based testing of single functions is usually based on logical properties as specification. In practice it appears to be rather hard to develop a sufficiently strong set of properties to spot all errors. The set of algebraic properties is often insufficient. For model-based testing of state based systems one usually employs a state machine as model and a conformance relation between this specification and the software under test. For (abstract) data types we can use both ways of model-based testing. In this talk we compare the specification effort required to make an effective model and the effectiveness of the associated tests for some well-known data types. Our examples show that it can be easier to write state based specifications. Moreover, conformance based testing of data types is more effective and very efficient.
David will present his paper for the NASA Formal Methods Workshop with coauthors Flemming Nielson and Lijun Zhang.
This paper establishes connections between logical equivalences and bisimulation relations for hidden Markov models (HMM). Both standard and belief state bisimulations are considered.
We also present decision algorithms for the bisimilarities. For standard bisimilarity, an extension of the usual partition refinement algorithm is enough. Belief bisimilarity, being a relation on the continuous space of belief states, cannot be described directly. Instead, we show how to generate a linear equation system in time cubic in the number of states.
Abstraction is the key when learning behavioral models of realistic systems. Hence, in most practical applications where automata learning is used to construct models of software components, researchers manually define abstractions which, depending on the history, map a large set of concrete events to a small set of abstract events that can be handled by automata learning tools. In this article, we show how such abstractions can be constructed fully automatically for a restricted class of extended finite state machines in which one can test for equality of data parameters, but no operations on data are allowed. Our approach uses counterexample-guided abstraction refinement: whenever the current abstraction is too coarse and induces nondeterministic behavior, the abstraction is refined automatically. Using Tomte, a prototype tool implementing our algorithm, we have succeeded in learning - fully automatically - models of several realistic software components, including the biometric passport and the SIP protocol.
The Institute for Computing and Information Sciences of Radboud University Nijmegen and the Enterprise Engineering Team cordially invite you to attend three public lectures on enterprise engineering.
We warmly welcome guests from both industry and academia.
Admission is free, but reservation is required due to limited seats: RSVP (for reservations or cancellations, please send an email to Ingrid Berenbroek: I.Berenbroek@cs.ru.nl)
There is a notable tension between, on the one hand, 'blueprint' approaches to enterprise engineering and, on the other, the phenomenon that in many respects, enterprises 'just happen'. Of course things are not so black and white; let's explore! And of course: how can we possibly deal with this in organizational design?
Erik Proper, Jan Dietz and James Taylor, all distinguished academic researchers with ample experience in business-IT alignment and organizational design, each approach this theme from their specific personal outlook, followed by a joint discussion.
Programme:
12:30 Arrivals and coffee
13:00 Opening, Stijn Hoppenbrouwers
13:10 Erik Proper: "Engineering and Emergence in Enterprises; a Story of Four Es"
13:55 Jan Dietz: "Enterprise Engineering and Enterprise Ontology"
14:40 Coffee
15:00 James Taylor: "Innovation and the Authoring of the Large Organization, Why the Problem?"
16:00 Discussion among the speakers, led by Stijn Hoppenbrouwers
17:00 Drinks
While Bayesian networks have proved themselves useful for analyzing health care data, they do not have the capability to learn from data that contains some hierarchical structure, which is often the case for real-world medical data. Since this may impact the validity of inference, specialized statistical techniques, such as multilevel regression, are often used to analyze such hierarchical data. In this paper, we combine Bayesian networks with multilevel analysis by introducing multilevel Bayesian networks. We apply these networks to analyze the problem of multimorbidity, which is a significant health-care problem for western societies, especially within the elderly. In particular, we used multilevel Bayesian networks to analyze clinical data from family practices in the Netherlands with the aim of predicting heart failure and diabetes mellitus. We compare the resulting model to conventional methods, which reveals at least the same classification performance but better insight into the interactions between multiple diseases.
In this talk, I will present two recent approaches for passively learning state machines. The general problem they aim to solve is to find an automaton model that best describes a given data set of strings (execution traces). One of the main algorithms for solving this problem for deterministic finite state machines is known as state-merging: start with a tree-like automaton model, then iteratively combine pairs of similar states of this model, until all remaining states are pairwise dissimilar. Both approaches are based on this simple yet effective greedy learning technique.
The first approach starts with the greedy state-merging algorithm, but stops its execution prematurely in order to solve the remaining part exactly using the power available in modern SAT solvers. Using this technique, we convincingly won the recent Stamina DFA learning competition.
The second approach adds a transition-splitting operation to the state-merging algorithm that can be used to learn the time-bounds of a deterministic real-time automaton. The transitions of such an automaton can contain guards that bound the amount of time that the automaton can remain in the current state before firing (activating) the transition. The resulting transition-splitting and state-merging algorithm is an efficient technique to learn such models from data.
Recently there has been a trend of crowdsourcing in which problems traditionally solved by individuals are now solved in a decentralized way by a group of people. Research has shown that this can lead to results having the same quality as when using a single expert. There are a number of factors, however, that can influence the quality of the labels produced by human annotators, such as varying levels of expertise, random errors, and input quality/difficulty. In this talk I will discuss various methods to deal with these kinds of factors in the context of Gaussian process regression with multiple labels.
This talk will consist of three parts:
Concepts from object orientation have been applied to many fields to facilitate solving complex real-world problems. Medicine is an example of such a complex field, where, however, the modeling of uncertainty is also of major importance. It is our belief that object orientation can also play a role in the medical field to make representing and reasoning with uncertain knowledge easier. However, little is known about how ideas from object orientation affect the specification and use of probability distributions. In this paper it is studied in what way structured probabilistic models can be organized in class hierarchies. We will provide a theoretical foundation of probabilistic models with object orientation, which are called probabilistic class hierarchies. This is expected to offer a basis for the modeling of complex problems, such as those in medicine, from which the examples used in this paper come.
When computer scientists discuss the computational complexity of, e.g., finding the shortest path from A to B, their starting point typically is a formal description of the problem at hand, e.g., a graph with weights on every edge. Given such a formal description, either tractability or intractability of the problem is established, by proving that the problem enjoys a polynomial time algorithm, respectively is NP-hard. However, this problem description is in fact an abstraction of the actual problem of being in A and desiring to go to B: it focuses on the relevant aspects of the problem (e.g., distances between landmarks and crossings) and leaves out a lot of irrelevant details.
This abstraction step is often overlooked, but may well contribute to the overall complexity of solving the problem at hand. For example, it appears that "going from A to B" is easier to abstract: it is fairly clear that the distance between A and the next crossing is relevant, and that the color of the roof of B is typically not. However, when the problem to be solved is "make X love me", where the current state is (assumed to be) "X doesn't love me", it is hard to agree on all the relevant aspects of this problem.
In this talk, I'll propose a framework for capturing the notion of relevance when it comes to finding a suitable problem representation, with the ultimate goal of formally separating "hard to represent" and "easy to represent" problem instances. This presentation is a reworked version of the presentation I gave at the Dagstuhl seminar "Computer Science & Problem Solving: New Foundations"
In many fields of science researchers are faced with the problem of estimating causal effects among some quantities of interest from non-experimental data. One common approach is to assume that the data have been generated by a directed acyclic graph (DAG), and to recover as much as possible of this graph structure from the observed data. For the class of linear non-Gaussian acyclic models, Shimizu et al. (2006) showed that the full graph can be learned efficiently from data if all relevant variables are included in the analysis. However, in the presence of latent variables, such as unobserved confounders, inferring the full model is computationally very expensive (Hoyer et al., 2008). Hence, for such models it can be beneficial to have ways of efficiently inferring partial but in the large sample limit correct information about the causal effects. In this talk, I will present a recently developed algorithm to learn total pairwise causal effects for all those pairs which are part of a so-called unconfounded set, i.e. a set which fulfills the assumption of including all relevant variables for the analysis. Furthermore, I will present a statistical test for detecting inconsistent estimates of the direct causal effect in such models.
Computer-science style reasoning about programs always benefits from notations that are consistent, are easy to calculate with, and that avoid where possible a dependence on auxiliary definitions that must be given in some surrounding narrative. We are more interested in dry, systematic prose --- and less interested in the mathematical poetry of objects that "enjoy" properties, where facts are said to be "easy to see" and thoughts are linked by whences, thences and hences.
Notations for elementary probability theory seem to respond especially well to a formal-methods-motivated reorganisation that recognises the essential role of free- and bound variables, of expressions standing in-line for functions, and of step-by-step calculation --- at least when the probabilities are applied to program semantics.
In this talk I will present an alternative notation for elementary probability theory that is principally inspired by computer-science notations already in use elsewhere; and I will give examples of it that are based on well known probability puzzles.
The talk is stand-alone, and not at all complicated; but its purpose is not to help to solve puzzles. Rather the motivation for paying the cost of using an alternative notation is provided by the previous talk (and other works), where it allows a much more succinct and reliable presentation of the essential ideas.
Pareto analysis is a broadly applicable method to model and analyze tradeoffs in multi-objective optimization problems. The set of Pareto optimal solutions is guaranteed to contain the best solution for any arbitrary cost function or selection procedure. This work introduces a method to explicitly take uncertainty into account during Pareto analysis. A solution is not modeled by a single point in the solution space, but rather by a set of such points. This is useful in settings with much uncertainty, such as during model-based design space exploration for embedded systems. A bounding-box abstraction is introduced as a finite representation of Pareto optimal solutions under uncertainty. It is shown that the set of Pareto optimal solutions in the proposed approach still captures exactly the potentially best solutions for any cost function as well as any way of reducing the amount of uncertainty. During model-based design space exploration, for instance, design and implementation choices that are made during the development process reduce the amount of uncertainty. Steps in such a refinement trajectory can render previously Pareto optimal solutions suboptimal. The presented results provide a way to ensure that early selections in the refinement process remain valid.
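To fix the Pareto terminology, here is a minimal Haskell sketch of Pareto filtering for a two-objective minimisation problem; the point representation and names are illustrative only, and note that in the work above a solution is a set of such points rather than a single point.

type Point = (Double, Double)   -- (objective 1, objective 2), both minimised

-- p dominates q if p is at least as good in every objective and strictly better in one.
dominates :: Point -> Point -> Bool
dominates (a1, a2) (b1, b2) = a1 <= b1 && a2 <= b2 && (a1 < b1 || a2 < b2)

-- Keep exactly the non-dominated points: the Pareto front.
paretoFront :: [Point] -> [Point]
paretoFront ps = [p | p <- ps, not (any (`dominates` p) ps)]

main :: IO ()
main = print (paretoFront [(1,5), (2,3), (3,4), (4,1), (2,2)])
-- [(1.0,5.0),(4.0,1.0),(2.0,2.0)]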
The Shadow model for qualitative noninterference security of sequential programs is a denotational model, complete with a refinement relation that preserves both functional- and security properties (the latter within its terms of reference). It was derived from a series of "gedanken" experiments in program-refinement algebra, then applied to Kripke structures as used for logics of knowledge.
The Hyperdistribution model for quantitative noninterference was later constructed with the Shadow in mind, but essentially independently. It turns out to have strong structural links to Hidden Markov Models.
The technical component of this talk will be to describe the two kinds of semantics, i.e. the Shadow- (qualitative, possibilistic) and Hyperdistribution- (quantitative, probabilistic) structures we have built for noninterference with a refinement partial order (i.e. an implementation order that respects secrecy). Unusually, the two models will be described at the same time, interleaved; in this way I will try to bring out their similarities and differences.
If time permits, an example will be given of the kind of program algebra that results, and how the qualitative- and quantitative algebras can work together.
Modern software has to deal with a huge amount of data from different sources. Data can be stored at different locations, like in memory, in files, or in databases. Multiple threads, processes and remote machines can access the same data concurrently. The evolution of software also affects the data sources it uses. Over time one might use additional data and change where and how it is stored. Often this requires a huge amount of software refactoring, because data has to be retrieved, combined and updated in a different way.
In this paper a uniform way of dealing with different kinds of shared data sources is proposed. It allows pieces of code to abstract from how data sources are implemented; the only information given is the type of the data that can be read and written. To deal with changing data sources, new data sources can be derived from existing ones using functional projections and composition. Therefore, code can be reused when the way data is stored is changed. This kind of abstraction fits perfectly in a pure functional language.
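To make the kind of abstraction concrete, here is a small Haskell sketch of a composable shared-data-source interface in the spirit of the paper (the paper itself uses Clean, and the names Source, memorySource and project are illustrative):

import Data.IORef

-- A source of values of type a: client code sees only how to read and
-- write it, not where the data actually lives.
data Source a = Source { readS :: IO a, writeS :: a -> IO () }

-- A source backed by memory; a file- or database-backed source would
-- expose exactly the same interface.
memorySource :: a -> IO (Source a)
memorySource x = do
  ref <- newIORef x
  pure (Source (readIORef ref) (writeIORef ref))

-- Derive a new source from an existing one via a functional projection.
project :: (a -> b) -> (a -> b -> a) -> Source a -> Source b
project get put src = Source
  { readS  = get <$> readS src
  , writeS = \b -> readS src >>= \a -> writeS src (put a b) }

main :: IO ()
main = do
  person <- memorySource ("Alice", 30 :: Int)
  let age = project snd (\(name, _) a -> (name, a)) person
  writeS age 31
  readS person >>= print   -- ("Alice",31)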
In Océ's professional printers, print quality has to be guaranteed under all circumstances. This is currently done by choosing conservative values for printing temperature and speed. In the OCTOPUS project, we have investigated the use of Bayesian networks to model the uncertain influences in a printer more precisely, resulting in less conservative values, and therefore higher productivity. In several case studies using simulated and real data, we show how these models can be used for control and media recognition. Furthermore, we show how these models can be created from expert knowledge, existing deterministic dynamic models, and available sensor data. This is joint work with Arjen Hommersom.
The product ASD:Suite of the Dutch company Verum provides a modeling approach which combines compositional model checking and code generation. The main approach of the tool suite will be explained and demonstrated from a user point of view. We present experiences with this tool for the design of control software for interventional X-ray systems at Philips Healthcare. This concerns formal interface definitions between the main system components and detailed design of control components. Finally, we discuss advantages and disadvantages of the approach.
For continuous-time Markov chains, the model-checking problem with respect to continuous-time stochastic logic (CSL) has been introduced and shown to be decidable by Aziz, Sanwal, Singhal and Brayton in 1996. The presented decision procedure, however, has exponential complexity. In this paper, we propose an effective approximation algorithm for full CSL.
The key to our method is the notion of stratified CTMCs with respect to the CSL property to be checked. We present a measure-preservation theorem allowing us to reduce the problem to a transient analysis on stratified CTMCs. The corresponding probability can then be approximated in polynomial time (using uniformization). This makes the present work the centerpiece of a broadly applicable full CSL model checker.
Recently, the decision algorithm by Aziz et al. was shown to be incorrect in general. In fact, it works only for stratified CTMCs. As an additional contribution, our measure-preservation theorem can be used to ensure the decidability for general CTMCs.
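As background on the uniformization step mentioned above (standard CTMC machinery, not this paper's contribution), here is a rough Haskell sketch approximating the transient distribution of a small CTMC; the rate matrix and the crude truncation bound are illustrative.

import Data.List (transpose)

type Vec = [Double]
type Mat = [[Double]]   -- row-major

-- Row vector times matrix.
vecMat :: Vec -> Mat -> Vec
vecMat v m = [sum (zipWith (*) v col) | col <- transpose m]

-- pi(t) = sum_k e^(-qt) (qt)^k / k! * pi0 * P^k, where P = I + Q/q is the
-- uniformized DTMC and q is at least the largest exit rate of Q.
transient :: Mat -> Double -> Vec -> Double -> Vec
transient qm q pi0 t = go (0 :: Int) 1.0 pi0 (replicate n 0)
  where
    n = length pi0
    p = [[ (if i == j then 1 else 0) + (qm !! i !! j) / q
         | j <- [0 .. n - 1] ] | i <- [0 .. n - 1]]
    kMax = 60  -- crude truncation; real tools choose this to bound the error
    go k w v acc
      | k > kMax  = acc
      | otherwise = go (k + 1) (w * q * t / fromIntegral (k + 1)) (vecMat v p)
                       (zipWith (+) acc (map ((w * exp (-(q * t))) *) v))

main :: IO ()
main = print (transient [[-2, 2], [1, -1]] 2 [1, 0] 0.5)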
The functional programming languages Clean and Haskell have been around for a while. Over time, both languages have developed a large body of useful libraries and come with interesting language features. It is our primary goal to benefit from each other’s evolutionary results and to facilitate the exchange of sources between Clean and Haskell. This is achieved by using the existing Clean compiler as starting point, and implementing a double-edged front end for this compiler: it supports both standard Clean 2.1 and (currently a large part of) standard Haskell 98. Moreover, it allows both languages to seamlessly use many of each other’s language features that were alien to each other before. For instance, Haskell can now use uniqueness typing anywhere, and Clean can use newtypes efficiently. This has given birth to two new dialects of Clean and Haskell, dubbed Clean* and Haskell*. Additionally, measurements of the performance of the new compiler indicate that it is on par with the flagship Haskell compiler GHC.
EMV is the leading international standard for electronic payments using smartcards, initially developed by EuroPay, Mastercard and Visa. The EMV standard is extremely complex. It involves hundreds of pages of documentation and allows many options and configurations -- including proprietary ones.
We constructed a formal model of the EMV protocol suite in the functional programming language F#. Security analysis is then possible using the protocol verifier ProVerif, via a translation of F# to the applied pi calculus. In the analysis all known weaknesses are revealed.
State-of-the-art tools for active learning of state machines are able to learn state machines with at most on the order of 10,000 states. This is not enough for learning models of realistic software components which, due to the presence of program variables and data parameters in events, typically have much larger state spaces. Abstraction is the key when learning behavioral models of realistic systems. Hence, in most practical applications where automata learning is used to construct models of software components, researchers manually define abstractions which, depending on the history, map a large set of concrete events to a small set of abstract events that can be handled by automata learning tools. In this article, we show how such abstractions can be constructed fully automatically for a restricted class of extended finite state machines in which one can test for equality of data parameters, but no operations on data are allowed. Our approach uses counterexample-guided abstraction refinement: whenever the current abstraction is too coarse and induces nondeterministic behavior, the abstraction is refined automatically. Using a prototype implementation of our algorithm, we have succeeded in learning --- fully automatically --- models of several realistic software components, including the biometric passport and the SIP protocol.
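To illustrate the flavour of such a mapper (a toy sketch, not the article's actual construction), the following Haskell fragment abstracts concrete events carrying a data parameter into a small abstract alphabet, using only equality tests and a history of previously seen values:

import qualified Data.Set as Set

data ConcreteIn = Login Int | Logout        -- the Int is a data parameter
data AbstractIn = LoginFresh | LoginKnown | LogoutA deriving Show

-- The mapper is history-dependent: its state is the set of values seen so far.
abstractIn :: Set.Set Int -> ConcreteIn -> (AbstractIn, Set.Set Int)
abstractIn seen (Login v)
  | v `Set.member` seen = (LoginKnown, seen)
  | otherwise           = (LoginFresh, Set.insert v seen)
abstractIn seen Logout  = (LogoutA, seen)

main :: IO ()
main = print (run Set.empty [Login 7, Login 7, Login 9, Logout])
  where
    run _ []       = []
    run s (e : es) = let (a, s') = abstractIn s e in a : run s' es
-- [LoginFresh,LoginKnown,LoginFresh,LogoutA]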
The dynamic nature of crisis response operations and the rigidity of workflow modelling languages are two things that do not go well together. A recent alternative approach to workflow management systems that allows for more flexibility is the iTask system. It uses an embedded functional language for the specification of workflow models that integrates control-flow with data-flow in dynamic data-dependent workflow specifications. Although there is a variety of publications about the iTask workflow definition language (WDL) and its implementation, its applications have been limited to academic toy examples. To explore the iTask WDL for crisis response applications, we have developed an iTask specification of the Search And Rescue (SAR) workflow of the Netherlands Coast Guard. In this specification we capture the mix of standard procedures and creative improvisation of which a SAR operation consists.
We discuss our experiences in building the computing core of a high performance, mission critical SOA / EDA system. The system involved is responsible for the correct registration, computation and payment of approximately 10 million social benefits per month. We will present:
Speaker: Gert Veldhuijzen van Zanten is senior MDD architect at Capgemini. As a member of the Advanced Solutions/FMDD group he works in the areas of model-driven development, domain specific languages and transformation architectures. Gert holds an MSc and a PhD from the University of Twente (The Netherlands). Before his current position at Capgemini, he has been post-doc at the Center for User-System Interaction in Eindhoven, IT consultant at Ordina, and assistant professor in computer science at the Radboud University Nijmegen.
Workflow management systems (WFMS) are software systems that coordinate the tasks human workers and computers have to perform to achieve a certain goal. The tasks to do and their interdependencies are described in a Workflow Description Language (WDL). Work can be organized in many, many ways, and in the literature more than a hundred useful workflow patterns for WDLs have already been identified.
The iTask system is not a WFMS, but a combinator library for the functional language Clean to support the construction of WFMS applications. Workflows can be described in a compositional style, using pure functions and combinators as self-contained building blocks. Thanks to the expressive power of the underlying functional language, complex workflows can be defined on top of just a handful of core task combinators. However, it is not sufficient to define the tasks that need to be done. We also need to express the way these tasks are being supervised, managed and visualized. In this talk we report on our current research effort to extend the iTask system such that the coordination of work can be defined as special tasks in the system as well. We take the opportunity to redesign editors which share information, and the combinators for defining GUI interfaces for tasks, such as buttons, menus and windows. Even though the expressiveness of the resulting system increases significantly, the number of core combinators can be reduced. We argue that only a few general Swiss-Army-Knife higher-order functions are needed to obtain the desired functionality. This simplifies the implementation significantly and increases the maintainability of the system. We discuss the design space and the decisions that lead to these general functions for constructing tasks.
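To give a feel for the combinator style (a toy sketch only; the real iTask system is written in Clean and is far richer), consider a console-bound task type whose core combinator is sequential composition; all names here are illustrative:

newtype Task a = Task { runTask :: IO a }

-- Sequential composition: do one task, feed its result to the next.
(>>>=) :: Task a -> (a -> Task b) -> Task b
Task t >>>= f = Task (t >>= runTask . f)

-- Primitive "editor" tasks.
enterInfo :: Read a => String -> Task a
enterInfo prompt = Task (putStrLn prompt >> fmap read getLine)

viewInfo :: Show a => a -> Task ()
viewInfo = Task . print

main :: IO ()
main = runTask (enterInfo "Enter a number:" >>>= \n -> viewInfo ((n :: Int) * 2))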
Wormhole switching is a switching technique nowadays commonly used in networks-on-chips (NoCs). It is efficient but prone to deadlock. The design of a deadlock-free adaptive routing function constitutes an important challenge. We present a novel algorithm for the automatic verification that a routing function is deadlock-free in wormhole networks. A sufficient condition for deadlock-free routing and an associated algorithm are defined. The algorithm is proven complete for the condition. The condition, the algorithm, and the correctness theorem have been formalized and checked in the logic of the ACL2 interactive theorem proving system. The algorithm has a time complexity in O(N^3), where N denotes the number of nodes in the network. This outperforms the previous solution of Taktak et al. by one degree. Experimental results confirm the high efficiency of our algorithm. This paper presents a formally proven correct algorithm that detects deadlocks in a 2D-mesh with about 4000 nodes and 15000 channels within seconds.
Testing Industrial Printers at Océ
In this talk I will present my work on model-based testing of printer control software at Océ. The controller behaves as a function, taking simple inputs and producing simple outputs. However, there are many input and output parameters, and the dependencies between them can be complex. I will present our approach to modelling this system and how we cope with the complexity.
Abstraction Learning within Model Learning
Model learning algorithms typically focus on small and finite-state systems due to complexity problems. To handle large or infinite-state systems, a mapper is placed in between the Learner and the SUT. This mapper can abstract the complex actions of the SUT into manageable abstract actions for the Learner. Such a mapper usually has to rely on a priori knowledge about the SUT. We present an approach to automatically learn such abstractions without any knowledge of the internal workings of the SUT.
Managing chronic disease through automated systems has the potential to both benefit the patient and reduce health-care costs. We are developing and evaluating a monitoring system for patients with chronic obstructive pulmonary disease which aims to detect exacerbations and thus help patients manage their disease and prevent hospitalisation. We have carefully drafted a system design consisting of an intelligent device that is able to alert the patient, collect case-specific subjective and objective physiological data, offer a patient-specific interpretation of the collected data by means of probabilistic reasoning, and send data to a central server for inspection by health-care professionals. A first pilot with actual COPD patients suggests that an intervention based on this system could be successful.
The semantics of CP-logic can be intuitively explained by means of the causal process semantics. The outcome of the processes described by a CP-logic theory is a set of possible worlds, each with an associated probability. These semantics are intuitive and elegant, but do not describe how to efficiently calculate the probability of these worlds. In our work we have investigated how to efficiently calculate the probability that the processes described by a CP-theory lead to a particular world.
In this talk we focus on three topics with respect to inference for CP-logic. First, we show how the process semantics relate to Bayesian networks and how we can apply known inference methods for Bayesian networks to CP-logic. Bayesian networks, however, are in essence based on conditional (in)dependencies, while CP-logic contains contextual and causal (in)dependencies. The second topic centers on exploiting contextual and causal independence in Bayesian networks to improve the efficiency of inference. We propose a method that combines contextual variable elimination and multiplicative factorization, called contextual variable elimination with overlapping contexts. Lastly, we discuss the first-order aspects of CP-logic and their impact on inference. This topic concerns the first-order Bayes-ball method, which extracts the requisite part of a CP-theory necessary to answer a query on the first-order level. We show how this method simplifies the shattering operation necessary for lifted inference.
This presentation introduces a new modular representation of Markov decision processes (MDPs). This representation is based on the language of CP-logic augmented with non-deterministic choices. I will present the semantics of this new language and will propose a method for the formal verification of these models using probabilistic model checking. Furthermore, I will present an application of this logic to the analysis of clinical practice guidelines. As a bonus, I will tell you about your chances of surviving the flu season.
I will try to give a user-friendly introduction to Gaussian processes (GPs). A GP is a non-parametric method that can be used for learning smooth functions from data. GPs provide a principled, practical, probabilistic approach to learning in kernel machines, and have received increased attention in the machine-learning community over the past decade. Over the last three years I have gained considerable experience with GPs, and I believe the method is an excellent tool for addressing many, many scientific problems. Attend and be convinced too!
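For reference, the central formula of GP regression (standard material, not specific to this talk): given training inputs X with noisy targets y, a kernel k with kernel matrix K over X, and noise variance \sigma^2, the predictive distribution at a test input x_* is Gaussian with

\mu_* = k_*^\top (K + \sigma^2 I)^{-1} y,   \sigma_*^2 = k(x_*, x_*) - k_*^\top (K + \sigma^2 I)^{-1} k_*,

where k_* is the vector of kernel values between x_* and the training inputs.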
In order to facilitate model-based verification and validation, effort is underway to develop techniques for generating models of communication system components from observations of their external behavior. Most previous work has employed regular inference techniques, which generate modest-size finite-state models. They typically suppress parameters of messages, although these have a significant impact on control flow in many communication protocols. We present a framework that adapts regular inference to include data parameters in messages and states, in order to generate models of components with large or infinite message alphabets. A main idea is to adapt the framework of predicate abstraction, successfully used in formal verification. Since we are in a black-box setting, the abstraction must be supplied externally, using information about how the component manages data parameters. We have implemented our techniques by connecting the LearnLib tool for regular inference with the protocol simulator ns-2, and generated a model of the SIP component as implemented in ns-2.
Most Probable Explanation and (Partial) MAP are well-known problems in Bayesian networks that correspond to Bayesian or probabilistic inference of the most probable explanation of observed phenomena given full or partial evidence. These problems have been studied extensively, both from a knowledge engineering and from a complexity-theoretic point of view; algorithms, both exact and approximate, have been studied exhaustively. In this paper, we introduce two new notions of abduction-like problems in Bayesian networks, motivated by cognitive science, namely the problem of finding the most simple and the most informative explanation for a set of variables, given evidence. We define and motivate these problems, and show that they are computationally intractable in general, but become tractable when particular constraints are met.
In multiprocessor systems with many data-intensive tasks, a bus may be among the most critical resources. Typically, allocation of bandwidth to one (high-priority) task may lead to a reduction of the bandwidth of other tasks, and thereby effectively slow down these tasks. WCET analysis for these types of systems is a major research challenge. In this paper, we show how the dynamic behavior of a memory bus and a USB in a realistic printer application can be faithfully modeled using timed automata. We analyze, using Uppaal, the worst case latency of scan jobs with uncertain arrival times in a setting where the printer is concurrently processing an infinite stream of print jobs.
In embedded systems, power is a very scarce resource. Both the diversity of applications and the demand for integration into single appliances increase, despite power problems. Power consumption can be reduced by specializing chips, whereas integration increases when generalizing them. This is why Multi-Processor System-on-Chips (MPSoCs) have become popular; they offer flexibility and integration by providing several processors, each with their own specialization, in a single chip. We like to look at MPSoCs as platforms, but --- especially when real-time constraints are involved --- the assignment of tasks to processors is a design-time process. This means that two applications that require the same resource cannot be run simultaneously, and that when a resource becomes faulty, a set of applications can no longer run. This talk is about a way to loosen this constraint, by means of a heuristic approach to run-time resource assignment.
(This presentation is an exercise talk for my QEST 2010 presentation next week.)
We introduce FORTUNA, the first tool for model checking priced probabilistic timed automata (PPTAs). FORTUNA can handle the combination of real-time, probabilistic and cost features which is required for addressing key design trade-offs that arise in many practical applications, for example the Zeroconf, Bluetooth, IEEE 802.11 and FireWire protocols, protocols for sensor networks, and scheduling problems with failures. PPTAs are an extension of probabilistic timed automata (PTAs) with cost-rates and discrete cost increments on states. FORTUNA is able to compute the maximal probability by which a state can be reached under a certain cost-bound (and time-bound). Although this problem is undecidable in general, there exists a semi-algorithm that produces a nondecreasing sequence of probabilities converging to the maximum. This paper presents a number of crucial optimizations of that algorithm. We compare the performance of FORTUNA with existing approaches for PTAs. Surprisingly, although PPTAs are more general, our techniques exhibit superior performance.
Lazy functional programs are appropriately evaluated by graph reduction, where graphs represent expressions with the possibility of sharing and recursive reference. There are high-performance implementations of graph-reduction for stock PCs, but the compilers are complex as PCs are not engineered for this kind of computing.
The Reduceron is a comparatively simple graph-reduction processor implemented in reconfigurable hardware (FPGA). In essence it performs sequential reduction by template instantiation. But the design exploits parallel memory access to realise each of the machine's reduction rules in a single clock-cycle. Other uses of low-level parallelism include dynamic analyses for graph-update avoidance and for the speculative simultaneous reduction of primitive redexes during the instantiation of function bodies.
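As a toy illustration of reduction by template instantiation (vastly simpler than the Reduceron itself, and entirely sequential), here is a Haskell sketch in which function bodies are templates over argument slots that get filled in on application; all names are illustrative:

data Expr = Var Int                        -- argument slot in a template
          | Lit Int
          | App String [Expr]              -- saturated call of a named function
          | Prim (Int -> Int -> Int) Expr Expr

-- Each definition is a template: a body mentioning Var 0, Var 1, ...
defs :: [(String, Expr)]
defs = [("double", Prim (+) (Var 0) (Var 0))]

-- Instantiation copies the template, filling in the argument slots.
instantiate :: [Expr] -> Expr -> Expr
instantiate args (Var i)      = args !! i
instantiate _    (Lit n)      = Lit n
instantiate args (App f es)   = App f (map (instantiate args) es)
instantiate args (Prim f a b) = Prim f (instantiate args a) (instantiate args b)

eval :: Expr -> Int
eval (Lit n)      = n
eval (Prim f a b) = f (eval a) (eval b)
eval (App f es)   = case lookup f defs of
                      Just body -> eval (instantiate es body)
                      Nothing   -> error ("unknown function: " ++ f)
eval (Var _)      = error "unfilled argument slot"

main :: IO ()
main = print (eval (App "double" [Lit 21]))   -- 42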
Discounted CTL is a logic defined by de Alfaro, Faella, Henzinger, Majumdar, and Stoelinga with two main features: (1) truth values are not restricted to 0 and 1, but may be any real number in the interval [0,1]; (2) discounting, i.e. a device to give more weight to the near future than to the far future. These features allow one to express more precisely to which degree a model satisfies its specification. Recently, Axel Legay, Mariëlle Stoelinga and I have worked on adapting DCTL to a setting with continuous-time models. We have defined the semantics on continuous-time Markov decision processes and found model checking algorithms. The presentation will provide our (sometimes preliminary) results and invite discussion.
We present our experiences on teaching software engineering in teams which are organized around different R+D projects. These long-running, innovative projects are carried out in cooperation with industrial partners, and are supported by student exchange. While MSc and PhD students work together with faculty staff members on the projects in an industrial-like environment, the students develop skills that would be otherwise very hard for them to obtain. The methodological contributions are illustrated by, and substantiated with, the description of a concrete software engineering project.
Based on the work of Even and Meyer, Hosch and Landweber proved that the existence of the solution of Church's Problem by a strategy with finite delay is decidable. In our work we study Church's Problem in the framework of infinite games with MSO-conditions. We construct d-delay attractor strategies in reachability games over the corresponding d-delay game graphs, to determine the existence of winning strategies with a fixed delay for player Output. We illustrate that the function representing the size of the winning region of player Output in a certain game depending on the delay d is monotone, but not necessarily strictly monotone. The problem of deciding whether player Output has a winning strategy with finite delay is raised. We prove that player Output wins a reachability game with finite delay if and only if player Output has a winning strategy with delay 2^n over the corresponding game graph with n vertices. As a consequence, we obtain a new proof for the result of Hosch and Landweber.
Providing situated modeling support for domain experts, so that they may be actively involved in the modeling process, is our ideal scenario. To work towards this scenario, it is very important to understand the psychology of domain experts. They are non-technical people who have a very different way of thinking than IT analysts, and therefore we have to appeal to their perspective when designing modeling support programs.
In this talk, I will give an overview of a study in which modeling novices and modeling experts are compared with regard to their ways of modeling. Using a very simple formalism, participants were asked to create an information domain model of a familiar domain, a library. We analyzed both the resulting models and the ways through which those models were created.
We have formulated a hypothesis about modeling behavior, which will now require further validation with a larger test group.
Networks-on-chips (NoCs) are emerging as a promising interconnect solution for efficient Multi-Processor Systems-on-Chips. We propose a methodology that supports the specification of parametric NoCs. We provide sufficient constraints that ensure deadlock-free routing, functional correctness, and liveness of the design. To illustrate our method, we discharge these constraints for a parametric NoC inspired by the HERMES architecture.
Since its conception as an experimental library for expressing workflows in a functional language, the iTask system has evolved a lot. A major change we implemented last year has been a redesign of the system's core as a services-based engine with an Ajax client application. In this talk I will give a short overview of these developments, and zoom in on some of the technical challenges we had to tackle, such as high-level generic interface generation and client-side interpretation. Additionally, I will give a demonstration of the current version of the system.
This talk explores links between two widely used models of reactive systems: I/O automata and Mealy machines. We show how any algorithm for active learning of Mealy machines can be used for learning output deterministic I/O automata in the sense of Lynch, Tuttle and Jonsson. The main idea is to place a transducer in between the I/O automata teacher and the Mealy machine learner, which translates concepts from the world of I/O automata to the world of Mealy machines, and vice versa.
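To fix notation for the Mealy side of this construction, here is a minimal Haskell sketch of a Mealy machine and its run function; the representation is illustrative:

data Mealy s i o = Mealy { initial :: s, step :: s -> i -> (o, s) }

run :: Mealy s i o -> [i] -> [o]
run m = go (initial m)
  where go _ []       = []
        go s (x : xs) = let (o, s') = step m s x in o : go s' xs

-- Example: a one-bit toggle.
toggle :: Mealy Bool () Bool
toggle = Mealy False (\s () -> (not s, not s))

main :: IO ()
main = print (run toggle [(), (), ()])   -- [True,False,True]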
The most frequent type of cancer among women worldwide is breast cancer. When breast cancer is detected in an early stage, the chances of successful treatment are high. Screening programmes have been shown to reduce the mortality rate of breast cancer. Studies have shown that radiologists fail to identify a significant number of cases with breast cancer due to misinterpretation. To address the problem of interpretation failure by radiologists, the B-SCREEN project investigates the use of Bayesian networks and Bayesian classifiers. This study focuses on the learning of Bayesian networks from data that is available from the Dutch breast cancer screening programme, using different structure and parameter learning techniques. The applicability of these techniques is verified using experiments. This study concludes that it is possible to use structure and parameter learning techniques to learn Bayesian classifiers that perform reasonably well using data from breast cancer screening programmes. The network structures give insights into the correlation of certain variables in the breast cancer domain. However, the performance of these classifiers is still lower than when other classification methods are used.
Medical screening is a method for early diagnosis of disease, usually administered to individuals without current symptoms, but who may be at high risk for certain adverse health outcomes. Apart from their proven benefits, screening programs have shown potential adverse effects such as high false positive rates (overdiagnosis) or missed diagnoses. In addition, the involvement of large populations often makes screening costly, time-consuming and creates high working pressure for human experts. As potential assistants to the screening readers, computer-aided detection (CAD) systems have emerged in the last 15 years. One typical application is CAD for breast cancer screening programs. In this talk, I will present my recent research on the development of a breast cancer CAD system based on Bayesian network models. I will outline the potential benefits and limitations of the proposed methodology and demonstrate its practical application. Finally, I'll give a short overview of my current work concerning the development of a decision support system using probabilistic graphical models for another medical domain: pregnancy care.
PORTFOLIO
Everywhere in the world of education, the *portfolio* has been a fashionable topic for the past few years. Almost everyone has by now had experience with something going by that name, many have suffered under it, and almost nobody finds it useful.
But what is it, really? That question is unfortunately muddied by the many systems and databases that go by the same name.
Every lecturer must now (1) stay alert and (2) keep matters in their own hands, to help develop examples of something that is both useful and enjoyable to do. Otherwise the faculty or university will impose something on us that causes a lot of work and friction without contributing to good teaching or to the personal development of students.
After a short introduction, let us brainstorm about this.
We present a typical synergy between dynamic types (dynamics) and generalised algebraic datatypes (GADTs). The former provide a clean approach to integrating dynamic typing in a statically typed language: they allow values to be wrapped together with their type in a uniform package, deferring type unification until run-time using a pattern match annotated with the desired type. The latter allow constructor types to be specified explicitly, so as to enforce their structural validity. In contrast to ADTs, GADTs are heterogeneous structures, since each constructor type is implicitly universally quantified. Unfortunately, pattern matching only enforces structural validity and does not provide instantiation information on polymorphic types. Consequently, functions that manipulate such values, such as a type-safe update function, are cumbersome due to boilerplate type-representation administration. In this talk we focus on improving such functions by providing a new type annotation of GADT values via a natural synergy with dynamics. We describe the semantics of the type annotation and touch on other novel applications of this technique, such as type dispatching and type equality invariants on GADT values.
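A small GHC Haskell sketch of the two ingredients separately (the talk itself is set in Clean, and this does not show the talk's new annotation mechanism): a GADT whose constructors fix their result types, and a dynamic that wraps a value together with its type and defers type unification until run-time:

{-# LANGUAGE GADTs #-}
import Data.Dynamic

-- A GADT of typed expressions: each constructor states its result type.
data Expr a where
  Lit  :: Int -> Expr Int
  Plus :: Expr Int -> Expr Int -> Expr Int
  IsZ  :: Expr Int -> Expr Bool

eval :: Expr a -> a
eval (Lit n)    = n
eval (Plus a b) = eval a + eval b
eval (IsZ a)    = eval a == 0

main :: IO ()
main = do
  let d = toDyn (eval (Plus (Lit 1) (Lit 2)))  -- a Dynamic packaging an Int with its type
  print (fromDynamic d :: Maybe Int)    -- Just 3: run-time type unification succeeds
  print (fromDynamic d :: Maybe Bool)   -- Nothing: the types do not unify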
In this talk, I'll discuss the claim from my Ph.D. thesis that procedures for probabilistic inference (in particular: exact inference in a discrete Bayesian network) are better expressed with relational algebra, the widely used language in which database queries are optimized. Relational algebra combines an efficient operational semantics with the advantages of algebraic transformation. In this framework, an inference query corresponds to a canonical relational expression, and a conventional inference procedure such as variable elimination can be seen as a heuristic to translate this expression into a more efficient one.
Going beyond conventional inference procedures, relational algebra can be used to minimize the cardinality (i.e. the number of nonzero probabilities) of intermediate relations, instead of the dimensionality (the number of probabilistic variables). I demonstrate this using a sensor data example.
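A toy Haskell sketch of the idea (illustrative only, not the thesis' actual formalisation): a factor stored as a sparse relation over named variables, with natural join and projection as the inference operators. Because only rows with nonzero probability are stored, both operators exploit low cardinality directly.

import qualified Data.Map as Map

type Row      = Map.Map String Int   -- an assignment of values to variables
type Relation = [(Row, Double)]      -- only rows with nonzero probability

-- Natural join: multiply the probabilities of rows that agree on shared variables.
joinRel :: Relation -> Relation -> Relation
joinRel r s = [ (Map.union a b, p * q)
              | (a, p) <- r, (b, q) <- s
              , and [ v == w | (k, v) <- Map.toList a
                             , Just w <- [Map.lookup k b] ] ]

-- Project a variable away, summing probabilities (marginalisation).
projectOut :: String -> Relation -> Relation
projectOut x r = Map.toList (Map.fromListWith (+)
                   [ (Map.delete x row, p) | (row, p) <- r ])

main :: IO ()
main = do
  let pA  = [ (Map.fromList [("A", a)], if a == 1 then 0.3 else 0.7) | a <- [0, 1] ]
      pBA = [ (Map.fromList [("A", a), ("B", b)], p)
            | (a, b, p) <- [(0,0,0.9), (0,1,0.1), (1,0,0.2), (1,1,0.8)] ]
  print (projectOut "A" (joinRel pA pBA))   -- the marginal over B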
For the non-probabilistically inclined, I'll be sure to start the talk with an introduction on probabilistic models and inference.
Designing distributed systems is not an easy task. Progress has been made in this regard with the development of formal specification languages and verification tools. One area that is usually not addressed is the deployment of a system. This is unfortunate as the deployment can be critical to the performance.
Finding an optimal deployment is a combinatorial optimization problem; such problems are computationally challenging, with many instances being NP-hard. Different techniques exist to solve these problems; however, what works best for a specific problem is difficult to predict. Mixed integer programming and constraint programming have been considered for the deployment problem in the past. This research extends that work by investigating the use of constraint-based local search and hybrid methods.
The deployment problem of two distributed system architectures in particular is considered, and methods are presented to solve them. These methods offer a novel and sophisticated integration of several techniques. They are not superior to the existing methods in every way, but strike different trade-offs between the time that is spent searching for a deployment and the guarantees that are made on the quality of the obtained solution.
Probabilistic networks are widely used in decision support systems, for example in medical domains. It is known that many problems related to these networks are NP-hard and thus intractable in general. However, these problems may become tractable when the domain is restricted. For example, Probabilistic Inference is known to be NP-hard (in fact, PP-complete), but is polynomial on networks with bounded treewidth.
In my PhD research at Utrecht University, I studied the computational complexity of a number of such problems, all of which are at least NP-hard since they have Probabilistic Inference as a (degenerate) special case. I showed that these problems are complete for classes which are assumed to be larger than NP, i.e., these problems are assumed to be 'more difficult' than NP-complete problems.
In this talk, I will discuss these problems and argue why it makes sense to study their exact complexity. After introducing the necessary background in complexity theory, I will show how hardness proofs for 'higher' complexity classes can be constructed, in particular for the Parameter Tuning problem, which is NP^PP-complete, i.e. as hard as any problem solvable with a non-deterministic Turing Machine that has access to an oracle for problems in PP. This problem remains NP-complete when restricted to polygraphs, and remains PP-complete for a bounded number of parameters; this is reflected by the current best known algorithm for Parameter Tuning, which is exponential in both the treewidth of the graph and the number of parameters. Our complexity results show that we cannot expect to do much better than that, and that both restrictions are necessary to yield tractable results.
The results presented are based on joint work with Linda van der Gaag, Jan van Leeuwen, Hans Bodlaender and Gerard Tel.
In this talk we will present two recent contributions within the field of regularization.
Regularization techniques provide estimates for the linear regression coefficients, solving the problems encountered in the "few samples, many variables" setting. The main idea is to shrink the coefficients towards zero by imposing a penalty on their size. We will first review the main regularization techniques. We will then propose a method for the structure learning of Gaussian Bayesian networks.
The search in an equivalence-class search space is combined with regularization techniques, promoting sparse network learning. Finally, a new regularized logistic regression method, based on evolving the regression coefficients using estimation of distribution algorithms, is presented.
The main novelty is that it avoids having to determine the regularization term: the method chosen to simulate new coefficients at each step of the evolutionary process guarantees their shrinkage as an intrinsic regularization.
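For reference, the shrinkage idea above is usually realised as a penalised least-squares problem; the lasso estimator is the standard example (textbook material, not specific to this talk):

\hat{\beta} = \arg\min_{\beta} \|y - X\beta\|_2^2 + \lambda \|\beta\|_1,

where the regularization term \lambda controls how strongly the coefficients are shrunk towards zero; the evolutionary method above is designed precisely to avoid having to choose this term.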
When solving clinical decision-making problems with modern graphical decision-theoretic models such as influence diagrams, we obtain decision tables with optimal decision alternatives describing the best course of action for a given patient or group of patients. For real-life clinical problems, these tables are often extremely large. This is an obstacle to understanding their content. KBM2L lists are structures that minimize memory storage requirements for these tables, and, at the same time, improve their knowledge organization. The resulting improved knowledge organization can be interpreted as explanations of the decision-table content. In this paper, we explore the use of KBM2L lists in analyzing and explaining optimal treatment selection in patients with non-Hodgkin lymphoma of the stomach using an expert-designed influence diagram as an experimental vehicle. The selection of the appropriate treatment for non-Hodgkin lymphoma of the stomach is, as for many other types of cancer, difficult, mainly because of the uncertainties involved in the decision-making process. In this paper we look at an expert-designed clinical influence diagram as a representation of a body of clinical knowledge. This diagram can be analyzed and explained using KBM2L lists. It is shown that the resulting lists provide high-level explanations of optimal treatments for the disease. These explanations are useful for finding relationships between groups of variables and treatments. It is demonstrated that these lists can act as a basis for gaining a deeper understanding of the underlying clinical problem.