The S^{3} seminar is a bi-weekly meeting organised by the software science group of iCIS. The seminar is open to everyone interested. It is normally held on Tuesday at 11:30-12:30. Please contact Jurriaan Rot if you'd like (someone) to give a talk!

**Rick Smetsers**

June 13 2017, 11:30-12:30, room HG01.057.*Grammatical inference as a satisfiability modulo theories problem*.

The problem of learning a minimal consistent model from a set of labeled sequences of symbols is addressed from a satisfiability modulo theories perspective. We present two encodings for deterministic finite automata and extend one of these for Moore and Mealy machines. Our experimental results show that these encodings improve upon the state-of-the-art, and are useful in practice for learning small models.

**Marcos Bueno**

May 23 2017, 11:30-12:30, room HG00.308.*Comparing approaches for probabilistic model-based diagnosis*.

In consistency-based diagnosis (CBD), faults are identified based on deviations from a normal behavior specification. This is suitable e.g. for new artifacts such as printers, trains, etc. Probabilistic approaches have been added into CBD for aiding the selection of a diagnosis over a potentially intractable number of candidates, thus providing means to rank diagnoses. Under partial observability, however, the interpretation of diagnostic methods is non-trivial or even misleading (e.g. the conflict measure by Flesch and Lucas).

In this work, we propose a new diagnostic measure that is able to better handle the partial observability scenario. It is based on the intuition that the simplest diagnosis should be preferred, unless there is strong evidence in favor of more complex diagnoses (in terms of #abnormalities). Moreover, we suggest that rejecting a diagnosis is non-trivial, because the involved probabilities might be very small (e.g. multiple components are observed), and also because distinguishing unlikely observations from rare observations is intuitively important. As a result, additional computational costs are introduced, which shall be discussed as well.

Joint work with Arjen Hommersom (RUN, Open University) and Peter Lucas (RUN, Leiden University).

**Jurriaan Rot**

May 9 2017, 11:30-12:30, room HG00.308.*Commutative Kleene Algebra*.

The axioms of Kleene algebra are complete for language equivalence of regular expressions. In this talk I will discuss commutative Kleene algebra, where the multiplication (concatenation) operation is commutative. Combining various results from the literature, I will show that this is complete for regular expressions interpreted on languages over a singleton alphabet.

This is joint work with Damien Pous (ENS Lyon).

**Peter Achten**

April 25 2017, 11:30-12:30, room HG00.058.*Towards the Layout of Things*.

When writing a user interface (UI), the layout of its elements play an important role. Programmers should be able to specify the layout of UIs in an intuitive way, while being able to separate the concern of laying out the UI from the rest of the software implementation. Ideally, the same layout language can be used in multiple application domains, so the programmer only has to learn one set of layout concepts. In this paper we introduce such a general-purpose layout language. We obtain this language by abstracting from a layout language we have introduced in previous work for declaratively defining Scalable Vector Graphics (SVG). We show that this abstract layout language can be instantiated for multiple domains: the SVG library by which the language is inspired, ncurses-based text-based user interfaces, and iTasks. In all of these cases, a separation of concerns is maintained.

This is joint work with Jurriën Stutterheim, Bas Lijnse, Rinus Plasmeijer; presented at IFL 2016.

**Henning Basold**

March 28 2017, 11:30-12:30, room HG00.058.*Mixed Inductive-Coinductive Reasoning: Types, Programs and Logic*.

In this talk, I will give a high-level overview over the content of my thesis, which is concerned with programming of and reasoning about inductive-coinductive systems.

The classical notion of algorithm assumes that inputs are finite and that computations eventually terminate with an output. For example, sorting algorithms take (finite) lists as input and output a sorted list after a finite number of computation steps. Due to their finite structure, lists are considered as so-called inductive data. Moreover, sorting algorithms usually operate by sorting parts of the list and combine these parts, so that they are defined by iteration on the input. Associated to inductive data is the principle of induction, which allows us to prove properties of programs by induction on their input. Both iteration and induction are well-studied and ubiquitous principles in mathematics and computer science.

On the other hand, many systems and computations are assumed to run indefinitely, while still reacting to their environment. Examples of this are control systems, deterministic finite automata that recognise regular languages, and computations on real numbers. Such systems can be described by coiteration, which usually arises from operational semantics. The associated proof principle is consequently called coinduction and allows us to show that two systems exhibit the same behaviour. Also here, coinduction is well-known in form of the bisimulation proof principle in, at least, computer science.

Finally, in many cases we need to combine both types of computation, which leads to inductive-coinductive structures and reasoning principles. For instance, so-called liveness properties of reactive systems, that is, properties that something needs to happen eventually, is are mixed inductive-coinductive predicates. Here, we then need to combine both iteration and coiteration, and induction and coinduction for programming and reasoning.

All of this raises the need for programming languages that allow us to manipulate finite and infinite data simultaneously, and for principles to reason about such programs. The programs will be given in a typed language, thereby allowing us to make basic guarantees about correctness and termination. On the other hand, the reasoning principles will enable us to prove properties of the programs' behaviour. Preferably, these reasoning principles lead to proofs that can be checked automatically by a computer, so to avoid mistakes in proofs.

**Paul Fiterau-Brostean**

March 14 2017, 11:30-12:30, room HG00.308.*Model Learning and Model Checking of SSH Implementations*. Slides,model

We apply model learning on three SSH implementations to infer state machine models, and then use model checking to verify that these models satisfy basic security properties and conform to the RFCs. Our analysis showed that all tested SSH server models satisfy the stated security properties. However, our analysis uncovered minor violations of the standard.

**Hans Zantema**

February 14th 2017, 11:30-12:30, room HG00.308.*Comparing infinite sequences by finite state transducers*. Slides

We want to compare and relate (infinite) sequences. A natural way to do so is by finite state transducers, that is, finite deterministic automata with output. If feeding a sequence A to a finite state transducer yields a sequence B as output, then we write A >= B. Roughly speaking, this means that A can be transformed to B by a program with finite memory. In this pre-order the smallest sequences are the ultimately periodic sequences. The structure of sequences with respect to this pre-order raises many questions, most of which are open. A sequence A is called atomic if for every sequence B with A >= B we either have B >= A or B is ultimately periodic. The sequence 1101001000100... in which the sizes of consecutive groups of zeros are consecutively 0,1,2,3,4,.. is known to be atomic. Adding or removing any finite part remains equivalent with respect to this pre-order. If we want a more fine-grained pre-order, we may switch to permutation transducers, that is for every state and symbol there is not only exactly one corresponding outgoing arrow, but also exactly one corresponding incoming arrow. Recent work with Wieb Bosma shows some remarkable results on the corresponding pre-order, also on two-sided infinite sequences.

**Markus Klinik**

January 24th 2017, 11:30-12:30, room HG00.058.*Predicting Resource Consumption of Higher-Order Workflows*

We present a type and effect system for static analysis of programs written in a simplified version of iTasks. iTasks is a workflow specification language embedded in Clean, a general-purpose functional programming language. Given costs for basic tasks, our analysis calculates an upper bound of the total cost of a workflow. The analysis has to deal with the domain-specific features of iTasks, in particular parallel and sequential composition of tasks, as well as the general-purpose features of Clean, in particular let-polymorphism, higher-order functions, recursion and lazy evaluation. Costs are vectors of natural numbers where every element represents some resource, either consumable or reusable.

**Joshua Moerman**

January 10th 2017, 11:30-12:30, room HG00.308.*Learning Nominal Automata*. Slides

I will present an Angluin-style algorithm to learn nominal automata. These automata are acceptors of languages over infinite (structured) alphabets. Learning such automata is interesting from a theoretical perspective, but also from a more applied perspective. In fact, this research was motivated by the problem of learning register automata which are used in verification and testing of reactive systems. In my talk I will focus on nominal automata themselves and what it means to learn (nominal) automata.

This is joint work with Matteo Sammartino, Alexandra Silva, Bartek Klin and Michał Szynwelski.

We consider the task of constructing 2-3 trees. Given a sequence of elements we seek to build a 2-3 tree - in linear time - that contains the elements in symmetric order. We discuss three approaches: top-down, bottom-up, and incremental. The incremental approach is more flexible than the other two in that it allows us to interleave the construction work with other operations, for example, queries.

**Marcos Bueno**

June 11th, 12:30-13:15, room HG00.065.*Extending hidden Markov models with asymmetric conditional independences*

In discrete-time temporal processes, hidden Markov models (HMMs) are widely used to capture multiple distributions over a feature space via hidden variables. As HMMs are within the family of Bayesian networks models, their graphical structure represents only conditional independence statements that must be valid for every instantiation of the variables involved. On the other hand, representing asymmetric independences explicitly can reveal additional characteristics of the nature of the underlying process, while producing more compact models. In this talk, we discuss a model developed to represent asymmetric independences in HMMs as a function of the hidden state. Experiments on a medical problem of patients under psychiatric treatment are shown, and some advantages of the method are highlighted and compared with other methods, namely, a simple and first-second order tree-based HMMs.

**Joshua Moerman**

June 4th, 12:30-13:15, room HG00.065.*The zoo of FSM-based test methods*

Algorithms for active automata learning require an equivalence oracle. Such an oracle answers affirmative if a hypothesis is correct, and answers with an counter example if not. In practice we face the problem that no such oracles exist, so we have to use (model based) testing to search for a counter example. For a case study from Océ (>3000 state Mealy machine!) we need a pretty good test method which generates a small but complete (in some sense) test suite. I will give an elegant overview of the classical methods and describe a new algorithm we used for the Océ case study. This talk is (hopefully) interesting for people who like testing, automata theory or algorithms.

**Rick Smetsers**: Minimal separating sequences for all pairs of states.

April 23rd, 12:30-13:15, HG00.308.

We describe an algorithm for finding minimal separating sequences for all pairs of inequivalent states of a Mealy machine. Our algorithm identifies the distinct equivalence classes of the machine by iteratively refining a partition of its states. In each iteration of the algorithm, a block of states in the partition is split based on the observation that for a certain input certain states in the block end up in a different block than others. We use this information to construct a sequence of minimal length for which states in one block produce different output that states in the other. Such minimal separating sequences are useful in testing an implementation of a machine with respect to its specification.

**Pieter Koopman***Church Encoding of Data Types Considered Harmful for Implementations.*

April 16th, 12:30-13:15, Room HG00.065.

From the lambda-calculus it is known how to represent (recursive) data structures by ordinary lamvda-terms. Based on this idea one can represent algebraic data types in a functional programming language by higher-order functions. In this talk we compare the famous Church encoding of data types with the less familiar Scott and Parigot encodings.

We show that one can use the encoding of data types by functions in a Hindley-Milner typed language by adding a single constructor for each data type. By collecting the basic operations of a data type in a type constructor class and providing instances for the various encodings, these encodings can coexist in a single program.

We show that in the Church encoding selectors of constructors yielding the recursive type, like the tail of a list, have an undesirable strictness in the spine of the data structure. The evaluation of the recursive spine by the Church encoding makes the complexity of these destructors linear time. The same destructors in the Scott and the Parigot encoding requires only constant time and do not hamper lazy evaluation in any way. Moreover, the Church encoding has problems with sharing reduction results of selectors.

The Parigot encoding is a combination of the Scott and Church encoding. Hence, we might expect that it combines the best of both worlds. In practice it does not offer any advantage over the Scott encoding.

**Peter Achten**

April 9th, 12:30-13:15., Room HG00.065*What is Ligretto? A case study in compositional and functional programming.*

How should one define interactive programs in a functional style? This question has been answered many times in many different ways. In this talk we demonstrate a new way of answering this question. Our approach is based on old insights, the most important of which is that a solution must be *purely compositional* which has resulted in Graphics.Scalable, a compositional library for vector graphics based on SVG. The approach also builds on new insights, the most important of which is that behaviour should be expressed as tasks in the Task Oriented Programming paradigm.

- April 2th, 12:30-13:15, Room HG01.029.
**Markus Klinik**: Refactoring a warship: How hard can it be?

The Manning and Automation Project is an effort by the Dutch navy to reduce manning on some ship classes while retaining their full functionality. Furthermore, considering political and economical developments, it is expected that future ships will be employed in increasingly complex and versatile missions. This calls for new Command and Control systems which support the reduced crew in the execution of their missions. In this setting, we are studying so-called resilient workflows. The goal is to build socio-technical systems, that is teams of humans and machines, which get their work done even if subsystems break down, crew members become unavailable, or unexpected situations disrupt standard procedures. In particular, we present ongoing work on calculating overall resource requirements in situations where systems transitively depend on each other in the presence of redundancy.

- March 26th, 12:30-13:15, Room HG00.065. CANCELED!!!
**Joost Visser**: Test Code Quality and Its Relation to Issue Handling Performance.

We created a rating model for test code quality that employs metrics to quantify completeness, effectiveness, and maintainability of test code. The model was calibrated against a benchmark of open source systems. The model was validated by empirically assessing the relation between test code quality, as measured by the model, and issue handling performance, as determined by mining the issue tracking systems associated to 18 open source systems.

Joint work with Dimitrios Athanasiou, Ariadi Nugroho, and Andy Zaidman. http://dx.doi.org/10.1109/TSE.2014.2342227

- March 12th, 12:30-13:15, Room: HG00.065.
**Steffen Michels**: Approximate Probabilistic Inference with Guaranteed Error for Hybrid Problems with Rich Logical Structure

Most recent AI models that handle uncertainty are based on probability theory. Exact probabilistic inference algorithms for such models are able to exploit their probabilistic and logical structure. To handle continuous distributions one usually uses an approximate algorithm. Approximate algorithms have less capability of exploiting the logical structure of a problem and work poorly for rare events. However, their main drawback is that no guarantees about the quality of their results can be given. In this paper we present a new approximate inference method, yielding approximations of discrete and continuous distributions with a guaranteed maximal error. A comparison is made with sampling methods, showing experimentally that our method is at least competitive and sometimes superior for hybrid models that possess logical structure and express rare events.

- March 19th, 12:30-13:15, Room HG00.065.
**David Jansen**: A space-efficient probabilistic simulation algorithm

In the context of probabilistic automata, time-efficient algorithms for probabilistic simulations have been proposed lately. The space complexity thereof is quadratic in the size of the transition relation, thus space requirements often become the practical bottleneck. In this paper, we propose a spae-efficient algorithm for computing probabilistic simulations based on partition refinement. Experimental evidence is given showing that not only the space efficiency is improved drastically: The experiments often require orders of magnitude less time. In practice, they are even faster tha the time-optimal algorithm by Crafa/Ranzato.

- February 19th, 12:30-13:15, Room: HG00.065
**Frits Vaandrager**: Learning Nondeterministic Register Automata Using Mappers

We present a new algorithm for active learning of register automata. Our algorithm uses counterexample-guided abstraction refinement to automatically construct a component which maps (in a history dependent manner) the large set of actions of an implementation into a small set of actions that can be handled by a Mealy machine learner. The class of register automata that is handled by our algorithm extends previous definitions since it allows for the generation of fresh output values. This feature is crucial in many real-world systems (e.g. servers that generate identifiers, passwords or sequence numbers). We have implemented our new algorithm in a tool called Tomte.

- February 13th, 12:30-13:15, Room: HG00.065
**Erik Proper**: Hello again! Modelling in Luxembourg

This presentation will discuss the modelling related research taking place in Luxembourg. In 2010 I moved from the Netherlands to Luxembourg to set up a
research team at the Tudor research centre. Essentially the Luxembourgish version of TNO/VTT/Fraunhofer. During the presentation I will briefly review the developments of the past years in the Luxembourgish context, as well as where we stand at the moment within the Luxembourgish landscape. In doing so, I will also position the existing collaborations with the Netherlands, including e.g. Radboud University and the HAN. I will also specifically highlight key research topics in which further collaboration between "Luxembourg" and MBSD could be initiated, as well as potential topics that might be addressed in future seminars.

About the speaker: Erik Proper is Adjunct Professor within the MBSD Department of iCIS. He chairs the Enterprise Engineering Team combining about 35 researchers from Luxembourg and the Netherlands. He is also Deputy Director of the IT for Innovative Services Department of the newly formed Luxembourg Institute for Science and Technology (LIST). The LIST was formed out of a merger of two public research centres: CRP Henri Tudor and CRP Gabriel Lippmann. The ITIS department currently involves about 110 researchers and engineers.

- Januari 22th, 12:30-13:15, room: HG00.065
**Maarten van der Heijden**: Causal Independence Models for Continuous Time Bayesian Networks

The theory of causal independence is frequently used to fa- cilitate the assessment of the probabilistic parameters of probability dis- tributions of Bayesian networks. Continuous time Bayesian networks are a related type of graphical probabilistic models that describe stochastic processes that evolve over continuous time. Similar to Bayesian networks, the number of parameters of continuous time Bayesian networks grows exponentially in the number of parents of nodes. In this talk, we look at causal independence in continuous time Bayesian networks. This new theory can be used to significantly reduce the number of parameters that need to be estimated for such models as well. We show that the noisy-OR model has a natural interpretation in the parameters of these models, and can be exploited during inference. Furthermore, we gener- alise this model to include synergistic and anti-synergistic effects, leading to noisy-OR synergy models.

- Januari 15th, 12:30-13:15, room: HG00.065
**Arjen Hommersom**: Identifying States in Probabilistic Automata

Probabilistic automata are used to model processes as sequences of events or actions. In contrast, temporal Bayesian networks model sequences of states. In the NWO CAREFUL project, we aim to reconcile these two approaches in order to unravel care given to patients with particular illnesses. In this talk, ongoing work on the identification of states in probabilistic automata is discussed. An algorithm is proposed that makes use of the regular structure within these models. Furthermore, we show preliminary results where we apply this algorithm to data of patients who were treated for psychotic depression.

- December 18th, 12:30-13:15, room: HG00.065
**Manxia Liu**: Hybrid Time Bayesian Networks

Dynamic systems are relevant to many fields, such as financial analysis, speech recognition, and medical diagnosis. Dynamic Bayesian networks have been developed to model dynamic systems by discretizing time with a fixed time interval. The finest time granularity leads to inflexibility when temporal processes in a dynamic system evolve at different rates. An alternative for modelling dynamic systems are continuous time Bayesian networks, where every random variable evolves at its own rate. Continuous time Bayesian networks are suitable for dealing with systems with multiple time granularities. However, the representation in continuous time Bayesian networks is less suitable for incorporating and validating probabilistic expert knowledge. In this paper, we present a novel formalism, hybrid time Bayesian networks, which combines discrete-time and continuous-time Bayesian networks and is suitable for dynamic systems with multiple time granularities. Also, we apply hybrid time Bayesian networks to predict myocardial infarction.

- December 11th, 12:30-13:15, room: HG00.065
**Jurrien Stutterheim**: Task Oriented Programming with Purely Compositional Interactive Scalable Vector Graphics

iTasks enables the rapid creation of multi-user web-applications. It enables rapid development by automatically generating form-based graphical user interfaces (GUIs) for any first-order type. In some situations, however, form-based GUIs are not sufficient and more complex user interfaces are needed instead. In this talk, we introduce a purely compositional library for creating interactive user interface components, based on Scalable Vector Graphics (SVG). Not only are all images purely compositional, interaction on them is specified by pure functions. The graphics library is integrated with iTasks in such a way that one can easily switch between the generic form-like GUIs and graphics-based user interfaces.

- December 4th, 12:30-13:15, room: Mercator 1, 00.02
**Paul Fiterau-Brostean**: Learning fragments of the TCP communication protocol

In recent work, we applied active learning techniques to learn fragments of the TCP communication protocol by observing its external behavior. We show that the different implementations of TCP tested (Windows 8 and Linux) induce different models, thus allowing for fingerprinting of these implementations. Moreover, the models show that the specification (rfc 793) is not met fully met. We will also show how results from this case study reflect on current work on extending learning techniques so that learning can be done automatically.

- September 5th, 15.30-16.30, room: HG00.065
**Hossein Amirkhani**: Improving Bayesian Network Structure Learning using Heterogeneous Experts.

We consider the problem of learning Bayesian network structures by exploiting both data as well as experts’ opinions about the graph. In practice, experts will have different individual probabilities of correctly labelling the inclusion or exclusion of edges in structure of the network. Therefore, we propose in this paper to estimate the accuracy of experts and then exploit this accuracy during the learning of the structure. We use an expectation maximization (EM) algorithm to estimate the accuracies, considering the true structure as the hidden variable. As a second contribution, we develop a Bayesian score that considers the training data as well as the experts’ opinions to score different possible structures, and then a greedy search is done in the space of possible structures.

- July 4th, 15.30-16.30, room: HG01.058
**Rick Smetsers & Michele Volpato**: Bigger is Not Always Better: on the Quality of Hypotheses in Active Automata Learning.

In Angluin’s L* Algorithm a learner constructs a sequence of hypotheses in order to learn a regular language. Each hypothesis is consistent with a larger set of observations and is described by a bigger model. From a behavioral perspective, however, a hypothesis is not always better than the previous one, because the length of a shortest counterexample that distinguishes a hypothesis from the target language may decrease. We present a simple modification of the L* algorithm that ensures that for subsequent hypotheses the length of a shortest counterexample never decreases, which implies that the distance to the target language never increases in a corresponding ultrametric. Preliminary exponential evidence suggests that our algorithm speeds up learning in practical applications by reducing the number of equivalence queries.

- June 27th, 15.30-16.30, room: HG01.058
**Marcos Bueno**: A Temporal Model for Identification of Bayesian Network-based States.

In clinical guidelines a sequence of actions can be devised to achieve a certain goal, for example, to determine how to treat a patient over a period of time. In practice, not all patients are handled according to guidelines, suggesting that uncertainty may exist in some degree. There exist datasets that track actual treatments, which can be used as a source for further construction of data-based models and then compared with protocol-based models. We propose a model constructed from actual care data that acts as a probabilistic view of protocols, in which patients belonging to some state of the model have a set of possible future states. In order to allow white-box insights about patients’ temporal evolution, we model states as arbitrary Bayesian networks over a set of given random variables. Thus, probabilistic relationships among variables of interest can be shown through time. In this presentation this problem will be described followed by a brief description of an initial proposal for the learning of the model from data.

- May 16th, 15.30-16.30, room: HG00.108
**Jan Tretmans**: Quality and Testing of Embedded Systems; or: What do Dykes, De Ruyter, and Wafer Scanners have in common.

High-tech systems increasingly depend on software: software controls, connects, and monitors almost every aspect of modern system operation. Therefore, overall system quality and reliability are more and more determined by the quality and reliability of the embedded software. This presentation will discuss some trends, issues, and challenges related to system quality and testing.

- May 2nd, 15.30-16.30, room: HG02.028
**Hossein Amirkhani**: Expectation maximization based ordering aggregation for improving the K2 structure learning algorithm

Some of the basic algorithms for learning the structure of Bayesian networks, such as the well-known K2 algorithm, require a prior ordering over the nodes as part of the input. It is well known that the accuracy of the K2 algorithm is highly sensitive to the initial ordering. In this presentation, we introduce the aggregation of ordering information provided by multiple experts to obtain a more robust node ordering. In order to reduce the effect of novice participants, the accuracy of each person is used in the aggregation process. The accuracies of participants, not known in advance, are estimated by the expectation maximization algorithm. Any possible contradictions occurred in the resulting aggregation are resolved by modelling the result as a directed graph and avoiding the cycles in this graph. Finally, the topological order of this graph is used as the initial ordering in the K2 algorithm. The experimental results demonstrate the effectiveness of the proposed method in improving the structure learning process.

- April 25th, 15.30-16.30, room: HG02.702
**Liang Zou (Chinese Academy of Sciences)**: Modeling and Verifying Stochastic Hybrid Systems in Simulink/Stateflow

Simulink/Stateflow is commonly used in industry for designing, simulating, and analyzing control systems. It has good support from Matlab for computation, however, it is lacking support for stochastic modeling. In this talk, we will first discuss about how to bring stochastic choice and stochastic timing in Stateflow, and then discuss how statistical model checking techniques can be used for analyzing stochastic Simulink/Stateflow diagrams.

- March 28th, 15.30-16.30, room: HG 01.028
**Peter Achten**: A Comparison of Task Oriented Programming with GUIs in Functional Languages.

In this talk we compare the expressiveness of the Task Oriented Programming iTask approach of specifying interactive GUI applications with ObjectIO and Racket bigbang. ObjectIO is representative for the large class of traditional desktop widget based toolkits aiming to provide the programmer with full access the underlying GUI toolkit in a functional style. In contrast, the Racket bigbang approach offers the much more restricted setting of a single window and canvas to which the programmer adds callback and image rendering functions in a pure functional style. We demonstrate that both the Racket bigbang and iTask approaches result in significantly smaller GUI specifications by means of a small case study of the game of tic-tac-toe.

- January 17th, 15.30-16.30, room: HG 02.028
**Pieter Koopman**: Model-Based Shrinking for State-Based Testing.

Issues found in model-based testing of state-based systems are traces produced by the system under test that are not allowed by the model used as specification. It is usually easier to determine the error behind the reported issue when there is a short trace revealing the issue. Our model-based test system treats the system under test as a black box. Hence the test system cannot use internal information from the system under test to find short traces. This paper shows how the model can be used to systematically search for shorter traces producing an issue based on some trace revealing an issue.

- January 10th, 15:30-16.30, room: HG 02.028
**Juan Guadarrama (University of Mexico State)**: Theory and Foundations for a Knowledge Base Debugger and other Applications.

Knowledge representation is an important topic in common-sense reasoning and Artificial Intelligence, and one of the earliest techniques to represent it is by means of knowledge bases encoded into logic clauses. Encoding knowledge, however, is prone to typos and other kinds of consistency mistakes, which may yield incorrect results or even contradictions with conflicting information from other parts of the same code. In order to overcome such situations, I propose a logic framework to debug knowledge bases that might be systems specifications. The approach has a strong theoretical framework on knowledge representation and reasoning, and a suggested on-line prototype where one can test logic programs. Such knowledge bases may have conflicting information and the framework can be used to locate the possible source of conflict. Besides, the system can be employed to identify conflicts of the knowledge base itself and upcoming new information, it can also be used to locate the source of conflict from a given inherently inconsistent static knowledge base. This talk is about an implementation of a declarative version of a future system that has been characterised to debug knowledge bases in a symbolic formalism, which can be generalised to other fields of research like abductive reasoning, probabilistic reasoning, diagnosis and software development, amongst others.

Keywords-Answer Set Programming; Dynamic Knowledge Representation; theory transformation; non-monotonic reasoning; belief revision; reasoning; (weak) constraints; high-order theories.

- December 13th, 15.30-16.30, room: HG01.058
**Tim Willemse (TU Eindhoven)**: From Preservation to Approximation of Solutions to Parameterised Boolean Equation Systems.

Parameterised Boolean Equation Systems (PBESs) are sequences of least and greatest fixpoint equations ranging over first-order logic formulae. Such equation systems are used to encode various verification problems, including model checking problems and equivalence checking problems. In this presentation, we focus on some recent developments in the theory for PBESs which have proven to be instrumental in developing techniques for simplifying PBESs prior to solving them. Among others, we describe a theory of abstraction for PBESs and compare this theory to existing abstraction theories for transition systems.

- October 11th, 15.00-16.00, room: HG01.060
**Martijn van Otterlo (AI/Donders)**: Modeling Relations between Reinforcement Learning, Visual Input, Perception and Action.

My talk is about a possibilities for relational integration of both relational action and vision. The real potential of relational representations is that states can \emph{share} information with actions (e.g. parameters, or more specifically \emph{objects}). Possibilities exist to define novel languages for \emph{interactive experimentation} with relational abstraction levels in the context of both complex visual input and complex behavioral output. This includes new types of interactions -- for example dealing with scarce human feedback, new types of experimentation -- for example incorporating visual feedback and physical manipulation, and new types of abstraction levels -- such as probabilistic programming languages. I will present first steps towards a more tight integration of relational vision and relational action for interactive learning settings. In addition I present several new problem domains. .

- October 4th, 15.00-16.00, room: HG00.068
**Jesse Davis (KU Leuven, Belgium)**: Designing Machine Learning Algorithms for Clinical Data.

With the widespread adoption of electronic health/medical records (EHRs/EMRs) and continuing improvements in data collection, electronic health data is growing at a staggering rate. An emerging area of research tries to use this data, in combination with domain knowledge, to build models (e.g., predictive) to help with tasks such as clinical decision support. My research revolves around designing artificial intelligence and machine learning tools that can help clinicians and medical researchers analyze, interpret, and exploit the burgeoning collection of data and knowledge. However, analyzing such data poses significant technical challenges for learning and reasoning due to their relational schemas, longitudinal nature, the presence of latent structure, and the fact that different patients may have dramatically different numbers of entries in any given table, such as diagnoses or vitals. Furthermore, it is important to model the uncertain, non-deterministic relationships between patients' clinical histories and current and future predictions about their health status.

In this talk, I will present some of my research on statistical relational learning, which is able to effectively address these challenges. In the first part of the talk, I will focus on using structurd mammography reports to predict whether an abnormality is malignant. I will describe an algorithm for automatically constructing features from the data that leads to improved performance on a real-world dataset compared to both other machine learning algorithms and radiologists. In the second part of the talk, I will focus on predicting whether a patient will have an adverse reaction (ADR) to a medication. In particular, I will focus on the fact that latent information may be necessary to build accurate models. For example, a patient's clinical history records information about specific prescribed medications or specific disease diagnoses. It does not explicitly mention important connections between different medications or diagnoses, such as which other medications could have been prescribed to treat an illness, that may be relevant. I will describe an algorithm that, while learning a model, automatically discovers clusters of diseases or medicines that are informative for the specific prediction task. This leads to improved performance on three real-world ADR tasks.

- September, 20th, 15.00-16.00, room: HG01.060
**Olga Grinchtein (Ericsson, Sweden)**: Model-based testing for LTE Radio Base Station.

The presentation describes experiences of applying model-based testing to LTE Radio Base Station at Ericsson. The "LTE", Long Term Evolution, is the global standard for the fourth generation of mobile networks. The presentation describes what kind of problems we faced during modelling, test generation and concretization. We provide a comparison between two model-based testing tools, based on experience of modelling and test generation of LTE functionality.

- September 13th, 15.00-16.00, room: HG02.702
**Giso Dal**: A highly parallel bounding algorithm for computing the exact diameter of large real-world sparse graphs.

The eccentricity of a node in a graph is defined as the length of a longest shortest path starting at that node. The eccentricity distribution over all nodes is a relevant descriptive property of the graph, and its extreme values allow the derivation of measures such as the radius, diameter, center and periphery of the graph. The diameter is defined as the length of the longest shortest path over all vertices in the graph, i.e. the smallest eccentricity, and serves as a relevant property of all types of graphs that are nowadays frequently studied. Examples include social networks, webgraphs and routing networks. During this talk we propose a highly parallel algorithm based on eccentricity lower and upper bounds for computing the exact diameter of large real-world sparse graphs.

- August, 16th, 15.00-16.00, room: HG02.028
**David Jansen**: Revisiting Weak Simulation for Substochastic Markov Chains.

The spectrum of branching-time relations for probabilistic systems has been investigated thoroughly by Baier, Hermanns, Katoen and Wolf (2003, 2005), including weak simulation for systems involving substochastic distributions. Weak simulation was proven to be sound w. r. t. the liveness fragment of the logic PCTL\X, and its completeness was conjectured. We revisit this result and show that soundness does not hold in general, but only for Markov chains without divergence. It is refuted for some systems with substochastic distributions. Moreover, we provide a counterexample to completeness. In this paper, we present a novel definition that is sound for live PCTL\X, and a variant that is both sound and complete.

This presentation is a practice talk for my QEST conference publication at the end of August. (Well, if Volkswagen doesn’t prohibit it.)

- June, 28th, 15.00-16.00, room: HG02.028
**Martijn Lappenschaar**: Qualitative chain graphs. - June, 21st, 15.00-16.00, room: HG01.028
**Sicco Verwer**: White-box optimization from historical data.

One of the main challenges when applying mathematical optimization is to construct a mathematical model describing the properties of a system. In this paper, we demonstrate how to learn such a model entirely from data using methods from machine learning. We illustrate our approach using the following optimization problem: ordering the resources in sequential auctions as to maximize the social welfare. This social welfare depends on combinatorial preferences and budget constraints of the bidders. Since these are unknown to the auctioneer, we propose to learn them from historical data. Given the learned model, we propose a novel white-box method to find an optimal ordering. Specifically, we take the outcome of a regression tree and use several constructions to map it into integer linear programs that can be used for optimization. We use an auction simulator and design several experiments to test the performance of the proposed method. Our experiments show that optimization based on historical data results in high social welfare. Furthermore, we compare our white-box approach with a black-box best first search approach and show its advantages over black-box method.

- June, 7th, 15.00-16.00, Room: HG01.060
**Alexandra Silva**: CoCaml: Programming with Coinductive Types

We present CoCaml, a functional programming language extending OCaml, which allows us to define functions on coinductive datatypes parameterized by an equation solver. We provide numerous examples that attest to the usefulness of the new programming constructs, including operations on infinite lists, infinitary lambda-terms and p-adic numbers.

(Joint work with Jean-Baptiste Jeannin and Dexter Kozen, Cornell University)

- May, 17th, 15.00-16.00, room: HG01.060
**Steffen Michels**: Inference for a New Probabilistic Constraint Logic

Probabilistic logics combine the expressive power of logic with the ability to reason with uncertainty. Several probabilistic logic languages have been proposed in the past, each of them with their own features. In this paper, we propose a new probabilistic constraint logic programming language, which combines constraint logic programming with probabilistic reasoning. The language supports modeling of discrete as well as continuous probability distributions by expressing constraints on random variables. We introduce the declarative semantics of this language, present an exact inference algorithm to derive bounds on the joint probability distributions consistent with the specified constraints, and give experimental results. The results obtained are encouraging, indicating that inference in our language is feasible for solving challenging problems.

- April, 26th, 15.00-16.00, room: HG00.310
**Peter Novak (TU Delft)**: Reconfiguration of Large-Scale Surveillance Systems

Metis project aims at study of techniques supporting development of large-scale dependable surveillance system-of-systems for maritime safety and security. Surveillance systems, such as Metis, typically comprise a number of heterogeneous information sources and information aggregators. Among the main problems of their deployment lies scalability of such systems with respect to a potentially large number of monitored entities. One of the solutions to the problem is continuous and timely adaptation and reconfiguration of the system according to the changing environment it operates in. At any given timepoint, the system should use only a minimal set of information sources and aggregators needed to facilitate cost-effective early detection of indicators of interest.

On the background of Metis prototype description, I will introduce a theoretical framework for modelling scalable information-aggregation systems. We model such systems as networks of inter-dependent reasoning agents, each representing a mechanism for justification/refutation of a conclusion derived by the agent. The algorithm facilitating continuous reconfiguration is based on standard results from abstract argumentation and corresponds to computation of a grounded extension of the argumentation framework associated with the system.

- April, 5th, 15.00-16.00, room: HG03.632
**Bas Lijnse**: Task Oriented Programming (TOP) for incident response applications.

In my dissertation work I have had the opportunity to study the search and rescue operations of the Netherlands Coast Guard. Examples from this domain have largely influenced the evolution of the Task-Oriented Programming (TOP) paradigm which was the primary focus of my research. In my current post-doc project, the SENECA/Incidone project, we will continue our collaboration. The goal of this project is to develop and evaluate a small, yet fully operational, incident coordination tool based on our previous findings. This will allow us to test our TOP framework iTasks beyond the usual lab setting, and to develop it into a more robust basis for further TOP research. The talk will sketch the origins, the goals, the requirements and preliminary design of the Incidone tool we are working on.

- EXTRA SEMINAR March, 28th, 15.00-16.00
**Prof dr. Mark Haselkorn (University of Washington, Seattle)**: Integrated Modeling of Work and Information Flow to Close the Gap between Human-Centered Design and System Development.

- March, 22nd, 15.30-16.30, room HG00.303
**Eelco Visser (TU Delft)**: The Language Designer's Workbench: State-of-the-art and research agenda.

Software systems are the engines of modern information society. Our ability to cope with the increasing complexity of software systems is limited by the programming languages we use to build them. Bridging the gap between domain concepts and the implementation of these concepts in a programming language is one of the core challenges of software engineering. Modern programming languages have considerably reduced this gap, but still require low-level programmatic encodings of domain concepts. Domain-specific software languages (DSLs) address this problem through _linguistic abstraction_ by providing notation, analysis, verification, and optimization that are specialized to an application domain. Language workbenches assist language designers with the _implementation_ of languages. However, DSL implementations rarely formally guarantee semantic correctness properties such as type soundness and behavior preservation of transformations, since current language workbenches provide no support for such verification, which leads to intricate errors in language designs that are discovered late.

- March, 8th, 15.30-16.30, room HG01.028
**Martijn Lappenschaar**: Learning quantitative and qualitative aspects of multimorbidity with probabilistic graphical models.

Multimorbidity, i.e., the presence of multiple diseases within one person, is a significant health-care problem for western societies.

Diagnosis, prognosis and treatment in the presence of multimorbidity can be complex due to the various interactions between diseases.

To better understand the co-occurrence of diseases, we used several types of probabilistic graphical modeling.

In this talk we will elaborate on three aspects of our research.

1) Multilevel Bayesian networks were used to analyze co-occurrence of diseases in case of hierarchical health data.

This is illustrated in the cardiovascular area, showing synergistic effects between diseases, even when controlled for specific health risks.

2) A novel measure, based on the d-separation criteria in Bayesian networks, was proposed to identify the critical factors between two diseases.

This is illustrated in the oncological area for better understanding co-occurrences of malignant tumors.

3) The theory of qualitative probabilistic networks was extended to chain graphs to explore the dynamics of disease interactions with no natural direction.

- Februari 8th, 15.30-16.30
**John van Groningen**: Adding Functional Dependencies to Multi-Parameter Type Classes in Clean

In this talk I will explain what functional dependencies for multi-parameter type classes are and show some examples. This has been implemented in an experimental version of the Clean compiler. How this was added to a compiler that also supports uniqueness typing is explained. Termination problems of the context reduction algorithm with functional dependencies are discussed. A terminating algorithm that permits more type level computations like addition of Peano numbers is presented.

- Januari 21st, 15.30-16.30
**Alexandre David (Aalborg University Denmark)**: UPPAAL News - What are we doing in Aalborg?

In this talk I will report on recent developments on UPPAAL and hint on current developments. In particular, I will present UPPAAL-SMC that can be used for performance analysis of hybrid systems.

- Januari 19th, 15.30-16.30
**Michele Volpato**: Test selection by model modification

Since testing is an expensive process, test selection has been proposed as a way to reduce such expense. A good selection of tests can be done using specification coverage functions. Model-based ioco theory, however, uses test suites which are not suitable for computing coverage because of interdependence of their test cases. We define a new test suite that overcomes such problems. Using such a test suite we also cast the test selection problem to a specification selection problem that aims at modifying the model to which a system under test must conform, in order to reduce the set of test cases. We also give a canonical representation for the newly defined test suite.

- Januari 11th, 15.45-16.30
**David Jansen**: Weak simulation is sound and complete. No! Yes! No! Yes! No! ...

Weak simulation for probabilistic systems was defined by Baier, Hermanns, Katoen and Wolf (2003, 2005), including systems involving substochastic distributions. A goal was to establish a simulation relation that is sound and complete w. r. t. the liveness fragment of the logic PCTL\X. Unfortunately, the defined relation is only sound on finite Markov chains (without substochastic distributions) and is not complete. Together with Lijun Zhang and Lei Song, we found a new definition that hopefully is sound and complete.

- November 30th, 15.45-16.30
**Rinus Plasmeijer**: Task Oriented Programming

TBA

- November 23th, 15.45-16.30
**Steffen Michels**: Probabilistic Model-Based Assessment of Information Quality in Uncertain Domains

In various domains, such as security and surveillance, a large amount of information from heterogeneous sources is continuously gathered to identify and prevent potential threats, but it is unknown in advance what the observed entity of interest should look like. The quality of the decisions made depends, of course, on the quality of the information they are based on. In this paper, we propose a novel method for assessing the quality of information taking into account uncertainty. Two properties - soundness and completeness - of the information are used to define the notion of information quality and their expected values are defined using a probabilistic model output. Simulation experiments with data from a maritime scenario demonstrates the usage of the proposed method and its potential for decision support in complex tasks such as surveillance.

- November 16th, 15.45-16.30
**Sicco Verwer**: Learning Automata by Graph Coloring

I show how to map the problem of supervised learning of deterministic finite state automata to a vertex coloring problem. Essentially, such a translation consists of two types of constraints: inequality and equality constraints. Inequality constraints ensure that positive and negative input examples cannot end in the same state (be both accepted or rejected by the automaton). Equality constraints ensure that strings that share their last symbol have to end in the same state if their prefixes end in the same state. Equality constraints greatly complicate the translation to vertex coloring. In my previous work, we therefore translated these constraints to satisfiability instead. We now provide the first translation that encodes these constraints in a pure vertex coloring instance, which has been an open problem for many years.

I will also give an intuitive explanation of automata learning, and discuss why the distinction between equality an inequality constraints is important.

This work was accepted at the conference on algorithmic learning theory (ALT) 2012.

- October 5th, 15.45-16.30, room HG01.058
**David N. Jansen**: TomTom

Heb je je al eens afgevraagd hoe een TomTom werkt? Of hoe een website je een reisroute voorstelt? In deze presentatie leer je hoe een computer de kortste reisroute uitrekent. Bijvoorbeeld hoe je het snelst van Arnhem naar huis kunt rijden. En dat is voor een computer moeilijker dan je zou denken! Maar natuurlijk zijn er allerlei trucjes die ervoor zorgen dat een TomTom toch binnen een paar seconden de juiste weg kan aanwijzen.

- September 28th, 11.00-12.00, room HG00.065
**Wouter Smeenk**will present his MSc thesis Applying Automata Learning to Complex Industrial Software in HG00.065. This work was carried out at Océ Research within the context of the ITALIA project.

Learning models of machines comes very naturally to humans. They construct a mental model of the behaviour of machines while trying out different options. For machines this is much harder. Algorithms have been developed to learn the behaviour of systems and describe them in state machine models. Once such a model has been learned it can be used by software engineers to improve their software. They can simulate and analyse the behaviour of the system, test newer versions of the system and get insight in legacy systems of which no documentation exists.

This master thesis describes a case study done at Oce for the ITALIA project at the Radboud University Nijmegen. The goal of this research was to nd out whether state-of-the-art techniques and tools for automata learning are powerful enough to learn models of industrial control software. Specifically whether the LearnLib tool developed at the University of Dortmund was able to learn a model of the Engine Status Manager (ESM) developed at Oce. The ESM controls the transition from one status to another in a printer. It delegates status requests to the connected hardware components and coordinates their responses. Software like the ESM can be found in many embedded systems in one form or another. Although such the ESM may seem simple, the many details and exceptions involved make it hard to learn.

A system is learned in two alternating phases: A learning phase in which a hypothesis model is created, and a testing phase in which the algorithm searches for a counterexample. This is implemented in the LearnLib. In order to find counterexamples during the test phase using less queries novel techniques had to be developed. Although a correct model of the ESM could not be learnt during this research novel techniques were developed that greatly reduce the number of test queries needed.

- June 15th, 15.30-16.30, room HG02.032
**Maarten van der Heijden**: Combining probabilistic graphical models and temporal reasoning

In clinical practice one often has to deal with uncertainty in temporal processes. Probabilistic graphical models provide a means to model the uncertainty, but the time aspects require some further attention. Using a clinical case study (COPD patient monitoring), two topics will be touched upon: making predictions based on limited data and a first look at modelling temporal indeterminacy.

- June 8th, 15.30-16.30, room HG02.032
**Arjen Hommersom**: A semi-causal Bayesian network approach to prognosis

Arjen will present this paper at the ICML workshop on Machine Learning for Clinical Data Analysis.

Various machine learning techniques have been proposed for the development of prognostic models, including those based on Bayesian networks. An advantage of a Bayesian network compared to many other classifiers is that the model can provide insight by representing the temporal structure of the domain. While it has been shown that Bayesian networks can perform well in terms of classification accuracy, additional temporal domain knowledge can harm the classification performance in realistic applications. We propose to combine a naive parametric distribution with temporal domain knowledge, resulting in semi-causal Bayesian networks. We evaluate this approach in the development of a prognostic model for epithelial ovarian cancer, and argue that the model is understandable for domain experts and comparable to the performance of traditional prognostic models and tree-augmented naive classifiers.

- June 1st, 15.30-16.30, room HG00.086
**Lijun Zhang**Model Checking Algorithms for CTMDPs

Continuous Stochastic Logic (CSL) can be interpreted over continuous-time Markov decision processes (CTMDPs) to specify quantitative properties of stochastic systems that allow some external control. Model checking CSL formulae over CTMDPs requires then the computation of optimal control strategies to prove or disprove a formula. The paper presents a conservative extension of CSL over CTMDPs and exploits established results for CTMDPs for model checking CSL.

- May 25th, 2012, 15.30-16.30, room HG00.086
**Joost Visser**: Models for Assessment of Software Systems

Joost Visser is Head of Research at the Software Improvement Group (SIG) and recently joined the department as part-time professor on Large Scale Software Systems. In this talk, Joost will give a short introduction of SIG's model-based methods for assessing and monitoring software systems. He will also provide an overview of some recent and ongoing research performed by SIG in collaboration with various research partners, and articulate some open research challenges.

- May 11th, 2012, 15.30-16.30, room HG00.633
**Stijn Hoppenbrouwers**: Developing Collaborative Dialogue Games for Model Based System Development

As part of our ongoing research effort to better understand and support processes of (collaboratively) conceptualizing, expressing and communicating (about) models and specifications in a "model driven" system development context, we will present the current state of affairs. We will introduce the main theoretical concepts now underlying our approach, and also some more operational aspects: types of analysis underlying the development of our game setups, as well as some general aspects of the resulting game systems. Also we will present some experiments with Dialogue Game prototypes, and discuss some ongoing developments in industrial application of our ideas.

- May 4th, 2012, 15.30-16.30, room HG01.057
**Ilona Wilmont**: Abstraction and executive control in conceptual modelling

Formal models are abstract representations of reality. Therefore, the ability to form, comprehend and reason with abstractions is vitally important in modelling. However, it is also largely subject to individual differences. We hypothesize that shifts in abstraction levels aid the comprehension of domain knowledge, and that abstraction is facilitated by metacognitive components of executive control.

Humans can form abstractions on different levels, 3 of which have been identified in neuroscientific studies: concrete, medium abstract and highly abstract. Executive control involves behaviours such as inhibiting distractions, flexible shifting of attention, and monitoring progress both in oneself and in others.

We are currently studying modellers in real-life modelling situations. So far, we found that the degree to which participants have a mental representation of the domain affects abstraction. Formation of abstractions and shifts often concurred with explicit monitoring of task, goal, scope and other participants’ behavior. Reasoning within scope was also observed in stakeholders without any modelling background. Without a clear representation, few attempts at abstractions are made. There is no monitoring; instead, the behavioural pattern includes deviations from focus, difficulty understanding the scope of concepts and echoing peers. Apparently, shifts in abstraction levels aid comprehension only if a certain level of understanding is in place beforehand, and that individual differences may override training.

- April 20th, 2012, 16.00-16.30, room HG01.058
**Bas Lijnse**: Incidone: A Task-Oriented Incident Coordination Tool

Coordinating rescue operations for incidents at sea can be a complex task. In this talk we present an ongoing project that aims to develop an incident coordination tool to support it. This tool, Incidone, is based on the specification outlined by Lijnse et al in “Capturing the Netherlands Coast Guard SAR Workflow with iTasks” and is therefore modeled after, but not necessarily limited to, the workflow of the Netherlands Coast Guard. The unique feature of Incidone is that it is the first tool of its kind developed using the Task-Oriented Programming paradigm. Therefore, we present the tool both from the perspective of its intended end-users as well as from the perspective of a software developer. The primary goal of the Incidone project is to provide an example of this method to developers of similar crisis management applications.

- April 13th, 2012, 15.30-16.30, room HG02.032
**Pieter Koopman**: Model Based Testing with Logical Properties versus State Machines

Model-based testing of single functions is usually based on logical properties as specification. In practice it appears to be rather hard to develop a sufficiently strong set of properties to spot all errors. The set of algebraic properties is often insufficinet. For model-based testing of state based systems one usually employs a state machine as model and a conformance relation between this specification and the software under test. For (abstract) data types we can use both ways of model-based testing. In this talk we compare the specification effort required to make an effective model and the effectiveness of the associated tests for some well-known data types. Our examples show that it can be easier to write state based specifications. Moreover, conformance based testing of data types is more effective and very efficient.

- March 30th, 2012, 15.30-16.30, Room HG00.086
**David Jansen**: Belief bisimulation for hidden Markov models: logical characterisation and decision algorithm

David will present his paper for the NASA Formal Methods Workshop with coauthors Flemming Nielson, Lijun Zhang.

This paper establishes connections between logical equivalences and bisimulation relations for hidden Markov models (HMM). Both standard and belief state bisimulations are considered.

We also present decision algorithms for the bisimilarities. For standard bisimilarity, an extension of the usual partition refinement algorithm is enough. Belief bisimilarity, being a relation on the continuous space of belief states, cannot be described directly. Instead, we show how to generate a linear equation system in time cubic in the number of states.

- March 23rd, 2012, 15.00-16.00,
**Fides Aarts**, room HG00.065

Automata Learning Through Counterexample Guided Abstraction Refinement

Abstraction is the key when learning behavioral models of realistic systems. Hence, in most practical applications where automata learning is used to construct models of software components, researchers manually define abstractions which, depending on the history, map a large set of concrete events to a small set of abstract events that can be handled by automata learning tools. In this article, we show how such abstractions can be constructed fully automatically for a restricted class of extended finite state machines in which one can test for equality of data parameters, but no operations on data are allowed. Our approach uses counterexample-guided abstraction refinement: whenever the current abstraction is too coarse and induces nondeterministic behavior, the abstraction is refined automatically. Using Tomte, a prototype tool implementing our algorithm, we have succeeded to learn - fully automatically - models of several realistic software components, including the biometric passport and the SIP protocol.

- March 16th, 2012, Huize Heyendaal, RUN, Geert Grooteplein-Noord 9

Creating the Enterprise: Engineering or Emergence? (registration required)

The Institute for Computing and Information Sciences of Radboud University Nijmegen and the Enterprise Engineering Team cordially invite you to attend Three public lectures on enterprise engineering.

We warmly welcome guest from both industry and academia.

Admission is free, but reservation is required due to limited seats: RSVP (for reservations or cancellations, please send an email to Ingrid Berenbroek: I.Berenbroek@cs.ru.nl)

There is a notable tension between, on the one hand, 'blue print' approaches to enterprise engineering and, on the other, the phenomenon that in many respects, enterprises ?just happen?. Of course things are not so black and white; let's explore! And of course: how can we possible deal with this in organizational design?

Erik Proper, Jan Dietz and James Taylor, all distinguished academic researchers with ample experience in business-IT alignment and organizational design, each approach this theme from their specific personal outlook, followed by a joint discussion.

Programme:

12:30 Arrivals and coffee

13:00 Opening, Stijn Hoppenbrouwers

13:10 Erik Proper:"Engineering and Emergence in Enterprises; a Story of Four Es"

13:55 Jan Dietz: "Enterprise Engineering and Enterprise Ontology"

14:40 Coffee

15:00 James Taylor: "Innovation and the Authoring of the Large Organization, Why the Problem?"

16:00 Discussion among the speakers, led by Stijn Hoppenbrouwers

17:00 Drinks

- March 9nd, 2012, room HG02.028
**Martijn Lappenschaar**: Multilevel Bayesian networks

While Bayesian networks have proved themselves useful for analyzing health care data, they do not have the capability to learn from data that contains some hierarchical structure, which is often the case for real-world medical data. Since this may impact the validity of inference, specialized statistical techniques, such as multilevel regression, are often used to analyze such hierarchical data. In this paper, we combine Bayesian networks with multilevel analysis by introducing multilevel Bayesian networks. We apply these networks to analyze the problem of multimorbidity, which is a significant health-care problem for western societies, especially within the elderly. In particular, we used multilevel Bayesian networks to analyze clinical data from family practices in the Netherlands with the aim to predict heart failure and diabetes mellitus. We compare the resulting model to conventional methods, which reveals at least the same classification performance, but a better insight of interactions between multiple diseases.

- March 2nd, 2012, room 00.058
**Sicco Verwer**: Automata learning using SAT solvers, and learning real-time automata

In this talk, I will present two recent approaches for passively learning state machines. The general problem they aim to solve is to find an automaton model that best describes a given data set of strings (execution traces). One of the main algorithm for solving this problem for deterministic finite state machines is known as state-merging: - start with a tree-like automaton model and - iteratively combine pairs of similar states of this model - until all states are dissimilar Both approaches are based on this simple yet effective greedy learning technique.

The first approach starts with the greedy state-merging algorithm, but stops its execution prematurely in order to solve the remaining part exactly using the power available in modern SAT solvers. Using this technique, we convincingly won the recent Stamina DFA learning competition.

The second approach adds a transition-splitting operation to the state-merging algorithm that can be used to learn the time-bounds of a deterministic real-time automaton. The transitions of such an automaton can contain guards that bound the amount of time that the automaton can remain in the current state before firing (activating) the transition. The resulting transition-splitting and state-merging algorithm is an efficient technique to learn such models from data.

- Februari 17th, 2012, room HG 02.028
**Perry Groot**: Crowdsourcing with Gaussian Processes

Recently there has been a trend of crowdsourcing in which problems traditionally solved by individuals are now solved decentralized by a group of people. Research has shown that this can lead to results having the same quality as when using a single expert. There are a number of factors, however, that can influence the quality of the labels produced by human annotators such as varying levels of expertise, random errors, and input quality/difficulty. In this talk I will discuss various methods to deal with these kind of factors in the context of Gaussian process regression with multiple labels.

- Februari 10th, 2012, room 00.304,
**15:30**

Aia award ceremony with talk by**Erik Meijer** - November 4th, 2011, room 02.028
**Michael Westergaard (TUE) & Bas Lijnse**: Connecting Declare and iTasks: Getting alternative approaches to declarative workflow to work together

This talk will consist of three parts:

- Introduction to Declare.

Declare is a workflow management system based on declarative graphical workflow models. Declare models do not specify any predefined ordering of tasks, but instead models constraints between tasks. As long as a user does not violate the constraints he may choose tasks to do in whatever way he pleases. As choices are not explicit, operational support is needed to assist users. - Declare talks to iTasks talks to Declare.

We have defined a task exchange protocol for iTasks, Declare, and possibly other workflow systems to work together. Both systems embody a different view on workflow management. Hence, connecting them highlights the similarities and differences between their notions of the task concept, control-flow and data-flow. - Demo.

We demonstrate the execution of an example workflow that is distributed over both systems.

- October 28th, 2011, room 02.028
**Wenyun Quan**: Basic properties of class hierarchies regarding probability distributions

Concepts from object orientation have been applied to many fields to facilitate solving complex real-world problems. Medicine is an example of such a complex field, where, however, also the modeling of uncertainty is of major importance. It is our belief that object orientation can also play a role in the medical field to make representing and reasoning with uncertain knowledge easier. However, there is little known about how ideas from object orientation affect the specification and use of probability distributions. In this paper it is studied in what way structured probabilistic models can be organized in class hierarchies. We will provide a theoretical foundation of probabilistic models with object orientation, which are called probabilistic class hierarchies. This is expected to offer a basis for the modeling of complex problems, such as those in medicine, from which the examples used in this paper come.

- October 21th, 2011, room 02.028
**Johan Kwisthout**: Relevant Representations

When computer scientists discuss the computational complexity of, e.g., finding the shortest path from A to B, their starting point typically is a formal description of the problem at hand, e.g., a graph with weights on every edge. Given such a formal description, either tractability or intractability of the problem is established, by proving that the problem enjoys a polynomial time algorithm, respectively is NP-hard. However, this problem description is in fact an abstraction of the actual problem of being in A and desiring to go to B: it focuses on the relevant aspects of the problem (e.g., distances between landmarks and crossings) and leaves out a lot of irrelevant details.

This abstraction step is often overlooked, but may well contribute to the overall complexity of solving the problem at hand. For example, it appears that "going from A to B" is easier to abstract: it is fairly clear that the distance between A and the next crossing is relevant, and that the color of the roof of B is typically not. However, when the problem to be solved is "make X love me", where the current state is (assumed to be) "X doesn't love me", it is hard to agree on all the relevant aspects of this problem.

In this talk, I'll propose a framework for capturing the notion of relevance when it comes to finding a suitable problem representation, with the ultimate goal of formally separating "hard to represent" and "easy to represent" problem instances. This presentation is a reworked version of the presentation I gave at the Dagstuhl seminar "Computer Science & Problem Solving: New Foundations"

*Thursday*October 20th,*14:00*, room 02.032 (ML seminar)**Doris Entner**: Identifying causal effects in linear non-Gaussian acyclic models with latent variables

In many fields of science researchers are faced with the problem of
estimating causal effects among some quantities of interest from
non-experimental data. One common approach is to assume that the data
have been generated by a directed acyclic graph (DAG), and to recover
as much as possible of this graph structure from the observed data. For
the class of linear non-Gaussian acyclic models, Shimizu et al. (2006)
showed that the full graph can be learned efficiently from data if all
relevant variables are included in the analysis. However, in the
presence of latent variables, such as unobserved confounders, inferring
the full model is computationally very expensive (Hoyer et al., 2008).
Hence, for such models it can be beneficial to have ways of efficiently
inferring partial but in the large sample limit correct information
about the causal effects. In this talk, I will present a recently
developed algorithm to learn total pairwise causal effects for all those
pairs which are part of a so called *unconfounded* set, i.e. a set which
fulfills the assumption of including all relevant variables for the
analysis. Furthermore, I will present a statistical test for detecting
inconsistent estimates of the direct causal effect in such models.

*Wednesday*October 19th, 11:00-12:15, room 00.071*(ICIS colloquium)***Carroll Morgan**: Roll-your-own notations for elementary probability

Computer-science style reasoning about programs always benefits from notations that are consistent, are easy to calculate with, and that avoid where possible a dependence on auxiliary definitions that must be given in some surrounding narrative. We are more interested in dry, systematic prose --- and less interested in the mathematical poetry of objects that "enjoy" properties, where facts are said to be "easy to see" and thoughts are linked by whences, thences and hences.

Notations for elementary probability theory seem to respond especially to a formal-methods motivated reorganisation that recognises the essential role of free- and bound variables, of expressions standing in-line for functions and of step-by-step calculation--- at least when the probabilities are applied to program semantics.

In this talk I will present an alternative notation for elementary probability theory that is principally inspired by computer-science notations already in use elsewhere; and I will give examples of it that are based on well known probability puzzles.

The talk is stand-alone, and not at all complicated; but its purpose is not to help to solve puzzles. Rather the motivation for paying the cost of using an alternative notation is provided by the previous talk (and other works), where it allows a much more succinct and reliable presentation of the essential ideas.

*Thursday*October 13th, 16:00, room 03.054**Martijn Hendriks**: Pareto Analysis with Uncertainty

Pareto analysis is a broadly applicable method to model and analyze tradeoffs in multi-objective optimization problems. The set of Pareto optimal solutions is guaranteed to contain the best solution for any arbitrary cost function or selection procedure. This work introduces a method to explicitly take uncertainty into account during Pareto analysis. A solution is not modeled by a single point in the solution space, but rather by a set of such points. This is useful in settings with much uncertainty, such as during model-based design space exploration for embedded systems. A bounding-box abstraction is introduced as a finite representation of Pareto optimal solutions under uncertainty. It is shown that the set of Pareto optimal solutions in the proposed approach still captures exactly the potentially best solutions for any cost function as well as any way of reducing the amount of uncertainty. During model-based design space exploration, for instance, design and implementation choices that are made during the development process reduce the amount of uncertainty. Steps in such a refinement trajectory can render previously Pareto optimal solutions suboptimal. The presented results provide a way to ensure that early selections in the refinement process remain valid.

*Wednesday*October 12th, 11:00-12:15, room 00.307*(ICIS colloquium)***Carroll Morgan**: Semantics for noninterference security: Rhyme and Reason

The Shadow model for qualitative noninterference security of sequential programs is a denotational model, complete with a refinement relation that preserves both functional- and security properties (the latter within its terms of reference). It was derived from a series of "gedanken" experiments in program-refinement algebra, then applied to Kripke structures as used for logics of knowledge.

The Hyperdistribution model for quantitative noninterference was later constructed with the Shadow in mind, but essentially independently. It turns out to have strong structural links to Hidden Markov Models.

The technical component of this talk will be to describe the two kinds of semantics, i.e. the Shadow- (qualitative, possibilistic) and Hyperdistribution- (quantitative, probabilistic) structures we have built for noninterference with a refinement partial order (ie an implementation order that respects secrecy). Unusually, the two models will be described at the same time, interleaved; in this way I will try to bring out their similarities and differences.

If time permits, an example will be given of the kind of program algebra that results, and how the qualitative- and quantitative algebras can work together.

**Thursday**September 22nd, 2011, room 01.058**Steffen Michels**: Composable Shared Data Sources in a Functional Language

Modern software has to deal with a huge amount of data from different sources. Data can be stored at different locations, like in memory, in files, or in databases. Multiple threads, processes and remote machines can access the same data concurrently. The evolution of software also affects the data sources it uses. Over time one might use additional data and change where and how it is stored. Often this requires a huge amount of software refactoring, because data has to be retrieved, combined and updated in a different way.

In this paper a uniform way for dealing with different kinds of shared data sources is proposed. It allows pieces of code to abstract from how data sources are implemented, the only information given is the type of data which can be read and written. To deal with changing data sources, new data sources can be derived from existing ones using functional projections and composition. Therefore, code can be reused when the way data is stored is changed. This kind of abstraction perfectly fits in a pure functional language.

- September 9th, 2011, room 02.028
**Sander Evers**and**Johan Kwisthout**: Probabilistic modelling for adaptive printing

In Océ's professional printers, print quality has to be guaranteed under all circumstances. This is currently done by choosing conservative values for printing temperature and speed. In the OCTOPUS project, we have investigated the use of Bayesian networks to model the uncertain influences in a printer more precisely, resulting in less conservative values, and therefore higher productivity. In several case studies using simulated and real data, we show how these models can be used for control and media recognition. Furthermore, we show how these models can be created from expert knowledge, existing deterministic dynamic models, and available sensor data. This is joint work with Arjen Hommersom.

- July 1st, 2011, room 02.028
**Jozef Hooman**: Compositional Model Checking using Verum's ASD:Suite at Philips Healthcare

The product ASD:Suite of the Dutch company Verum provides a modeling approach which combines compositional model checking and code generation. The main approach of the tool suite will be explained and demonstrated from a user point of view. We present experiences with this tool for the design of control software for interventional X-ray systems at Philips Healthcare. This concerns formal interface definitions between the main system components and detailed design of control components. Finally, we discuss advantages and disadvantages of the approach.

- June 24th, 2011, room 01.057,
**11:00****David Jansen**: Automata-based CSL model checking

For continuous-time Markov chains, the model-checking problem with respect to continuous-time stochastic logic (CSL) has been introduced and shown to be decidable by Aziz, Sanwal, Singhal and Brayton in 1996. The presented decision procedure, however, has exponential complexity. In this paper, we propose an effective approximation algorithm for full CSL.

The key to our method is the notion of stratified CTMCs with respect to the CSL property to be checked. We present a measure-preservation theorem allowing us to reduce the problem to a transient analysis on stratified CTMCs. The corresponding probability can then be approximated in polynomial time (using uniformization). This makes the present work the centerpiece of a broadly applicable full CSL model checker.

Recently, the decision algorithm by Aziz et al. was shown to be incorrect in general. In fact, it works only for stratified CTMCs. As an additional contribution, our measure-preservation theorem can be used to ensure the decidability for general CTMCs.

- May 27th, 2011, room 02.028
**Peter Achten**: Exchanging Sources Between Clean and Haskell -- A Double-Edged Front End for the Clean Compiler

The functional programming languages Clean and Haskell have been around for a while. Over time, both languages have developed a large body of useful libraries and come with interesting language features. It is our primary goal to benefit from each other’s evolutionary results and to facilitate the exchange of sources between Clean and Haskell. This is achieved by using the existing Clean compiler as starting point, and implementing a double-edged front end for this compiler: it supports both standard Clean 2.1 and (currently a large part of) standard Haskell 98. Moreover, it allows both languages to seamlessly use many of each other’s language features that were alien to each other before. For instance, Haskell can now use uniqueness typing anywhere, and Clean can use newtypes efficiently. This has given birth to two new dialects of Clean and Haskell, dubbed Clean* and Haskell*. Additionally, measurements of the performance of the new compiler indicate that it is on par with the flagship Haskell compiler GHC.

- May 20th, 2011, room
**HFML0220****Joeri de Ruiter**(DS dept.): Formal analysis of the EMV protocol suite

EMV is the leading international standard for electronic payments using smartcards, initially developed by EuroPay, Mastercard and Visa. The EMV standard is extremely complex. It involves hundreds of pages of documentation and allows many options and configurations -- including proprietary ones.

We constructed a formal model of the EMV protocol suite in the functional programming language F#. Security analysis is then possible using the protocol verifier ProVerif, via a translation of F# to the applied pi calculus. In the analysis all known weaknesses are revealed.

- May 13th, 2011, room 01.057
**Faranak Heidarian**: Automata Learning Through Counterexample-Guided Abstraction Refinement

State-of-the-art tools for active learning of state machines are able to learn state machines with at most in the order of 10.000 states. This is not enough for learning models of realistic software components which, due to the presence of program variables and data parameters in events, typically have much larger state spaces. Abstraction is the key when learning behavioral models of realistic systems. Hence, in most practical applications where automata learning is used to construct models of software components, researchers manually define abstractions which, depending on the history, map a large set of concrete events to a small set of abstract events that can be handled by automata learning tools. In this article, we show how such abstractions can be constructed fully automatically for a restricted class of extended finite state machines in which one can test for equality of data parameters, but no operations on data are allowed. Our approach uses counterexample-guided abstraction refinement: whenever the current abstraction is too coarse and induces nondeterministic behavior, the abstraction is refined automatically. Using a prototype implementation of our algorithm, we have succeeded to learn -fully automatically- models of several realistic software components, including the biometric passport and the SIP protocol.

- May 6th, 2011, room 02.028,
**11:30****Bas Lijnse**: Capturing the Netherlands Coast Guard's SAR Workflow with iTasks

The dynamic nature of crisis response operations and the rigidity of workflow modelling languages are two things that do not go well together. A recent alternative approach to workflow management systems that allows for more flexibility is the iTask system. It uses an embedded functional language for the specification of workflow models that integrates control-flow with data-flow in dynamic data-dependent workflow specifications. Although there is a variety of publications about the iTask workflow definition language (WDL) and its implementation, its applications have been limited to academic toy examples. To explore the iTasks WDL for crisis response applications, we have developed an iTask specification of the Search And Rescue (SAR) workflow of the Netherlands Coast Guard. In this specification we capture the mix of standard procedures and creative improvisation of which a SAR operation exists.

- April 29th, 2011, room 02.028
**Gert Veldhuijzen van Zanten**(Capgemini): Model Driven Development in Practice

We discuss our experiences in building the computing core of a high performance, mission critical SOA / EDA system. The system involved is responsible for the correct registration, computation and payment of approximately 10 million social benefits per month. We will present:

- Our method (Functional Model-Driven Development)
- The dedicated timed object model, which we developed to allow functional abstraction from peculiarities of time dependent data in a SOA/EDA system, and how FMDD favoured its application

*Speaker*:
Gert Veldhuijzen van Zanten is senior MDD architect at Capgemini. As a member of the Advanced Solutions/FMDD group he works in the areas of model-driven development, domain specific languages and transformation architectures. Gert holds an MSc and a PhD from the University of Twente (The Netherlands). Before his current position at Capgemini, he has been post-doc at the Center for User-System Interaction in Eindhoven, IT consultant at Ordina, and assistant professor in computer science at the Radboud University Nijmegen.

- April 8th, 2011, room 00.065
**Rinus Plasmeijer**: Getting a Grip on Tasks that Coordinate Tasks

Workflow management systems (WFMS) are software systems that coordinate the tasks human workers and computers have to perform to achieve a certain goal. The tasks to do and their interdependencies are described in a Workflow Description Language (WDL). Work can be organized in many, many ways and in the literature already more than hundred of useful workflow patterns for WDL's have been identified.

The *iTask* system is not a WFMS, but a combinator library for the functional language Clean to support the *construction* of WFMS applications. Workflows can be described in a compositional style, using pure functions and combinators as self-contained building blocks. Thanks to the expressive power of the underlying functional language, complex workflows can be defined on top of just a handful of core task combinators. However, it is not sufficient to define the tasks that need to be done. We also need to express the way these tasks are being supervised, managed and visualized. In this talk we report on our current research effort to extend the *iTask* system such that the coordination of work can be defined as special tasks in the system as well. We take the opportunity to redesign editors which share information and the combinators for defining GUI interfaces for tasks, such as buttons, menu's and windows. Even though the expressiveness of the resulting system increases significantly, the number of core combinators can be reduced. We argue that only a few general Swiss-Army-Knife higher order functions are needed to obtain the desired functionality. This simplifies the implementation significantly and increases the maintainability of the system. We discuss the design space and decisions that lead to these general functions for constructing tasks.

- April 1st, 2011, room 00.065
**Freek Verbeek**: Automatic Verification for deadlock in network-on-chip with adaptive routing and wormhole switching

Wormhole switching is a switching technique nowadays commonly used in networks-on-chips (NoCs). It is efficient but prone to deadlock. The design of a deadlock-free adaptive routing function constitutes an important challenge. We present a novel algorithm for the automatic verification that a routing function is deadlock-free in wormhole networks. A sufficient condition for deadlock-free routing and an associated algorithm are defined. The algorithm is proven complete for the condition. The condition, the algorithm, and the correctness theorem have been formalized and checked in the logic of the ACL2 interactive theorem proving system. The algorithm has a time complexity in O(N^3), where N denotes the number of nodes in the network. This outperforms the previous solution of Taktak et al. by one degree. Experimental results confirm the high efficiency of our algorithm. This paper presents a formally proven correct algorithm that detects deadlocks in a 2D-mesh with about 4000 nodes and 15000 channels within seconds.

- March 18th, 2011, room 00.065
**Petur Olsen**: Testing Industrial Printers at Océ & Abstraction Learning within Model Learning

**Testing Industrial Printers at Océ**

In this talk I will present my work on model-based testing of printer control software at Océ. The controller behaves as a function, taking simple inputs and producing simple outputs. However, there are many input and output parameters, and the dependencies between them can be complex. I will present our approach at modelling this system and how we cope with the complexity.

**Abstraction Learning within Model Learning**

Model learning algorithms typically focus on small and finite state systems due to complexity problems. To handle large or infinite state systems, a mapper is placed in between the Learner and the SUT. This mapper can abstract the complex actions of the SUT into manageable abstract actions for the Learner. This mapper has to rely on a priori knowledge about the SUT. We present an approach to automatically learn such abstractions without any knowledge of the internal workings the SUT.

- February 25th, 2011, room 02.028
**Maarten van der Heijden & Bas Lijnse**: Aerial: Intelligently managing COPD-exacerbations with telemedicine

Managing chronic disease through automated systems has the potential to both beneﬁt the patient and reduce health-care costs. We are developing and evaluating a monitoring system for patients with chronic obstructive pulmonary disease which aims to detect exacerbations and thus help patients manage their disease and prevent hospitalisation. We have carefully drafted a system design consisting of an intelligent device that is able to alert the patient, collect case-speciﬁc, subjective and objective, physiological data, offer a patient-speciﬁc in- terpretation of the collected data by means of probabilistic reasoning, and send data to a central server for inspection by health-care professionals. A ﬁrst pilot with actual COPD patients suggests that an intervention based on this system could be successful.

- January 21st, 2011, room 02.032
**Wannes Meert**(K.U. Leuven): Inference for causal processes and CP-logic

The semantics of CP-logic can be intuitively explained by means of the causal process semantics. The outcome of the processes described by a CP-logic theory is a set of possible worlds with to each associated a probability. These semantics are intuitive and elegant but do not describe how to efficiently calculate the probability of these worlds. In our work we have investigated how to efficiently calculate the probability that the processes described by a CP-theory lead to a particular world.

In this talk we focus on three topics with respect to inference for CP-logic. First, we show how the process semantics relate to Bayesian networks and how we can apply known inference methods for Bayesian networks to CP-logic. Bayesian networks, however, are in essence based on conditional (in)dependencies while CP-logic contains contextual and causal (in)dependencies. The second topic centers around exploiting contextual and causal independence in Bayesian networks to improve the efficiency of inference. We propose a method that combines contextual variable elimination and multiplicative factorization called contextual variable elimination with overlapping contexts. Lastly, we discuss the first-order aspects of CP-logic and their impact on inference. This topic handles about the first-order Bayes-ball method that extracts the requisite part of a CP-theory necessary to answer a query on the first-order level. We show how this method simplifies the shattering operation necessary for lifted inference.

- December 10th, 2010, room 00.065
**Arjen Hommersom**: Logical Markov Decision Processes

This presentation introduces a new modular representation of Markov decision processes (MDPs). This representation is based on the language of CP-logic augmented with non-deterministic choices. I will present the semantics of this new language and will propose a method for the formal verification of these models using probabilistic model checking. Furthermore, I will present an application of this logic to the analysis of clinical practice guidelines. As a bonus, I will tell you about your chances on surviving the flu season.

- November 26th, 2010, room 00.058
**Perry Groot**: Introduction to Gaussian Processes

I will try to give a user friendly introduction into Gaussian processes (GPs). A GP is a non-parametric method that can be used for learning smooth functions from data. GPs provide a principled, practical, probabilistic approach to learning in kernel machines. GPs have received increased attention in the machine-learning community over the past decade. Over the last three years I've obtained quite some experience with GPs and believe the method is an excellent tool for addressing many, many scientific problems. Attend and be convinced too!

- October 29th, 2010, room 02.028,
**11:00****Johan Uijen**: Generating Models of Infinite-State Communication Protocols using Regular Inference with Abstraction

In order to facilitate model-based verification and validation, effort is underway to develop techniques for generating models of communication system components from observations of their external behavior. Most previous work has employed regular inference techniques which generate modest-size finite-state models. They typically suppress parameters of messages, although these have a significant impact on control flow in many communication protocols. We present a framework, which adapts regular inference to include data parameters in messages and states for generating components with large or infinite message alphabets. A main idea is to adapt the framework of predicate abstraction, successfully used in formal verification. Since we are in a black-box setting, the abstraction must be supplied externally, using information about how the component manages data parameters. We have implemented our techniques by connecting the LearnLib tool for regular inference with the protocol simulator ns-2, and generated a model of the SIP component as implemented in ns-2.

- October 22nd, 2010, room 01.058
**Johan Kwisthout**: Two new notions of abduction in Bayesian networks

Most Probable Explanation and (Partial) MAP are well-known problems in Bayesian networks that correspond to Bayesian or probabilistic inference of the most probable explanation of observed phenomena given full or partial evidence. These problems have been studied extensively, both from a knowledge engineering starting point as well as a complexity-theoretic point of view; algorithms, both exact and approximate, have been studied exhaustively. In this paper, we introduce two new notions of abduction-like problems in Bayesian networks, motivated from cognitive science, namely the problem of finding the most simple and most informative explanation for a set of variables, given evidence. We define and motivate these problems, show that these problems are computationally intractable in general, but become tractable when some particular constraints are met.

- October 15th, 2010, room
**00**.058**Georgeta Igna**: Verification of Printer Datapaths using Timed Automata

In multiprocessor systems with many data-intensive tasks, a bus may be among the most critical resources. Typically, allocation of bandwidth to one (high-priority) task may lead to a reduction of the bandwidth of other tasks, and thereby effectively slow down these tasks. WCET analysis for these types of systems is a major research challenge. In this paper, we show how the dynamic behavior of a memory bus and a USB in a realistic printer application can be faithfully modeled using timed automata. We analyze, using Uppaal, the worst case latency of scan jobs with uncertain arrival times in a setting where the printer is concurrently processing an infinite stream of print jobs.

- October 8th, 2010, room 01.058
**Philip Hölzenspies**(Uni. Twente): Towards Platforms - How to run applications that do not yet exist

In Embedded Systems, power is a very scarce resource. Both the diversity of applications and the demand for integration into single appliances increase, despite of power problems. Power consumption can be reduced by specializing chips, whereas integration increases when generalizing them. This is why Multi-Processor System-on-Chips (MPSoCs) have become popular; flexibility and integration by offering several processors, all with their own specialization, in a single chip. We like to look at MPSoCs as platforms, but - especially when real-time constraints are involved - the assignment of tasks to processors is a design-time process. This means that two applications that require the same resource can not be run simultaneously and that when a resource becomes faulty, a set of applications can no longer run. This talk is about a way to loosen this constraint, by means of a heuristic approach to run-time resource assignment.

- September 10th, 2010, room 01.058
**David Jansen**: Fortuna: Model Checking Priced Probabilistic Timed Automata

(This presentation is an exercise talk for my QEST 2010 presentation next week.)

We introduce FORTUNA, the first tool for model checking priced probabilistic timed automata (PPTAs). FORTUNA can handle the combination of real-time, probabilistic and cost features, which is required for addressing key design trade-offs that arise in many practical applications. For example the Zeroconf, Bluetooth, IEEE802.11 and Firewire protocols, protocols for sensor networks, and scheduling problems with failures. PPTAs are an extension of probabilistic timed automata (PTAs) with cost-rates and discrete cost increments on states. FORTUNA is able to compute the maximal probability by which a state can be reached under a certain cost- bound (and time-bound). Although this problem is undecidable in general, there exists a semi-algorithm that produces a nondecreasing sequence of probabilities converging to the maximum. This paper presents a number of crucial optimizations of that algorithm. We compare the performance of FORTUNA with existing approaches for PTAs. Surprisingly, although PPTAs are more general, our techniques exhibit superior performance.

- 09/07/10

Colin Runciman, from the University of York (UK), will talk about "The Reduceron -- a graph-reduction machine"**in room 00.071 at 11:00**

Lazy functional programs are appropriately evaluated by graph reduction, where graphs represent expressions with the possibility of sharing and recursive reference. There are high-performance implementations of graph-reduction for stock PCs, but the compilers are complex as PCs are not engineered for this kind of computing.

The Reduceron is a comparatively simple graph-reduction processor implemented in reconfigurable hardware (FPGA). In essence it performs sequential reduction by template instantiation. But the design exploits parallel memory access to realise each of the machine's reduction rules in a single clock-cycle. Other uses of low-level parallelism include dynamic analyses for graph-update avoidance and for the speculative simultaneous reduction of primitive redexes during the instantiation of function bodies.

- 04/06/10

David Jansen will talk about "Discounted CTL for continuous time" in room HG01.028

Discounted CTL is a logic defined by de Alfaro, Faella, Henzinger, Majumdar, and Stoelinga with two main features: (1) Truth values cannot be just 0 or 1, but any real number in the interval [0,1]. (2) Discounting, i. e. a device to give more weight to the near future than to the far future. These features allows to weigh more exactly to which degree a model satisfies its specification. Recently, Axel Legay, Mariëlle Stoelinga and I have worked on adapting DCTL to a setting with continuous-time models. We have defined the semantics on continuous-time Markov decision processes and found model checking algorithms. The presentation will provide our (sometimes preliminary) results and invite discussion.

- 07/05/10

Prof. Zoltán Horváth will give a talk about "Software engineering education in cooperation with industrial partners"

We present our experiences on teaching software engineering in teams which are organized around different R+D projects. These long-running, innovative projects are carried out in cooperation with industrial partners, and are supported by student exchange. While MSc and PhD students work together with faculty staff members on the projects in an industrial-like environment, the students develop skills that would be otherwise very hard for them to obtain. The methodological contributions are illustrated by, and substantiated with, the description of a concrete software engineering project.

- 26/3/10

Wenyun Quan will talk about "Finite-Delay Strategies in Infinite Games"

Based on the work of Even and Meyer, Hosch and Landweber proved that the existence of the solution of Church's Problem by a strategy with finite delay is decidable. In our work we study Church's Problem in the framework of infinite games with MSO-conditions. We construct d-delay attractor strategies in reachability games over the corresponding d-delay game graphs, to determine the existence of winning strategies with a fixed delay for player Output. We illustrate that the function representing the size of the winning region of player Output in a certain game depending on the delay d is monotone, but not necessarily strictly monotone. The problem of deciding whether player Output has a winning strategy with finite delay is raised. We prove that player Output wins a reachability game with finite delay if and only if player Output has a winning strategy with delay 2^n over the corresponding game graph with n vertices. As a consequence, we obtain a new proof for the result of Hosch and Landweber.

- 12/3/10

Ilona Wilmont will talk about "Getting a Grip on Modeling Behavior"

Providing situated modeling support for domain experts, so that they may be actively involved in the modeling process, is our ideal scenario. To work towards this scenario, it is very important to understand the domain expert’s psychology. They are non-technical people who have a very different way of thinking than IT analysts, and therefore we have to appeal to their perspective when designing modeling support programs.

In this talk, I will give an overview of a study in which modeling novices and modeling experts are compared with regard to their ways of modeling. Using a very simple formalism, participants were asked to create an information domain model of a familiar domain, a library. We analyzed both the resulting models and the ways through which those models were created.

We have come up with a hypothesis of modeling behavior, which will now require further validation with a larger test group.

- 19/2/10

Freek Verbeek will talk about "Formal specification of Networks-on-Chips: deadlock and evacuation"

Networks-on-chips (NoC) are emerging as a promising interconnect solution for efficient Multi-Processors Systems-on-Chips. We propose a methodology that supports the specification of parametric NoCs. We provide sufficient constraints that ensure deadlock-free routing, functional correctness, and liveness of the design. To illustrate our method, we discharge these constraints for a parametric NoC inspired by the HERMES architecture.

- 12/2/10

Bas Lijnse will talk about "Towards iTasks for All"

Since its conception as an experimental library for expressing workflows in a functional language, the iTask system has evolved a lot. A major change we implemented last year has been a redesign of the system's core as a services based engine with an Ajax client application.In this talk I will give a short overview of these developments, and zoom in on some of the technical challenges we had to tackle, such as high-level generic interface generation and client-side interpretation. Additionally I will give a demonstration of the current version of the system.

- 5/2/10

Frits Vaandrager will present ongoing work entitled "Learning I/O Automata"

This talk explores links between two widely used models of reactive systems: I/O automata and Mealy machines. We show how any algorithm for active learning of Mealy machines can be used for learning output deterministic I/O automata in the sense of Lynch, Tuttle and Jonsson. The main idea is to place a transducer in between the I/O automata teacher and the Mealy machine learner, which translates concepts from the world of I/O automata to the world of Mealy machines, and vice versa.

- 29/1/10

Niels Radstake will defend his Master thesis about "Learning Bayesian networks using mammographic features"

The most frequent type of cancer among women worldwide is breast cancer. When breast cancer is detected in an early stage, chances of successful treatment are high. Screening programmes have shown to reduce the mortality rate of breast cancer. Studies have shown that radiologists fail to identify a significant number of cases with breast cancer due to misinterpretation. To address the problem of interpretation failure by radiologists, the B-SCREEN project investigates the use of Bayesian networks and Bayesian classifiers. This study focuses on the learning of Bayesian networks from data that is available from the Dutch breast cancer screening programme using different structure and parameter learning techniques. The possibility to use these techniques is verified using experiments. This study concludes that it is possible to use structure and parameter learn- ing techniques to learn Bayesian classifiers that perform reasonable using data from breast cancer screening programmes. The network structures give insights in the correlation of certain variables in the breast cancer domain. However, the performance of these classifiers is still less than when other classification methods are used.

- 22/1/10

Marina Velikova will talk about "Bayesian network modelling for medical screening"

Medical screening is a method for early diagnosis of disease, usually administered to individuals without current symptoms, but who may be at high risk for certain adverse health outcomes. Apart from their proven benefits, screening programs have shown potential adverse effects such as high false positive rates (overdiagnosis) or missed diagnosis. In addition, the involvement of large populations often makes screening costly, time-consuming and creates high working pressure for human experts. As potential assistants to the screening readers, computer-aided detection (CAD) systems have emerged in the last 15 years. One typical application is the CAD for breast cancer screening programs. In this talk, I will present my recent research on the development of a breast cancer CAD system based on Bayesian network models. I will outline the potential benefits and limitations of the proposed methodology and demonstrate its practical application. Finally, I’ll give a short overview of my current work concerning the development of a decision support system using probabilistic graphical models for another medical domain - pregnancy care.

- 18/12/09

Hanno Wupper will give the final presentation out of a series of three talks on education. The title is "Portfolio, maar dan goed" (in Dutch)

PORTFOLIO

Overal in onderwijsland is sinds een par jaar *portfolio* een modieus onderwerp. Bijna iedereen heeft al ervaring met iets dat zo genoemd wordt, velen hadden er last van en bijna niemand vindt het nuttig.

Maar wat is het eigenlijk? Dat wordt helaas vertroebeld door de vele systemen en databanken die zo genoemd worden.

Elke docent moet nu (1) alert zijn en (2) het heft in eigen handen houden om voorbeelden te helpen ontwikkelen voor iets dat zowel nuttig als plezierig om te doen is. Anders krijgen we door de faculteit of universiteit iets opgelegd wat veel werk en frictie veroorzaakt zonder bij te dragen aan goed onderwijs en de ontplooiing van studenten.

Laten we hierover na een korte introductie eens brainstormen.

- 11/12/09

Thomas van Noort will talk about "A Typical Synergy - Dynamic Types and Generalised Algebraic Datatypes"

We present a typical synergy between dynamic types (dynamics) and generalised algebraic datatype (GADTs). The former provides a clean approach to integrating dynamic typing in a statically typed language. It allows values to be wrapped together with their type in a uniform package, deferring type-unification until run-time using a pattern match annotated with the desired type. The latter allows to specify constructor types explicitly, as to enforce their structural validity. In contrast to ADTs, GADTs are heterogeneous structures since each constructor type is implicitly universally quantified. Unfortunately, pattern matching only enforces structural validity and does not provide instantiation information on polymorphic types. Consequently, functions that manipulate such values, such as a type-safe update function, are cumbersome due to boilerplate type representation administration. In this talk we focus on improving such functions by providing a new type annotation of GADT values via a natural synergy with dynamics. We describe the semantics of the type annotation and touch on novel other applications of this technique such as type dispatching and type equality invariants on GADT values.

- 4/12/09

Hanno Wupper will give the second presentation out of a series of three talks on education. The title is "Studenten aan het werk!" (in Dutch)

Abstract (opens new window). - 27/11/09

Sander Evers will talk about "Expressing probabilistic inference with relational algebra"

In this talk, I'll discuss the claim from my Ph.D. thesis that procedures for probabilistic inference (in particular: exact inference in a discrete Bayesian network) are better expressed with relational algebra, the widely used language in which database queries are optimized. Relational algebra combines an efficient operational semantics with the advantages of algebraic transformation. In this framework, an inference query corresponds to a canonical relational expression, and a conventional inference procedure such as variable elimination can be seen as a heuristic to translate this expression into a more efficient one.

Going beyond conventional inference procedures, relational algebra can be used to minimize the cardinality (i.e. the number of nonzero probabilities) of intermediate relations, instead of the dimensionality (the number of probabilistic variables). I demonstrate this using a sensor data example.

For the non-probabilistically inclined, I'll be sure to start the talk with an introduction on probabilistic models and inference.

- 13/11/09 (time: 14.30-15.30; location: HG00.308)

Martijn Moraal will defend his master thesis about "Optimal Deployment of Distributed Systems"

Designing distributed systems is not an easy task. Progress has been made in this regard with the development of formal specification languages and verification tools. One area that is usually not addressed is the deployment of a system. This is unfortunate as the deployment can be critical to the performance.

Finding an optimal deployment is a combinatorial optimization problem; problems that are challenging computationally, with many instances being NP-hard. Different techniques exist to solve these problems, however, what works best for a specific problem is difficult to predict. Mixed integer programming and constraint programming have been considered for the deployment problem in the past. This research extends on this work by investigating the use of constraint-based local search and hybrid methods.

The deployment problem of two distributed system architectures in particular is considered, and methods are presented to solve them. These methods offer novel and sophisticated integration of several techniques. They are not superior to the existing methods in every way, but strike different trade-offs between time that is spend searching for a deployment, and guarantees that are made on the quality of the obtained solution.

- 6/11/09

Johan Kwisthout will talk about "The Computational Complexity of Probabilistic Networks"

Probabilistic networks are widely used in decision support systems, for example in medical domains. It is known that many problems related to these networks are NP-hard and thus intractable in general. However, these problems may become tractable when the domain is restricted. For example, Probabilistic Inference is known to be NP-hard (in fact, is PP-complete) but is polynomial on networks with bounded treewidth.

In my PhD research at Utrecht University, I studied the computational complexity of a number of such problems, all of which are at least NP-hard since they have Probabilistic Inference as a (degenerated) special case. I showed that these problems are complete for classes which are assumed to be larger than NP, i.e., these problems are assumed to be 'more difficult' than NP-complete problems.

In this talk, I will discuss these problems and argue why it makes sense to study the exact complexity of these problems. After introducing the necessary background in complexity theory, I will show how hardness proofs for 'higher' complexity classes can be constructed, in particular for the Parameter Tuning problem which is NP^PP-complete, i.e. is as hard as any problem solvable with a non-deterministic Turing Machine that has access to an oracle for problems in PP. This problem remains NP-complete when restricted to polygraphs, and remains PP-complete for a bounded number of parameters; this is reflected by the current best known algorithm for Parameter Tuning which is exponential in both the treewidth of the graph and the number of parameters; our complexity results show that we can't expect do to much better than that, and that both restrictions are necessary to yield tractable results.

The results presented are based on joint work with Linda van der Gaag, Jan van Leeuwen, Hans Bodlaender and Gerard Tel.

- 16/10/09

Hanno Wupper will talk about "de electronische werkplaats", the first out of a series of three talks on education.

Abstract (opens new window).

- 29/05/09

Concha Bielza, of the Technical University of Madrid talks about "Gaussian Networks, EDAs and Regularization"

In this talk we will present two recent contributions within the field of regularization.

Regularization techniques provide estimates for the linear regression coefficients solving the problems encountered in the "few samples and many variables" setting. The main idea is to shrink the coefficients to zero by imposing a penalty on their size. We will firstly review the main regularization techniques. We will then propose a method for the structure learning of Gaussian Bayesian networks.

The search in an equivalence class search space is combined with regularization techniques, promoting a sparse network learning. Finally, a new regularized logistic regression method based on the evolution of the regression coefficients using estimation of distribution algorithms is presented.

The main novelty is that it avoids the determination of the regularization term. The chosen simulation method of new coefficients at each step of the evolutionary process guarantees their shrinkage as an intrinsic regularization.

- 29/05/09

Juan Antonio del Pozo de Salamanca of the Department of Artificial Intelligence at the Technical University of Madrid talks about "Explaining Clinical Decisions by Extracting Regularity Patterns"

When solving clinical decision-making problems with modern graphical decision-theoretic models such as influence diagrams, we obtain decision tables with optimal decision alternatives describing the best course of action for a given patient or group of patients. For real-life clinical problems, these tables are often extremely large. This is an obstacle to understanding their content. KBM2L lists are structures that minimize memory storage requirements for these tables, and, at the same time, improve their knowledge organization. The resulting improved knowledge organization can be interpreted as explanations of the decision-table content. In this paper, we explore the use of KBM2L lists in analyzing and explaining optimal treatment selection in patients with non-Hodgkin lymphoma of the stomach using an expert-designed influence diagram as an experimental vehicle. The selection of the appropriate treatment for non-Hodgkin lymphoma of the stomach is, as for many other types of cancer, difficult, mainly because of the uncertainties involved in the decision-making process. In this paper we look at an expert-designed clinical influence diagram as a representation of a body of clinical knowledge. This diagram can be analyzed and explained using KBM2L lists. It is shown that the resulting lists provide high-level explanations of optimal treatments for the disease. These explanations are useful for finding relationships between groups of variables and treatments. It is demonstrated that these lists can act as a basis for gaining a deeper understanding of the underlying clinical problem.

Earlier seminars can be found here. Also of interest: