.. |Factor| replace:: :class:`Factor <symfer.factor.Factor>`
.. |Multinom| replace:: :class:`Multinom <symfer.factor.Multinom>`
.. |SumProd| replace:: :class:`SumProd <symfer.factor.SumProd>`

.. _factor algebra:

Factor algebra
==============

In the section :ref:`models`, we have seen that a model consists of *factors*: arrays of (usually) probabilities, together with header information (variable names and domains). In this section, we apply operators to these basic factors, thus constructing an an *expression* in
*factor algebra*.

In |symfer| this is accomplished using the central data type |Factor|. To be precise,
an instance of (a subclass of) |Factor| represents a symbolic expression that will evaluate to a factor. For example, in
the last three lines of

    >>> from symfer import Multinom, SumProd
    >>> 
    >>> weather = {'Weather':['sunny','rainy']}
    >>> sprinkler = {'Sprinkler':['off','on']}
    >>> 
    >>> weather_cpd = Multinom([weather],[0.8,0.2])
    >>> sprinkler_cpd = Multinom([sprinkler, weather], [0.6,0.4,0.99,0.01] )
    >>> joint = SumProd([], [weather_cpd, sprinkler_cpd])

we see three factors being defined; |Multinom| and |SumProd| are both |Factor| subclasses. The |Multinom| instances
are "literal" atomic expressions, where the variables, domains and probabilities are explicitly specified.
The |SumProd| instance is a composite expression, with two operands ``weather_cpd`` and ``sprinkler_cpd`` which
are factor expressions themselves. The |SumProd| expression signifies that these operands should be multiplied, but
the actual multiplication is not performed yet. In effect, as a Python object, a |SumProd| instance is little more
than a container with pointers to the operands.

In parallel to the Python implementation, we also formalize the language of factor expressions in set theory. This
formalization may clarify the meaning of the classes, and is also the source of :ref:`equalities <equalities>` that justify replacing
one factor expression with another. We call this formalization *factor algebra*.

-----------------
Basic definitions
-----------------

We assume a set of variables :math:`\mathcal{V}`, a set of domains :math:`\mathcal{D}` and a function
:math:`\text{vdom} : \mathcal{V} \to \mathcal{D}` that assigns each variable a domain.
A :math:`\setv{V}`-*assignment* is a typed function that maps each variable :math:`X \in \setv{V}` to a value :math:`x_i \in \text{vdom}(X)`.
Here, :math:`\setv{V}` is a subset of :math:`\mathcal{V}`, and is called the *assignment domain*. An assignment :math:`a` is
written using the syntax :math:`a = \{X{=}x_2, Y{=}y_1\}`, which is shorthand for defining function :math:`a` with
:math:`\text{dom}(a) = \{X,Y\}` and values :math:`a(X){=}x_2, a(Y){=}y_1`.

Now, a *factor* on :math:`\setv{V}` is a typed function from :math:`\setv{V}`-assignments to numbers. For example,
factor :math:`f` could map the assignment :math:`\{X{=}x_2, Y{=}y_1\}` to the number :math:`0.42`. This is also
written using a little shorthand:

.. math::

    f(X{=}x_2, Y{=}y_1) = 0.42

i.e. the set braces are omitted. With a little abuse of terminology, the assignment domain :math:`\setv{V}` for
factor :math:`f` is also referred to as the factor's domain, and written :math:`\dom{f} = \setv{V}`. However, keep
in mind that when we write this, we formally mean that :math:`f` maps :math:`\setv{V}`-assignments to numbers.

Note on implementation
----------------------

Although we have just formally defined factors as functions in set theory, their implementation in |symfer| has
little to do with what a computer scientist or software developer would consider a function.
Factors are implemented as *arrays*, and it is often easier to think of them that way. The array appears when we
list all possible assignments and corresponding function values:

.. table::
    :column-alignment: center center
    :header-columns: 1

    +-------------------------------+------------------------------------+
    | assignment                    | array of :math:`f` function values |
    +===============================+====================================+
    | :math:`\{X{=}x_0,Y{=}y_0\}`   |   :math:`0.35`                     |
    +-------------------------------+------------------------------------+
    | :math:`\{X{=}x_1,Y{=}y_0\}`   |   :math:`0.29`                     |
    +-------------------------------+------------------------------------+
    | :math:`\{X{=}x_2,Y{=}y_0\}`   |   :math:`0.36`                     |
    +-------------------------------+------------------------------------+
    | :math:`\{X{=}x_0,Y{=}y_1\}`   |   :math:`0.51`                     |
    +-------------------------------+------------------------------------+
    | :math:`\{X{=}x_1,Y{=}y_1\}`   |   :math:`0.07`                     |
    +-------------------------------+------------------------------------+
    | :math:`\{X{=}x_2,Y{=}y_1\}`   |   :math:`0.42`                     |
    +-------------------------------+------------------------------------+
    
This is represented in |symfer| by storing:

- A |Factor| attribute ``domlist == ['X','Y']``, listing the variables in :math:`\dom{f}` in a certain order
- A |Factor| attribute ``domtypes``, a dict containing :math:`\text{vdom}(X)` and :math:`\text{vdom}(Y)`, for example
  ``domtypes['X'] == ['x0','x1']``
- The array of function values, in the above order (the first variable in ``domlist`` changes fastest in the assignment,
  and its values are enumerated in the order given by ``domtypes['X']``).
  In |Multinom| factors, this array is listed in attribute ``par``. In other |Factor| subclasses, the array is not
  stored at all, as the |Factor| merely represents a symbolic expression and not its result. Later, 
  when the symbolic expression is evaluated (during the *numeric phase*), it is indeed constructed as a native array,
  which is faster than a Python list. The different :ref:`evaluators <evaluation options>`, whose task this is,
  actually do use the same order. 

--------------------
Operators on factors
--------------------

An operator we already came across in the introduction at the top of this page is *factor multiplication*.
The product of factor :math:`f` and factor :math:`g`
is written :math:`f \prodjoin g`, and is itself a factor with domain :math:`\dom{f} \cup \dom{g}`. Each of its values is
a product of an :math:`f`-value and a :math:`g`-value. For example, if :math:`\dom{f} = \{X,Y\}` and
:math:`\dom{g} = \{Y,Z\}`, then :math:`\dom{f \prodjoin g} = \{X,Y,Z\}`, and

.. math::

    (f \prodjoin g)(X{=}x,Y{=}y,Z{=}z) &= f(X{=}x,Y{=}y) \cdot g(Y{=}y,Z{=}z)\\
    & \qquad\text{for each $x \in \text{vdom}(X), y \in \text{vdom}(Y), z \in \text{vdom}(Z)$}

It is probably easier to get an idea of factor multiplication in terms of the arrays. Let's assume that :math:`f`
is as shown in the table above, and the array of :math:`g` is like this:

.. table::
    :column-alignment: center center center
    :header-columns: 1

    +-------------------------------+-----------------------------+-----------------------------+
    |                               | :math:`,Z{=}z_0\}`          | :math:`,Z{=}z_1\}`          |
    +===============================+=============================+=============================+
    | :math:`\{Y{=}y_0,\ldots`      | :math:`\color{blue}0.1`     |   :math:`\color{violet}0.3` |
    +-------------------------------+-----------------------------+-----------------------------+
    | :math:`\{Y{=}y_1,\ldots`      | :math:`\color{purple}0.2`   |   :math:`\color{teal}0.4`   |
    +-------------------------------+-----------------------------+-----------------------------+

Then the array of :math:`f \prodjoin g` is like this:

.. table::
    :column-alignment: center center center
    :header-columns: 1

    +----------------------------------+--------------------------------------+--------------------------------------+
    |                                  | :math:`,Z{=}z_0\}`                   | :math:`,Z{=}z_1\}`                   |
    +==================================+======================================+======================================+
    | :math:`\{X{=}x_0,Y{=}y_0,\ldots` | :math:`0.35 \cdot \color{blue}0.1`   | :math:`0.35 \cdot \color{violet}0.3` |
    +----------------------------------+--------------------------------------+--------------------------------------+
    | :math:`\{X{=}x_1,Y{=}y_0,\ldots` | :math:`0.29 \cdot \color{blue}0.1`   | :math:`0.29 \cdot \color{violet}0.3` |
    +----------------------------------+--------------------------------------+--------------------------------------+
    | :math:`\{X{=}x_2,Y{=}y_0,\ldots` | :math:`0.36 \cdot \color{blue}0.1`   | :math:`0.36 \cdot \color{violet}0.3` |
    +----------------------------------+--------------------------------------+--------------------------------------+
    | :math:`\{X{=}x_0,Y{=}y_1,\ldots` | :math:`0.51 \cdot \color{purple}0.2` | :math:`0.51 \cdot \color{teal}0.4`   |
    +----------------------------------+--------------------------------------+--------------------------------------+
    | :math:`\{X{=}x_0,Y{=}y_1,\ldots` | :math:`0.07 \cdot \color{purple}0.2` | :math:`0.07 \cdot \color{teal}0.4`   |
    +----------------------------------+--------------------------------------+--------------------------------------+
    | :math:`\{X{=}x_0,Y{=}y_1,\ldots` | :math:`0.42 \cdot \color{purple}0.2` | :math:`0.42 \cdot \color{teal}0.4`   |
    +----------------------------------+--------------------------------------+--------------------------------------+

.. note::
    
    The arrays are shown two-dimensionally here for legibility. If it were possible, we would have shown both the
    :math:`f` and :math:`g` arrays two-dimensionally, and the :math:`f \prodjoin g` array *three*-dimensionally,
    with each dimension matching one variable. As it is, the :math:`X` and :math:`Y` variables are collapsed into
    one dimension.

The second important operator is *factor summation*. This works a little different from normal summation; it
does not take two factor operands but only one (let's say :math:`h`). The other argument is a set of variables
:math:`\setv{W} \subseteq \dom{h}`. These variables are summed out of the resulting factor. For example, if
:math:`\dom{h} = \{X,Y,Z\}` and we want to sum out :math:`Y`, we write :math:`\sumproj_{\{Y\}} h`, or actually
:math:`\sumproj_{Y} h` without the set braces because we are rather lazy than fussy. The resulting factor has
:math:`\dom{\sumproj_{Y} h} = \{X,Z\}` and values

.. math::
    
   (\sumproj_{Y} h)(X{=}x,Z{=}z) &= \sum_{y \in \text{vdom}(Y)} h(X{=}x,Y{=}y,Z{=}z)\\
   & \qquad\text{for each $x \in \text{vdom}(X), z \in \text{vdom}(Z)$}

Continuing our example, with :math:`h \isdef f \prodjoin g`, the values of :math:`\sumproj_{Y} h` are:

.. table::
    :column-alignment: center center center
    :header-columns: 1

    +--------------------------+---------------------------------------------------------------------+---------------------------------------------------------------------+
    |                          | :math:`,Z{=}z_0\}`                                                  | :math:`,Z{=}z_1\}`                                                  |
    +==========================+=====================================================================+=====================================================================+
    | :math:`\{X{=}x_0,\ldots` | :math:`0.35 \cdot {\color{blue}0.1} + 0.51 \cdot \color{purple}0.2` | :math:`0.35 \cdot {\color{violet}0.3} + 0.51 \cdot \color{teal}0.4` |
    +--------------------------+---------------------------------------------------------------------+---------------------------------------------------------------------+
    | :math:`\{X{=}x_1,\ldots` | :math:`0.29 \cdot {\color{blue}0.1} + 0.07 \cdot \color{purple}0.2` | :math:`0.29 \cdot {\color{violet}0.3} + 0.07 \cdot \color{teal}0.4` |
    +--------------------------+---------------------------------------------------------------------+---------------------------------------------------------------------+
    | :math:`\{X{=}x_2,\ldots` | :math:`0.36 \cdot {\color{blue}0.1} + 0.42 \cdot \color{purple}0.2` | :math:`0.36 \cdot {\color{violet}0.3} + 0.42 \cdot \color{teal}0.4` |
    +--------------------------+---------------------------------------------------------------------+---------------------------------------------------------------------+


These are the most important operators in factor algebra; for the other, we refer to

    Sander Evers, Peter J. F. Lucas: *Marginalization without Summation: Exploiting Determinism in Factor Algebra.*
    In: Weiru Lu (Ed.), *Symbolic and Quantitative Approaches to Reasoning with Uncertainty - 11th European Conference,
    ECSQARU 2011, Proceedings*, pp. 251-262
    `[SpringerLink] <http://link.springer.com/chapter/10.1007%2F978-3-642-22152-1_22>`_
    `[preprint pdf] <http://www.sandr.dds.nl/publications/factorindexing-ecsqaru.pdf>`_

.. _`equalities`:

----------
Equalities
----------

The definitions in factor algebra lead to some equalities, which are perhaps not very surprising, but useful to be
aware of:

.. math::

    f \prodjoin \I &= f\\
    f \prodjoin g &= g \prodjoin f\\
    f \prodjoin (g \prodjoin h) &= (f \prodjoin g) \prodjoin h\\
    \sumproj_{V}\sumproj_{W} f &= \sumproj_{W}\sumproj_{V} f =\sumproj_{V,W}f\\
    \sumproj_{V} \left( f \prodjoin g \right) &= \sumproj_{V} f \prodjoin g
    \qquad\text{if $V {\in} \dom{f}, V {\notin} \dom{g}$}\\
    \sumproj_{V} \left( f \prodjoin g \right) &= f \prodjoin \sumproj_{V} g
    \qquad\text{if $V {\notin} \dom{f}, V {\in} \dom{g}$}\\ 
    \bigl( \sumproj_{V} f \bigr)[W{=}d] &= \sumproj_{V} f[W{=}d]
    \qquad\text{if $V{\neq}W, V{\notin}\dom{d}$}

--------------------------
Factor algebra to |Factor|
--------------------------

Operators in factor algebra correspond to subclasses of |Factor|. However, this correspondence is not completely
one-to-one. The differences are:

- In a |Factor|, the *order* of the variables is defined. It is stored in the field ``domlist``.
- Factor sum and product are merged into one operator |SumProd|. The reason for this is that this operator can
  be implemented efficiently; the summation is done in-place during the multiplication, so the operator does
  not require the memory to store the entire product.
- Operands of type |Factor| are always put in a list (even if there can be only one). This way an expression tree
  can be traversed in a uniform way. This list is always kept in a field ``fac``.
- Similarly, deterministic factor operands (factors with integer values instead of probabilities) are kept in a
  field ``det``.

A reason for this design is to keep a one-to-one correspondence with the :ref:`YAML representation`. To remove some
of the unpracticalities of directly working with this representation, convenience functions are introduced.
For example, instead of constructing :math:`f \prodjoin g` as ``symfer.SumProd([],[f,g])`` one can also use
``f.product(g)``. These convenience functions also behave smart in the sense that they take basic factor algebra
simplifications into account. For example, if either ``f`` or ``g`` is the identity factor ``I()``, ``product``
does not return a ``SumProd`` expression at all, but simply the other factor.

An overview of available factor algebra operators and their corresponding constructors and convenience functions
is shown in the following table.

.. table::
    :widths: 1 1 1
    :column-alignment: center left left

    +---------------------------------------+-----------------------------------+---------------------------------+
    |factor algebra                         | |Factor| constructor              | |Factor| convenience function   |
    +=======================================+===================================+=================================+
    |.. math::                              |::                                 |::                               |
    |                                       |                                   |                                 |
    |    \sumproj_{A,B} (f \prodjoin g)     |    SumProd(['A','B'],[f,g])       |    f.product(g).sumto(['A','B'])|
    |                                       |                                   |                                 |
    |                                       |                                   |or::                             |
    |                                       |                                   |                                 |
    |                                       |                                   |    I().product(f,g).sumto(...)  |
    |                                       |                                   |                                 |
    +---------------------------------------+-----------------------------------+---------------------------------+
    |.. math::                              |::                                 |                                 |
    |                                       |                                   |                                 |
    |    f(A{=}0,B{=}b_1)&=0.25\\           |   f = Multinom([{'A':[0,1]},..],  |                                 |
    |    f(A{=}1,B{=}b_1)&=0.75\\           |                [0.25,0.75,..])    |                                 |
    |    f(A{=}0,B{=}b_2)&=\ldots           |                                   |                                 |
    +---------------------------------------+-----------------------------------+---------------------------------+
    |.. math::                              |::                                 |                                 |
    |                                       |                                   |                                 |
    |    d(A{=}0,B{=}b_1)&=\text{true}\\    |   d = Fun([{'C':[False,True]}],   |                                 |
    |    d(A{=}1,B{=}b_1)&=\text{false}\\   |           [{'A':[0,1],..],        |                                 |
    |    d(A{=}0,B{=}b_2)&=\ldots           |           [1,0,..])               |                                 |
    |                                       |                                   |                                 |
    +---------------------------------------+-----------------------------------+---------------------------------+
    |.. math::                              |::                                 |::                               |
    |                                       |                                   |                                 |
    |    h[C{=}d]                           |    Index([d],[h])                 |    h.index(d)                   |
    |                                       |                                   |                                 |
    |                                       |                                   |                                 |
    +---------------------------------------+-----------------------------------+---------------------------------+
    |.. math::                              |::                                 |::                               |
    |                                       |                                   |                                 |
    |    h[C{=}\text{true}]                 |  Index([Fun([{'C':[False,True]}], |    h.index(C=True)              |
    |                                       |             [],[1])],             |                                 |
    |                                       |        [h])                       |                                 |
    |                                       |                                   |                                 |
    +---------------------------------------+-----------------------------------+---------------------------------+
    |.. math::                              |::                                 |                                 |
    |                                       |                                   |                                 |
    |    \I                                 |    I()                            |                                 |
    +---------------------------------------+-----------------------------------+---------------------------------+
    |.. math::                              |::                                 |::                               |
    |                                       |                                   |                                 |
    |    \I_{C{=}d}                         |    Embed([d])                     |    d.embed()                    |
    |                                       |                                   |                                 |
    +---------------------------------------+-----------------------------------+---------------------------------+
    |                                       |::                                 |::                               |
    |                                       |                                   |                                 |
    |                                       |    Reorder([{'newB':'B'},         |    f.reorder([{'newB':'B'},     |
    |                                       |             {'newA':'A'}],        |               {'newA':'A'}])    |
    |                                       |            [f])                   |                                 |
    +---------------------------------------+-----------------------------------+---------------------------------+