Uncertainty Datatypes is a Python library that supports uncertain primitive datatypes, including ubool, sbool, uint, ufloat, uenum and ustr. They extend their corresponding Python built-in types (bool, int, float, enum and str) with uncertainty. The Uncertainty Datatypes library implements linear error propagation theory in Python. The repository with the Java implementation of this library is available here.
Uncertain numerical values, ufloat and uint, are represented by pairs (x, u) where x is the numerical (nominal) value and u is its associated uncertainty. For example, ufloat(3.5, 0.1) represents the uncertain real number 3.5 ± 0.1, and uint(30, 1) represents the uncertain integer 30 ± 1.
This representation of uncertainty for numerical values follows the "ISO Guide to Measurement Uncertainty" (JCGM 100:2008), where values are represented by the mean and standard deviation of the assumed probability density function representing how measurements of the ground truth value are distributed. For example, if we assume that the values of a variable X follow a normal distribution, then x is its mean and u its standard deviation.
Type ubool extends type bool by using probabilities instead of the traditional logical truth values (True, False), and by replacing truth tables with probability expressions. Thus, a ubool value is expressed by a probability representing the degree of belief (i.e., the confidence) that a given statement is true. For example, ubool(0.7) means that there is a 70% chance of an event occurring. Python bool values True and False correspond to ubool(1.0) and ubool(0.0), respectively. ubool values can be used instead of bool values by projecting the probability using a certainty threshold.
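The idea of replacing truth tables with probability expressions, and of projecting a probability onto a crisp value, can be sketched as follows. This is a minimal illustration, not the library's implementation: the class UBoolSketch, its method names, and the default threshold of 0.9 are all hypothetical, and the formulas assume that the two statements are independent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UBoolSketch:
    """Illustrative stand-in for an uncertain Boolean (not the library's class)."""
    p: float  # probability that the statement is true

    def __post_init__(self):
        if not 0.0 <= self.p <= 1.0:
            raise ValueError("probability must lie in [0, 1]")

    def and_(self, other):
        # Independent events: P(A and B) = P(A) * P(B)
        return UBoolSketch(self.p * other.p)

    def or_(self, other):
        # Independent events: P(A or B) = P(A) + P(B) - P(A) * P(B)
        return UBoolSketch(self.p + other.p - self.p * other.p)

    def not_(self):
        # Negation: P(not A) = 1 - P(A)
        return UBoolSketch(1.0 - self.p)

    def to_bool(self, certainty=0.9):
        # Projection: crisp True only when the confidence reaches the
        # (hypothetical) certainty threshold.
        return self.p >= certainty


rain = UBoolSketch(0.7)
wind = UBoolSketch(0.5)
print(rain.and_(wind).p)            # probability that both statements hold
print(rain.to_bool(certainty=0.6))  # crisp projection with a 0.6 threshold
```

With probabilities 1.0 and 0.0 these expressions reduce to the usual Boolean truth tables, which is what makes bool a special case of the uncertain type.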
Type sbool provides an extension of ubool to represent binomial opinions in Subjective Logic. They allow expressing degrees of belief with epistemic uncertainty, and also trust.
A binomial opinion about a statement X is represented as sbool(b, d, u, a), where:
- b is the degree of belief that X is True.
- d is the degree of belief that X is False.
- u is the amount of uncommitted belief, also interpreted as epistemic uncertainty.
- a is the prior probability in the absence of belief or disbelief.
These values are all real numbers in the range [0,1], and satisfy that b + d + u = 1.
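In Subjective Logic, the belief, disbelief and uncertainty masses of a binomial opinion add up to 1, and the opinion can be projected onto a single probability as P = b + a·u. The sketch below checks those constraints and computes the projection. The class OpinionSketch and its method name are hypothetical and are not taken from the library.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class OpinionSketch:
    """Illustrative binomial opinion (b, d, u, a); not the library's sbool."""
    b: float  # belief that X is True
    d: float  # belief that X is False (disbelief)
    u: float  # uncommitted belief (epistemic uncertainty)
    a: float  # prior probability (base rate)

    def __post_init__(self):
        for name in ("b", "d", "u", "a"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must lie in [0, 1]")
        # Belief, disbelief and uncertainty masses must add up to 1.
        if not math.isclose(self.b + self.d + self.u, 1.0, abs_tol=1e-9):
            raise ValueError("b + d + u must equal 1")

    def projection(self):
        # Projected probability: the uncommitted mass is shared out
        # according to the prior.
        return self.b + self.a * self.u


opinion = OpinionSketch(b=0.6, d=0.1, u=0.3, a=0.5)
print(opinion.projection())  # 0.6 + 0.5 * 0.3
```

An opinion with u = 0 carries no epistemic uncertainty, so its projection is simply b, which is how a plain probability fits into this representation.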
Type ustr can be used to represent Python strings with uncertainty. That is, type ustr extends type str, adding to its values a degree of confidence in the contents of the string. This is useful, for example, when rendering strings obtained by inaccurate OCR devices, or texts translated from other languages when there are doubts about specific words or phrases. Therefore, values of type ustr are pairs (s, c), where s is the nominal string and c the associated confidence (a real number between 0 and 1). To calculate the confidence of a string s, the Levenshtein distance is normally used. For example, ustr('hell0 world!', 0.92) means that we distrust at most one of the 12 characters of the string. Values of Python type str are embedded into ustr values as ustr(s, 1.0).
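One way to arrive at a confidence such as 0.92 is to compare the observed string with a trusted reference and normalize the Levenshtein distance by the string length. The sketch below does exactly that. The normalization (1 minus distance over length) and both function names are assumptions made for illustration, so the library may derive confidences differently.

```python
def levenshtein(s, t):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        current = [i]
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def confidence(observed, reference):
    """Hypothetical confidence: fraction of characters that needed no edit."""
    longest = max(len(observed), len(reference))
    if longest == 0:
        return 1.0  # two empty strings match perfectly
    return max(0.0, 1.0 - levenshtein(observed, reference) / longest)


c = confidence("hell0 world!", "hello world!")
print(round(c, 2))  # one substitution in 12 characters
```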
Finally, type uenum is the embedding supertype for Python type enum that adds uncertainty to each of its values. A value of an uncertain enumeration type is not a single literal, but a set of pairs, each associating an enumeration literal with the degree of confidence that it is the actual value.
All related operations and mathematical functions on these datatypes are supported. Check the Uncertainty Datatypes User Guide for details.
The Uncertainty Datatypes library provides a simple implementation of uncertainty for Python built-in datatypes, and implements linear error propagation theory in Python. Uncertainty calculations are performed analytically.
The goal of the library is to support the basic mechanisms for the expression and propagation of uncertainty, in a lightweight and efficient manner.
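Linear (first-order) error propagation combines the uncertainties of independent inputs through the partial derivatives of the operation: for f(x, y), u_f² = (∂f/∂x)²·u_x² + (∂f/∂y)²·u_y². The sketch below applies this textbook formula to addition and multiplication using plain (value, uncertainty) tuples. The function names are hypothetical, and the library's own operators may differ in detail.

```python
import math


def add(x, ux, y, uy):
    # f = x + y: both partial derivatives are 1, so the uncertainties
    # combine in quadrature.
    return x + y, math.hypot(ux, uy)


def mul(x, ux, y, uy):
    # f = x * y: df/dx = y and df/dy = x.
    return x * y, math.hypot(y * ux, x * uy)


total, u_total = add(3.5, 0.3, 30.0, 0.4)
area, u_area = mul(2.0, 0.1, 3.0, 0.2)
print(total, u_total)
print(area, u_area)
```

Because each result is obtained from a closed-form expression, no sampling is required, which is what keeps this kind of calculation analytical and cheap.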
A distinguishing feature of the Uncertainty Datatypes library is that comparison operators return ubool values. This is essential when comparing uncertain numerical values, since their comparison is also subject to uncertainty and this fact must be taken into account. Unfortunately, this feature is not supported by the rest of the related uncertainty libraries, such as the uncertainties package, soerp or mcerp.
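To see why a comparison between uncertain numbers is naturally a probability, consider the sketch below. It assumes, purely for illustration, that both values are independent and normally distributed, in which case P(X < Y) = Φ((y − x) / √(u_x² + u_y²)). The function name and the normality assumption are not taken from the library, which may compute its ubool results differently.

```python
import math


def prob_less_than(x, ux, y, uy):
    """P(X < Y) for independent X ~ N(x, ux) and Y ~ N(y, uy) (assumed model)."""
    spread = math.hypot(ux, uy)  # standard deviation of Y - X
    if spread == 0.0:
        # No uncertainty at all: fall back to the crisp comparison.
        return 1.0 if x < y else 0.0
    z = (y - x) / spread
    # Standard normal cumulative distribution, written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


print(prob_less_than(3.4, 0.1, 3.5, 0.1))  # values overlap: far from certain
print(prob_less_than(3.0, 0.1, 9.0, 0.1))  # clearly separated: close to 1
```

A crisp comparison would report True in both cases, hiding the fact that the first answer is much less reliable than the second.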
Another distinctive feature of the Uncertainty Datatypes library is that it incorporates Subjective Logic (type sbool) into the type system, as a natural extension of probabilistic logic (type ubool). This enables the seamless combination of different types of uncertainty under the same library, and in particular the representation of both second-order uncertainty and trust. The type embedding mechanisms used in Uncertainty Datatypes keep operations closed in the algebra of types, and the extended operations work as expected when values of the original types are given as input parameters.
Correlations between expressions are not automatically taken into account in Uncertainty Datatypes. This avoids keeping track at all times of all correlations between quantities (variables and functions), which improves the performance of the calculations. However, this implies that, by default, we assume that variables are independent. Otherwise, the correlation between dependent variables must be explicitly specified, or dependencies between variables in numerical expressions must be eliminated when possible. For instance, users are expected to simplify numerical expressions as much as possible to avoid duplication of uncertain variables.
In any case, should there be a need to deal with dependent variables, uint and ufloat mathematical operations allow specifying the correlation between them (see the User Guide).
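The effect of the independence assumption is easiest to see on an expression such as x - x, whose result is exactly 0 but which is treated as the difference of two unrelated variables unless the correlation is stated. The sketch below uses the standard first-order formula for a difference, u² = u_x² + u_y² − 2·r·u_x·u_y, where r is the correlation coefficient. The function name and signature are hypothetical; the User Guide describes how the library itself accepts correlations.

```python
import math


def sub_uncertainty(ux, uy, r=0.0):
    """Uncertainty of x - y given the correlation coefficient r in [-1, 1]."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("correlation must lie in [-1, 1]")
    variance = ux ** 2 + uy ** 2 - 2.0 * r * ux * uy
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(variance, 0.0))


# x - x with x = 3.5 ± 0.1
print(sub_uncertainty(0.1, 0.1))         # independence assumed: overestimated
print(sub_uncertainty(0.1, 0.1, r=1.0))  # full correlation: no uncertainty left
```

Simplifying x - x to 0 before evaluation gives the same result as the fully correlated case without any bookkeeping, which is why removing duplicated variables from expressions is recommended.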
The derivatives of mathematical expressions are not automatically handled by the Uncertainty Datatypes library, either. Again, this avoids keeping track of the value of derivatives, something that also impacts performance. Other unsupported features include automatic handling of arrays of uncertain numbers and higher-order analysis of error propagation.
If derivatives or these further features are needed, other libraries that provide them could be used instead.
- For example, the uncertainties package supports uncertainty propagation, variable correlation, derivatives, and integration with the NumPy package for scientific computation in Python.
- soerp is another uncertainty calculation package for Python that provides higher-order approximations of uncertainty. In particular, it supports a second-order analysis to error propagation. Mathematical functions, similar to those in the standard math module, can also be evaluated directly using this package.
- mcerp provides a stochastic calculator for Monte Carlo methods that uses latin-hypercube sampling to perform non-order specific error propagation (or uncertainty analysis).
The problem is that these implementations are sometimes too slow, e.g., when used in iterative methods. Furthermore, their comparison operations are too basic and not expressive enough: they return crisp boolean values, disregarding the inherent uncertainty that occurs in the comparison between uncertain numerical values. The Uncertainty Datatypes package successfully addresses these limitations.
In summary, the uncertain datatypes provided by the Uncertainty Datatypes library are well suited for applications that require the basic mechanisms for the propagation of uncertainty, efficient computation, and a closed algebra of datatypes. In particular, the comparison of two uncertain numeric values returns a probability, i.e., a ubool value, and Subjective Logic is implemented as a natural extension of probabilistic logic, and in turn of Boolean logic. More precisely, we implement the following type hierarchy: bool <: ubool <: sbool.
To install the Uncertainty Datatypes library, use the package manager pip:
pip install uncertainty-datatypes
Note: pip3 may be used instead of pip.
You can import all the u-datatypes and functions defined by the Uncertainty Datatypes library as follows:
from uncertainty.utypes import *
The companion Uncertainty Datatypes User Guide provides details about all supported datatypes and their associated operations.
Tests are organized in different files, one for each datatype (e.g., test_uint.py, test_ubool.py). The '.in' files in the folder 'test/test-files' (e.g., ufloatTest.in) contain many tests, each written on two lines: the first line is the Python expression and the second is its expected result. Tests written in the '.in' files are executed by 'test_files.py'.
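The two-line layout of the '.in' files lends itself to a very small runner. The sketch below shows one way such files could be read and checked. It is not the code in 'test_files.py': the function names, the skipping of blank lines, and the comparison of str(result) against the expected line are all assumptions.

```python
def parse_cases(text):
    """Split the file contents into (expression, expected) pairs."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if len(lines) % 2 != 0:
        raise ValueError("every expression needs an expected-result line")
    return list(zip(lines[0::2], lines[1::2]))


def run_cases(text, namespace=None):
    """Evaluate each expression and collect the cases that do not match."""
    failures = []
    for expression, expected in parse_cases(text):
        # The namespace would hold the names the expressions rely on.
        actual = str(eval(expression, dict(namespace or {})))
        if actual != expected:
            failures.append((expression, expected, actual))
    return failures


sample = """
1 + 2
3
'ab' * 2
abab
max(2, 5)
4
"""
print(run_cases(sample))  # only the last case fails
```

In a real run the namespace would presumably be filled with the names exported by uncertainty.utypes so that expressions can construct uncertain values.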
All tests can be executed using:
pytest
or
python -m unittest
Pull requests are welcome. For major changes, please open an issue to discuss what you would like to change.
Copyright (c) 2023 Atenea Research group:
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
The Uncertainty Datatypes library was initially developed in Java. This is the first version of this Python library (July 2023).
The following papers contain all the details about these datatypes:
- Manuel F. Bertoa, Loli Burgueño, Nathalie Moreno, Antonio Vallecillo. "Incorporating measurement uncertainty into OCL/UML primitive datatypes" Softw. Syst. Model. 19(5):1163-1189, 2020. https://doi.org/10.1007/s10270-019-00741-0
- Paula Muñoz, Loli Burgueño, Victor Ortiz, Antonio Vallecillo. "Extending OCL with Subjective Logic" J. Object Technol. 19(3): 3:1-15, 2020. https://doi.org/10.5381/jot.2020.19.3.a1
Examples of applications of the uncertainty datatypes presented here can be found in the following papers:
- Jean-Marc Jézéquel, Antonio Vallecillo. "Uncertainty-aware Simulation of Adaptive Systems" ACM Transactions on Modeling and Computer Simulation, 33(3):8:1-8:19, 2023. https://doi.org/10.1145/3589517
- Lola Burgueño, Paula Muñoz, Robert Clarisó, Jordi Cabot, Sébastien Gérard, Antonio Vallecillo. "Dealing with Belief Uncertainty in Domain Models" ACM Trans. Softw. Eng. Methodol. 32(2):31:1-31:34, 2023. https://doi.org/10.1145/3542947
- Francisco J. Navarrete, Antonio Vallecillo. "Introducing Subjective Knowledge Graphs" In Proc. of EDOC 2021. pp. 61-70, 2021. https://doi.org/10.1109/EDOC52215.2021.00017
- Nathalie Moreno, Manuel F. Bertoa, Loli Burgueño, Antonio Vallecillo. "Managing Measurement and Occurrence Uncertainty in Complex Event Processing Systems" IEEE Access 7:88026-88048, 2019. https://doi.org/10.1109/ACCESS.2019.2923953
- Victor Ortiz, Loli Burgueño, Antonio Vallecillo, Martin Gogolla. "Native Support for UML and OCL Primitive Datatypes Enriched with Uncertainty in USE" In Proc. of OCL@MoDELS 2019:59-66, 2019. https://ceur-ws.org/Vol-2513/paper5.pdf
- Nathalie Moreno, Manuel F. Bertoa, Gala Barquero, Loli Burgueño, Javier Troya, Adrián García-López, Antonio Vallecillo. "Managing Uncertain Complex Events in Web of Things Applications". In Proc. of ICWE 2018:349-357, 2018. https://doi.org/10.1007/978-3-319-91662-0_28
- Loli Burgueño, Manuel F. Bertoa, Nathalie Moreno, Antonio Vallecillo. "Expressing Confidence in Models and in Model Transformation Elements" In Proc. of MoDELS 2018: 57-66, 2018. https://doi.org/10.1145/3239372.3239394
For more information, please visit our research group's website: