Add privacy preserving tutorial
jfrery committed Sep 26, 2023
1 parent 290b17f commit 418695a
Showing 1 changed file with 78 additions and 0 deletions.
doc/tutorials/privacy_preserving.rst
@@ -0,0 +1,78 @@
#############################################
Privacy Preserving Inference with Concrete ML
#############################################

Concrete ML is a specialized library allowing the execution of machine learning models on encrypted data through Fully Homomorphic Encryption (FHE), thereby preserving data privacy.

To use models such as ``XGBClassifier``, start with the following import:

.. code:: python

    from concrete.ml.sklearn import XGBClassifier

***************************************
Performing Privacy Preserving Inference
***************************************

Initialization of an ``XGBClassifier`` can be done as follows:

.. code:: python

    # Any other XGBoost hyper-parameters can be passed alongside n_bits.
    classifier = XGBClassifier(n_bits=6)

where ``n_bits`` determines the precision of the input features. Note that a higher value of ``n_bits`` results in increased precision but also longer FHE execution time.

Other hyper-parameters available in the XGBoost library can also be used, as shown in the sketch below.
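
A minimal sketch, assuming illustrative XGBoost hyper-parameter values (``n_estimators`` and ``max_depth`` below are examples, not recommendations):

.. code:: python

    from concrete.ml.sklearn import XGBClassifier

    # n_bits controls input quantization precision; the XGBoost
    # hyper-parameters below are illustrative values only.
    classifier = XGBClassifier(
        n_bits=6,
        n_estimators=50,
        max_depth=4,
    )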

******************************
Model Training and Compilation
******************************

As with other scikit-learn style models, the classifier is trained with the ``.fit()`` method.

.. code:: python

    classifier.fit(X_train, y_train)

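For example, a minimal end-to-end sketch with a synthetic dataset (the dataset and the train/test split below are illustrative assumptions):

.. code:: python

    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split

    # Illustrative synthetic data; replace with your own dataset.
    X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=0
    )

    classifier.fit(X_train, y_train)
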
After training, the model can be compiled with a calibration dataset, potentially a subset of the training data:

.. code:: python

    classifier.compile(X_calibrate)

This calibration dataset, ``X_calibrate``, is used by Concrete ML to compute the precision (bit-width) of each intermediate value in the model. This is a necessary step to optimize the equivalent FHE circuit.
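
A minimal sketch, assuming the calibration set is a small subset of the training data (the subset size of 100 samples is an illustrative choice):

.. code:: python

    # A small, representative subset of the training data is enough
    # for calibration.
    X_calibrate = X_train[:100]
    classifier.compile(X_calibrate)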

****************************
FHE Simulation and Execution
****************************

To verify model accuracy in encrypted computations, you can run an FHE simulation:

.. code:: python

    predictions = classifier.predict(X_test, fhe="simulate")

This simulation can be used to evaluate the model: the resulting accuracy is representative of the actual FHE execution, without having to pay the cost of a real FHE run.
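
For instance, a short sketch evaluating the simulated predictions against the held-out labels (``y_test`` is assumed to come from the earlier train/test split):

.. code:: python

    from sklearn.metrics import accuracy_score

    # Accuracy of the FHE simulation on the held-out test set.
    print("Simulated FHE accuracy:", accuracy_score(y_test, predictions))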

When the model is ready, actual Fully Homomorphic Encryption execution can be performed:

.. code:: python

    predictions = classifier.predict(X_test, fhe="execute")

Note that ``fhe="execute"`` does not preserve privacy, as ``X_test`` is not encrypted here; instead, it allows developers to assess the model. For privacy-preserving inference, the model must be deployed. Concrete ML provides a deployment API to facilitate this process and ensure end-to-end privacy, although the specifics of deployment are outside the scope of this tutorial. Please refer to:
- the `deployment documentation <https://docs.zama.ai/concrete-ml/advanced-topics/client_server>`_
- the `deployment notebook <https://github.com/zama-ai/concrete-ml/blob/17779ca571d20b001caff5792eb11e76fe2c19ba/docs/advanced_examples/ClientServer.ipynb>`_

*******************************
Parameter Tuning in Concrete ML
*******************************

Concrete ML models are compatible with standard scikit-learn tools such as ``Pipeline`` and ``GridSearchCV``, as well as any other hyper-parameter tuning technique, as shown in the sketch below.
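
A minimal sketch, assuming the synthetic dataset from the earlier example and an illustrative parameter grid:

.. code:: python

    from sklearn.model_selection import GridSearchCV

    from concrete.ml.sklearn import XGBClassifier

    # Illustrative parameter grid; adjust to your use case.
    param_grid = {
        "n_bits": [4, 6],
        "max_depth": [2, 4],
        "n_estimators": [10, 50],
    }

    grid_search = GridSearchCV(XGBClassifier(), param_grid, cv=3, scoring="accuracy")
    grid_search.fit(X_train, y_train)
    print("Best parameters:", grid_search.best_params_)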

**********
Conclusion
**********

Concrete ML provides a framework for executing privacy-preserving inferences by leveraging Fully Homomorphic Encryption, allowing secure and private computations on encrypted data.
