This repository contains the source code and related information for the binarized neural network (BNN) verification datasets submitted to the Max-SAT Evaluation 2020 and MIPLIB 2024.
$ pip3 install -r requirements.txt
Trained model weights are included in the models/ directory. Use these models to reproduce the same problem instances that we submitted to the Max-SAT Evaluation 2020.
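If you only want to inspect the stored weights, the .npz files can be opened with NumPy. The array names and shapes are not documented here, so the following sketch simply enumerates whatever each archive contains:

```python
# Sketch: list the arrays stored in one of the model files.
# The exact array names and shapes depend on how the models were saved,
# so we only enumerate what is present instead of assuming specific names.
import numpy as np

with np.load("models/mnist.npz") as weights:
    for name in weights.files:
        print(name, weights[name].shape, weights[name].dtype)
```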
models/Verifying_Properties_of_Binarized_Deep_Neural_Networks.ipynb is the notebook we used to train those models on Google Colaboratory. Note, however, that this code is older than the rest of the code in this repository and should be used with caution.
Run the following to generate the same data set submitted to Max-SAT Evaluation 2020:
$ python3 generate_maxsat_instances.py --dataset mnist --model models/mnist.npz -o outdir \
--format wcnf --card totalizer --norm inf --target adversarial --instances-per-class 2
$ python3 generate_maxsat_instances.py --dataset mnist_rot --model models/mnist_rot.npz -o outdir \
--format wcnf --card totalizer --norm inf --target adversarial --instances-per-class 2
$ python3 generate_maxsat_instances.py --dataset mnist_back_image --model models/mnist_back_image.npz -o outdir \
--format wcnf --card totalizer --norm inf --target adversarial --instances-per-class 2
You can also specify an individual sample by using --instance-no instead of --instances-per-class.
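The --card option selects how the cardinality constraints in the encoding are translated into CNF. As a rough illustration of what the totalizer option refers to, here is a generic sketch of a totalizer-style at-most-k encoding; it is not necessarily the exact construction used by generate_maxsat_instances.py:

```python
# Generic sketch of a totalizer-style "at most k" encoding, shown only to
# illustrate the idea behind the --card totalizer option.

class CNF:
    def __init__(self):
        self.clauses = []   # each clause is a list of non-zero integer literals
        self.num_vars = 0

    def new_var(self):
        self.num_vars += 1
        return self.num_vars

    def add(self, clause):
        self.clauses.append(list(clause))


def totalizer(cnf, lits):
    """Return output literals out[0..n-1]; out[i] means 'at least i+1 inputs are true'."""
    if len(lits) == 1:
        return list(lits)
    mid = len(lits) // 2
    left = totalizer(cnf, lits[:mid])
    right = totalizer(cnf, lits[mid:])
    out = [cnf.new_var() for _ in range(len(lits))]
    for i in range(len(left) + 1):
        for j in range(len(right) + 1):
            if i + j == 0:
                continue
            # If at least i inputs are true on the left and at least j on the
            # right, then at least i + j inputs are true overall.
            clause = [out[i + j - 1]]
            if i > 0:
                clause.append(-left[i - 1])
            if j > 0:
                clause.append(-right[j - 1])
            cnf.add(clause)
    return out


def at_most_k(cnf, lits, k):
    out = totalizer(cnf, lits)
    for i in range(k, len(out)):
        cnf.add([-out[i]])  # forbid "at least i+1 inputs are true" for i+1 > k


if __name__ == "__main__":
    cnf = CNF()
    xs = [cnf.new_var() for _ in range(5)]
    at_most_k(cnf, xs, 2)   # at most 2 of x1..x5 may be true
    print(cnf.num_vars, "variables,", len(cnf.clauses), "clauses")
```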
To generate MIP instances instead, you can use generate_mip_instances.py:
$ python3 generate_mip_instances.py --dataset mnist --model models/mnist.npz -o outdir \
--norm inf --target adversarial --instances-per-class 2
$ python3 generate_mip_instances.py --dataset mnist_rot --model models/mnist_rot.npz -o outdir \
--norm inf --target adversarial --instances-per-class 2
$ python3 generate_mip_instances.py --dataset mnist_back_image --model models/mnist_back_image.npz -o outdir \
--norm inf --target adversarial --instances-per-class 2
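If you have Gurobi and gurobipy installed, a generated instance can then be solved roughly as follows. The instance file name and extension below are placeholders; check outdir for the files actually produced by the script:

```python
# Sketch: solve one generated MIP instance with Gurobi and keep the solution
# file for verify_solution.py. The instance path is a placeholder; use the
# actual file written to outdir by generate_mip_instances.py.
import gurobipy as gp

model = gp.read("outdir/INSTANCE_FILE.lp")   # Gurobi also accepts .mps and compressed variants
model.optimize()
if model.SolCount > 0:
    model.write("solution.sol")              # this .sol file can be passed as SOLUTION_FILE
```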
Once the solver successfully solves a problem instance, you can check the solution as follows:
$ python3 verify_solution.py --dataset mnist --instance 7 \
--output-image perturbated.png \
--output-orig-image orig.png \
--format maxsat \
SOLUTION_FILE
This converts the solution in SOLUTION_FILE to an image file named perturbated.png, and also provides some information:
- the model's predictions (probability distribution over the digit classes and the predicted class) on the original image and the perturbated image, and
- the norms of the perturbation (see the sketch below).
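The norms are simple functions of the pixel-wise difference between the two images. Here is a minimal sketch using NumPy and Pillow; it is not the script's actual code, and depending on how the PNGs are scaled relative to the model's input range, the values may differ from the script's output by a constant factor:

```python
# Sketch: recompute the perturbation norms from the two output images.
# The file names match the verify_solution.py command shown above.
import numpy as np
from PIL import Image

orig = np.asarray(Image.open("orig.png"), dtype=np.float64)
perturbated = np.asarray(Image.open("perturbated.png"), dtype=np.float64)
diff = (perturbated - orig).ravel()

print("L0-norm: ", np.count_nonzero(diff))   # number of changed pixels
print("L1-norm: ", np.sum(np.abs(diff)))
print("L2-norm: ", np.linalg.norm(diff))
print("L∞-norm:", np.max(np.abs(diff)))
```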
You need to provide the same dataset and instance number as the ones used to generate the problem. If you generated a problem using --instances-per-class, you can find the instance number in the filename.
--format specifies the format of SOLUTION_FILE:
- --format maxsat: SOLUTION_FILE is a file containing the output of a Max-SAT solver (only the v lines are used). The logs/maxino-pref-fixed/ directory contains log files that you can try.
- --format gurobi: SOLUTION_FILE is a solution file produced by Gurobi. The miplib2024_submission/additional_files/solutions/ directory contains some examples.
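For reference, both kinds of SOLUTION_FILE are plain text. The following sketch shows one way to read them; it assumes old-style v lines with space-separated literals (newer solvers may instead print a compact 0/1 string) and the usual Gurobi .sol layout of one name/value pair per line, with comment lines starting with #:

```python
# Sketch: extract a variable assignment from either kind of SOLUTION_FILE.
# verify_solution.py does this for you; this only shows what the files contain.

def read_maxsat_v_lines(path):
    """Return a dict {var: bool} from the solver's 'v' lines (literal-list style)."""
    assignment = {}
    with open(path) as f:
        for line in f:
            if not line.startswith("v"):
                continue
            for tok in line[1:].split():
                lit = int(tok)
                if lit != 0:
                    assignment[abs(lit)] = lit > 0
    return assignment

def read_gurobi_sol(path):
    """Return a dict {variable name: value} from a Gurobi .sol file."""
    values = {}
    with open(path) as f:
        for line in f:
            if line.startswith("#") or not line.strip():
                continue
            name, value = line.split()
            values[name] = float(value)
    return values
```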
For example, for bnn_mnist_rot_10_label4_adversarial_norm_inf_totalizer.wcnf:
Added perturbation:
- L0-norm: 18.0
- L1-norm: 18.0
- L2-norm: 4.242640687119285
- L∞-norm: 1.0
- Description
- Submitted instances (29.62 GB)
The competition results and organizer's slides are available on the competition website.
Among the 60 submitted instances, 5 (maxsat2020_bnn_verification_used.tar.gz, 2.5 GB) were used in the competition:
Solving time (in seconds; 3600.0 means timeout):
Instance | maxino-pref | maxino | Pacose | UWrMaxSat | MaxHS | QMaxSAT | RC2-B / RC2-A / smax-minisat / smax-mergesat |
---|---|---|---|---|---|---|---|
… | 270.62 | 269.06 | 402.17 | 648.45 | 991.52 | 141.42 | 3600.0 |
… | 279.84 | 277.76 | 1101.24 | 795.81 | 1733.77 | 1729.06 | 3600.0 |
… | 367.28 | 367.06 | 221.87 | 657.69 | 1006.6 | 704.83 | 3600.0 |
… | 84.87 | 84.06 | 347.71 | 588.25 | 1083.57 | 3600.0 | 3600.0 |
… | 2215.51 | 2232.61 | 3600.0 | 3600.0 | 3600.0 | 3600.0 | 3600.0 |
Optimum values and solution examples generated by maxino-pref-fixed:
Instance | Minimum ‖τ‖∞ | Original Image | Predicted Label | Perturbated Image† | Predicted Label |
---|---|---|---|---|---|
… | 1 | (image) | 9 | (image) | 5 |
… | 2 | (image) | 3 | (image) | 8 |
… | 1 | (image) | 5 | (image) | 7 |
… | 1 | (image) | 1 | (image) | 3 |
… | 4 | (image) | 5 | (image) | 3 |
†: These were obtained by running maxino-pref-fixed locally, so they may differ from those obtained during the contest.
- Q: In several samples used in the contest, the images do not look like the numbers shown as their labels.
- A: This problem was caused by my misunderstanding of the order of the features in the MNIST-rot and MNIST-back-image datasets (MNIST does not have this problem). As a result, the images were rotated and flipped relative to their original form. The features should have been reordered during preprocessing when the datasets were created. However, this is a visualization-only issue, since training and inference treat the data in a consistent manner.
- Q: What happens if two classes have the same maximum logit value?
- A: It is common for actual implementations to return the class with the smallest index (e.g. numpy.argmax and torch.argmax). However, a safety property that relies on such an assumption is not robust (in particular in the case of floating-point numbers). Therefore, we did not specify which of the maximum-logit classes is chosen and allowed one of them to be chosen non-deterministically, similar to how unspecified behavior is modeled with non-determinism when model checking safety properties. In other words, we are checking whether it is possible for an incorrect class to have a logit value at least as large as the one for the correct class.
- Q: Are there images of successfully perturbated cases?
- A: I added several examples above.
- Q: You said that using sequential counters produced a much larger file than using the totalizer. However, their sizes should be close both theoretically and empirically.
- A: Since I didn't keep the results of the experiments with sequential counters, I re-ran the experiment using the sequential counter encoding. The result for the 7th sample of the MNIST test dataset (label = 9) in the case of the L∞ norm is as follows:

  encoding | file size (uncompressed) | #variables | #constraints | Download |
  ---|---|---|---|---|
  sequential counters | 5,076,826,688 B | 64,168,245 | 192,424,087 | bnn_mnist_7_label9_adversarial_norm_inf_sequential.wcnf.zst |
  totalizer | 3,871,189,890 B | 1,824,676 | 132,670,200 | bnn_mnist_7_label9_adversarial_norm_inf_totalizer.wcnf.zst |
  parallel counters | 94,898,008 B | 516,006 | 3,227,479 | bnn_mnist_7_label9_adversarial_norm_inf_parallel.wcnf.zst |
  (PB constraints in WBO file) | 3,215,049 B | 1,452 | 2,191 | bnn_mnist_7_label9_adversarial_norm_inf.wbo.zst |

  The file size and the number of constraints are indeed larger with sequential counters, but the difference is not that large (although there is a huge difference in the number of variables). Therefore, it appears that I misremembered the results.
- See the miplib2024_submission/ directory.