Development

Updated: 22 Dec 2023

This document is meant for the day when I (Gavin D. Howard) get hit by a bus. In other words, it's meant to make the bus factor a non-issue.

This document is supposed to contain all of the knowledge necessary to develop bc and dc.

In addition, this document is meant to add to the oral tradition of software engineering, as described by Bryan Cantrill.

This document will reference other parts of the repository so that most of the documentation can live closest to the part of the repo where it is actually needed.

What Is It?

This repository contains an implementation of both POSIX bc and Unix dc.

POSIX bc is a standard utility required for POSIX systems. dc is a historical utility that was included in early Unix and even predates both Unix and C. They both are arbitrary-precision command-line calculators with their own programming languages. bc's language looks similar to C, with infix notation and functions, while dc uses Reverse Polish Notation and allows the user to execute strings as though they were functions.

In addition, it is also possible to build the arbitrary-precision math as a library, named bcl.

Note: for ease, I will refer to both programs as bc in this document. However, if I say "just bc," I am referring to just bc, and if I say dc, I am referring to just dc.

History

This project started in January 2018 when a certain individual on IRC, hearing that I knew how to write parsers, asked me to write a bc parser for his math library. I did so. I thought about writing my own math library, but he disparaged my programming skills and made me think that I couldn't do it.

However, he took so long to do it that I eventually decided to give it a try and had a working math portion in two weeks. It taught me that I should not listen to such people.

From that point, I decided to make it an extreme learning experience about how to write quality software.

That individual's main goal had been to get his bc into toybox, and I managed to get my own bc in. I also got it in busybox.

Eventually, in late 2018, I also decided to try my hand at implementing Karatsuba multiplication, an algorithm that the same unnamed individual claimed I could never implement. It took me a bit, but I did it.

This project became a passion project for me, and I continued. In mid-2019, Stefan Eßer suggested I improve performance by putting more than 1 digit in each section of the numbers. After I showed immaturity because of some burnout, I implemented his suggestion, and the results were incredible.

Since that time, I have gradually been improving the bc as I have learned more about things like fuzzing, scan-build, valgrind, AddressSanitizer (and the other sanitizers), and many other things.

One of my happiest moments was when my bc was made the default in FreeBSD. Another was when I found out that my bc had shipped with macOS Ventura, without my knowledge.

But since I believe in finishing the software I write, I have done less work on bc over time, though there are still times when I put a lot of effort in, such as now (17 June 2021), when I am attempting to convince OpenBSD to use my bc.

And that is why I am writing this document: someday, someone else is going to want to change my code, and this document is my attempt to make it as simple as possible.

Values

According to Bryan Cantrill, all software has values. I think he's correct, though I added one value for programming languages in particular.

However, for bc, his original list will do:

Approachability
Availability
Compatibility
Composability
Debuggability
Expressiveness
Extensibility
Interoperability
Integrity
Maintainability
Measurability
Operability
Performance
Portability
Resiliency
Rigor
Robustness
Safety
Security
Simplicity
Stability
Thoroughness
Transparency
Velocity

Several of these values don't apply because bc and dc are existing utilities; this is just another reimplementation. The designs of bc and dc are set in stone, and there is nothing we can do to change them, so let's get rid of the values that would apply to their design. That leaves:

Compatibility
Integrity
Maintainability
Measurability
Performance
Portability
Resiliency
Rigor
Robustness
Safety
Security
Simplicity
Stability
Thoroughness
Transparency

Furthermore, some of the remaining ones don't matter to me, so let me get rid of those and order the rest according to my actual values for this project:

Robustness
Stability
Portability
Compatibility
Performance
Security
Simplicity

First is robustness. This bc and dc should be robust: they should accept any input and never crash, returning an error instead.

Closely related to that is stability. The execution of bc and dc should be deterministic and never change for the same inputs, including the pseudo-random number generator (for the same seed).

Third is portability. These programs should run everywhere that POSIX exists, as well as Windows. This means that just about every person on the planet will have access to these programs.

Next is compatibility. These programs should, as much as possible, be compatible with other existing implementations and standards.

Then we come to performance. A calculator is only usable if it's fast, so these programs should run as fast as possible.

After that is security. These programs should never be the reason a user's computer is compromised.

And finally, simplicity. Where possible, the code should be simple, while deferring to the above values.

Keep these values in mind for the rest of this document, and for exploring any other part of this repo.

Portability

But before I go on, I want to talk about portability in particular.

Most of these principles just require good attention and care, but portability is different. Sometimes, it requires pulling in code from other places and adapting it. In other words, sometimes I need to duplicate and adapt code.

This happened in a few cases:

Option parsing (see include/opt.h).
History (see include/history.h).
Pseudo-Random Number Generator (see include/rand.h).

This was done because I decided to ensure that bc's dependencies were basically zero. In particular, either users have a normal install of Windows or they have a POSIX system.

A POSIX system limited me to C99, sh, and zero external dependencies. That last item is why I pull code into bc: if I pull it in, it's not an external dependency.

That's why bc has duplicated code. Remove it, and you risk bc not being portable to some platforms.

Suggested Course

I do have a suggested course for programmers to follow when trying to understand this codebase. The order is this:

bc Spec.
Manpages.
Test suite.
Understand the build.
Algorithms manual.
Code concepts.
Repo structure.
Headers.
Source code.

This sequence roughly follows this progression:

High-level requirements
Low-level requirements
High-level implementation
Low-level implementation

In other words, first understand what the code is supposed to do, then understand the code itself.

Useful External Tools

I have a few tools external to bc that are useful:

A Vim plugin with syntax files made specifically for my bc and dc.
A repo of bc and dc scripts.
A set of bash aliases (see below).
A .bcrc file with items useful for my bash setup (see below).

My bash aliases are these:

alias makej='make -j16'
alias mcmake='make clean && make'
alias mcmakej='make clean && make -j16'
alias bcdebug='CPPFLAGS="-DBC_DEBUG_CODE=1" CFLAGS="-Weverything -Wno-padded \
    -Wno-switch-enum -Wno-format-nonliteral -Wno-cast-align \
    -Wno-unreachable-code-return -Wno-missing-noreturn \
    -Wno-disabled-macro-expansion -Wno-unreachable-code -Wall -Wextra \
    -pedantic -std=c99" ./configure.sh'
alias bcconfig='CFLAGS="-Weverything -Wno-padded -Wno-switch-enum \
    -Wno-format-nonliteral -Wno-cast-align -Wno-unreachable-code-return \
    -Wno-missing-noreturn -Wno-disabled-macro-expansion -Wno-unreachable-code \
    -Wall -Wextra -pedantic -std=c99" ./configure.sh'
alias bcnoassert='CPPFLAGS="-DNDEBUG" CFLAGS="-Weverything -Wno-padded \
    -Wno-switch-enum -Wno-format-nonliteral -Wno-cast-align \
    -Wno-unreachable-code-return -Wno-missing-noreturn \
    -Wno-disabled-macro-expansion -Wno-unreachable-code -Wall -Wextra \
    -pedantic -std=c99" ./configure.sh'
alias bcdebugnoassert='CPPFLAGS="-DNDEBUG -DBC_DEBUG_CODE=1" \
    CFLAGS="-Weverything -Wno-padded -Wno-switch-enum -Wno-format-nonliteral \
    -Wno-cast-align -Wno-unreachable-code-return -Wno-missing-noreturn \
    -Wno-disabled-macro-expansion -Wno-unreachable-code -Wall -Wextra \
    -pedantic -std=c99" ./configure.sh'
alias bcunset='unset BC_LINE_LENGTH && unset BC_ENV_ARGS'

makej runs make with all of my cores.

mcmake runs make clean before running make. It will take a target on the command-line.

mcmakej is a combination of makej and mcmake.

bcdebug configures bc for a full debug build, including BC_DEBUG_CODE (see Debugging below).

bcconfig configures bc with Clang (Clang is my personal default compiler) using full warnings, with a few really loud and useless warnings turned off.

bcnoassert configures bc to not have asserts built in.

bcdebugnoassert is like bcnoassert, except it also configures bc for debug mode.

bcunset unsets my personal bc environment variables, which are set to:

export BC_ENV_ARGS="-l $HOME/.bcrc"
export BC_LINE_LENGTH="74"

Unsetting these environment variables is necessary for running scripts/release.sh because otherwise, it will error when attempting to run bc -s on my $HOME/.bcrc.

Speaking of which, the contents of that file are:

define void print_time_unit(t){
	if(t<10)print "0"
	if(t<1&&t)print "0"
	print t,":"
}
define void sec2time(t){
	auto s,m,h,d,r
	r=scale
	scale=0
	t=abs(t)
	s=t%60
	t-=s
	m=t/60%60
	t-=m
	h=t/3600%24
	t-=h
	d=t/86400
	if(d)print_time_unit(d)
	if(h)print_time_unit(h)
	print_time_unit(m)
	if(s<10)print "0"
	if(s<1&&s)print "0"
	s
	scale=r
}
define minutes(secs){
	return secs/60;
}
define hours(secs){
	return secs/3600;
}
define days(secs){
	return secs/3600/24;
}
define years(secs){
	return secs/3600/24/365.25;
}
define fbrand(b,p){
	auto l,s,t
	b=abs(b)$
	if(b<2)b=2
	s=scale
	t=b^abs(p)$
	l=ceil(l2(t),0)
	if(l>scale)scale=l
	t=irand(t)/t
	scale=s
	return t
}
define ifbrand(i,b,p){return irand(abs(i)$)+fbrand(b,p)}

This allows me to use bc as part of my bash prompt.

Code Style

The code style for bc is...weird, and that comes from historical accident.

In History, I mentioned how I got my bc in toybox. Well, in order to do that, my bc originally had toybox style. Eventually, I changed to using tabs, and assuming they were 4 spaces wide, but other than that, I basically kept the same style, with some exceptions that are more or less dependent on my taste.

However, I later managed to get ClangFormat to work, so I changed the style to that.

ClangFormat

The style is now defined as whatever ClangFormat outputs using the existing .clang-format file. More precisely, the style is whatever is output when the following command is run in the root directory:

./scripts/format.sh

Historical Style

The code style used to be:

Tabs are 4 spaces.
Tabs are used at the beginning of lines for indent.
Spaces are used for alignment.
Lines are limited to 80 characters, period.
Pointer asterisk (*) goes with the variable (on the right), not the type, unless it is for a pointer type returned from a function.
The opening brace is put on the same line as the header for the function, loop, or if statement.
Unless the header is more than one line, in which case the opening brace is put on its own line.
If the opening brace is put on its own line, there is no blank line after it.
If the opening brace is not put on its own line, there is a blank line after it, unless the block is only one or two lines long.
Code lines are grouped into what I call "paragraphs." Basically, lines that seem like they should go together are grouped together. This one comes down to judgment.
Bodies of if statements, else statements, and loops that are one line long are put on the same line as the statement, unless the header is more than one line long and/or the header and body cannot fit into 80 characters with a space in between them.
If single-line bodies are on a separate line from their headers, and the headers are only a single line, then no braces are used.
However, braces are always used if they contain another if statement or loop.
Loops with empty bodies are ended with a semicolon.
Expressions that return a boolean value are surrounded by parentheses.
Macro backslashes are aligned as far to the left as possible.
Binary operators have spaces on both sides.
If a line with binary operators overflows 80 characters, a newline is inserted after binary operators.
Function modifiers and return types are on the same line as the function name.
With one exception, goto's are only used to jump to the end of a function for cleanup.
All structs, enums, and unions are typedef'ed.
All constant data is in one file: src/data.c, but the corresponding extern declarations are in the appropriate header file.
All local variables are declared at the beginning of the scope where they appear. They may be initialized at that point, if it does not invoke UB or otherwise cause bugs.
All precondition assert()'s (see Asserts) come after local variable declarations.
Besides short if statements and loops, there should never be more than one statement per line.

Repo Structure

Functions are documented with Doxygen-style doc comments. Functions that appear in headers are documented in the headers, while static functions are documented where they are defined.

`configure`

A symlink to configure.sh.

`configure.sh`

This is the script to configure bc and bcl for building.

This bc has a custom build system. The reason for this is portability.

If bc used an outside build system, that build system would be an external dependency. Thus, I had to write a build system for bc that used nothing but C99 and POSIX utilities.

One of those utilities is POSIX sh, which technically implements a Turing-complete programming language. It's a terrible one, but it works.

A user who wants to build bc on a POSIX system (not Windows) first runs configure.sh with the desired options. configure.sh uses those options and the Makefile template (Makefile.in) to generate an actual valid Makefile. Then make can do the rest.

For more information about the build process, see the Build System section and the build manual.

For more information about shell scripts, see POSIX Shell Scripts.

configure.sh does the following:

It processes command-line arguments and figures out what the user wants to build.
It reads in Makefile.in.
One-by-one, it replaces placeholders (in Makefile.in) of the form %%<placeholder_name>%% based on the build type.
It appends a list of file targets based on the build type.
It appends the correct test targets.
It copies the correct manpage and markdown manual for bc and dc into a location from which they can be copied for install.
It does a make clean to reset the build state.
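
The placeholder step above can be sketched in POSIX sh. This is a minimal illustration, not the code in configure.sh: the helper names `replace_placeholder` and `pick_cflags`, the placeholder name `EXAMPLE_CFLAGS`, and the flag values are all made up for the example.

```sh
#!/bin/sh

# Hypothetical helper: replace every %%NAME%% in the text on stdin with VALUE.
# Assumes VALUE contains no '|', '&', backslash, or newline, since those are
# special to sed in the replacement part of this command.
replace_placeholder() {
	sed "s|%%$1%%|$2|g"
}

# Hypothetical helper: pick a value based on the build type. The build type
# names and the flags printed here are assumptions for illustration only.
pick_cflags() {
	if [ "$1" = "debug" ]; then
		printf '%s\n' '-g -O0'
	else
		printf '%s\n' '-O3'
	fi
}
```

For instance, piping the line `CFLAGS = %%EXAMPLE_CFLAGS%%` through `replace_placeholder EXAMPLE_CFLAGS "$(pick_cflags release)"` yields `CFLAGS = -O3`. Making the decision in sh means the resulting Makefile needs no conditionals of its own.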

`.gitattributes`

A .gitattributes file. This is needed to preserve the CRLF line endings in the Visual Studio files.

`.gitignore`

The .gitignore file.

`LICENSE.md`

This is the LICENSE file, including the licenses of various software that I have borrowed.

`Makefile.in`

This is the Makefile template for configure.sh to use for generating a Makefile.

For more information, see configure.sh, the Build System section, and the build manual.

Because of portability, Makefile.in should be a pure POSIX make-compatible Makefile (minus the placeholders). Here are a few snares for the unwary programmer in this file:

No extensions allowed, including and especially GNU extensions.
If new headers are added, they must also be added to Makefile.in.
Don't delete the .POSIX: empty target at the top; that's what tells make implementations that pure POSIX make is needed.

In particular, there is no way to set variables other than with the = operator. There are no conditionals, so all of the conditional stuff must be in configure.sh. This is, in fact, why configure.sh exists in the first place: POSIX make is barebones and only does a build with no configuration.

`NEWS.md`

A running changelog with an entry for each version. This should be updated at the same time that include/version.h is.

`NOTICE.md`

The NOTICE file with proper attributions.

`README.md`

The README. Read it.

`benchmarks/`

The folder containing files to generate benchmarks.

Each of these files was made, at one time or another, to benchmark some experimental feature, so if it seems there is no rhyme or reason to these benchmarks, it is because there is none, besides historical accident.

`bc/`

The folder containing bc scripts to generate bc benchmarks.

`add.bc`

The file to generate the benchmark to benchmark addition in bc.

`arrays_and_constants.bc`

The file to generate the benchmark to benchmark bc using lots of array names and constants.

`arrays.bc`

The file to generate the benchmark to benchmark bc using lots of array names.

`constants.bc`

The file to generate the benchmark to benchmark bc using lots of constants.

`divide.bc`

The file to generate the benchmark to benchmark division in bc.

`functions.bc`

The file to generate the benchmark to benchmark bc using lots of functions.

`irand_long.bc`

The file to generate the benchmark to benchmark bc using lots of calls to irand() with large bounds.

`irand_short.bc`

The file to generate the benchmark to benchmark bc using lots of calls to irand() with small bounds.

`lib.bc`

The file to generate the benchmark to benchmark bc using lots of calls to heavy functions in lib.bc.

`multiply.bc`

The file to generate the benchmark to benchmark multiplication in bc.

`newton_raphson_div_large.bc`

The file to generate the benchmark to benchmark the Newton-Raphson division in GitHub PR #72 with large numbers.

`newton_raphson_div_small.bc`

The file to generate the benchmark to benchmark the Newton-Raphson division in GitHub PR #72 with small numbers.

`newton_raphson_sqrt_large.bc`

The file to generate the benchmark to benchmark the Newton-Raphson square root in GitHub PR #72 with large numbers.

`newton_raphson_sqrt_small.bc`

The file to generate the benchmark to benchmark the Newton-Raphson square root in GitHub PR #72 with small numbers.

`postfix_incdec.bc`

The file to generate the benchmark to benchmark bc using postfix increment and decrement operators.

`power.bc`

The file to generate the benchmark to benchmark power (exponentiation) in bc.

`subtract.bc`

The file to generate the benchmark to benchmark subtraction in bc.

`strings.bc`

The file to generate the benchmark to benchmark bc using lots of strings.

`dc/`

The folder containing dc scripts to generate dc benchmarks.

`modexp.dc`

The file to generate the benchmark to benchmark modular exponentiation in dc.

`gen/`

A folder containing the files necessary to generate C strings that will be embedded in the executable.

All of the files in this folder have license headers, but the program and script that can generate strings from them include code to strip the license header out before strings are generated.

`bc_help.txt`

A text file containing the text displayed for bc -h or bc --help.

This text just contains the command-line options and a short summary of the differences from GNU and BSD bc's. It also directs users to the manpage.

The reason for this is that otherwise, the help would be far too long to be useful.

Warning: The text has some printf() format specifiers. You need to make sure the format specifiers match the arguments given to bc_file_printf().

`dc_help.txt`

A text file containing the text displayed for dc -h or dc --help.

This text just contains the command-line options and a short summary of the differences from GNU and BSD dc's. It also directs users to the manpage.

The reason for this is that otherwise, the help would be far too long to be useful.

Warning: The text has some printf() format specifiers. You need to make sure the format specifiers match the arguments given to bc_file_printf().

`lib.bc`

A bc script containing the standard math library required by POSIX. See the POSIX standard for what is required.

This file does not have any extraneous whitespace, except for tabs at the beginning of lines. That is because this data goes directly into the binary, and whitespace is extra bytes in the binary. Thus, not having any extra whitespace shrinks the resulting binary.

However, tabs at the beginning of lines are kept for two reasons:

Readability. (This file is still code.)
The program and script that generate strings from this file can remove tabs at the beginning of lines.
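
The second point can be pictured as a one-line filter. This is only a sketch of the idea, not the code in strgen.c or strgen.sh; the function name `strip_leading_tabs` is hypothetical.

```sh
#!/bin/sh

# Hypothetical helper: delete all tabs at the start of each line on stdin.
# Tabs anywhere else in a line are left alone.
strip_leading_tabs() {
	# POSIX sed has no \t escape, so put a literal tab into the pattern.
	tab=$(printf '\t')
	sed "s/^${tab}*//"
}
```

Because the indentation is removed before the text is embedded, the tabs cost nothing in the binary while keeping the source readable.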

For more details about the algorithms used, see the algorithms manual.

However, there are a few snares for unwary programmers.

First, all constants must be one digit. This is because otherwise, multi-digit constants could be interpreted wrongly if the user uses a different ibase. This does not happen with single-digit numbers because they are guaranteed to be interpreted as the number they would be if the ibase were as high as possible.

This is why A is used in the library instead of 10, and things like 2*9*A for 180 in lib2.bc.
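
To see why, here is a small model of how a constant's value depends on ibase. It is an illustration only, not bc's parsing code: the function names `digit_value` and `const_value` are hypothetical, and the model assumes every digit of a multi-digit constant is below ibase.

```sh
#!/bin/sh

# Hypothetical helper: face value of one digit (0-9, A-F).
digit_value() {
	case "$1" in
		[0-9]) printf '%s\n' "$1" ;;
		A) printf '10\n' ;;
		B) printf '11\n' ;;
		C) printf '12\n' ;;
		D) printf '13\n' ;;
		E) printf '14\n' ;;
		F) printf '15\n' ;;
	esac
}

# Hypothetical model: value of the constant $1 when read with ibase $2.
# Each step multiplies the running value by ibase and adds the next digit,
# so a single digit is only ever multiplied against 0 and keeps its face value.
const_value() {
	str="$1"
	ibase="$2"
	val=0
	while [ -n "$str" ]; do
		rest="${str#?}"        # everything after the first character
		d="${str%"$rest"}"     # the first character itself
		val=$(( val * ibase + $(digit_value "$d") ))
		str="$rest"
	done
	printf '%s\n' "$val"
}
```

In this model, `const_value 10 2` prints 2 and `const_value 10 16` prints 16, while `const_value A 2` and `const_value A 16` both print 10. That is why an expression built from single digits, such as 2*9*A, means 180 under any ibase.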

As an alternative, you can set ibase in the function, but if you do, make sure to set it with a single-digit number and beware the snare below...

Second, scale, ibase, and obase must be safely restored before returning from any function in the library. This is because without the -g option, functions are allowed to change any of the globals.

Third, all local variables in a function must be declared in an auto statement before doing anything else. This includes arrays. However, function parameters are considered predeclared.

Fourth, and this is only a snare for lib.bc, not lib2.bc, the code must not use any extensions. It has to work when users use the -s or -w flags.

`lib2.bc`

A bc script containing the extended math library.

Like lib.bc, and for the same reasons, this file should have no extraneous whitespace, except for tabs at the beginning of lines.

For more details about the algorithms used, see the algorithms manual.

Also, be sure to check lib.bc for the snares that can trip up unwary programmers when writing code for lib2.bc.

`strgen.c`

Code for the program to generate C strings from text files. This is the original program, although strgen.sh was added later.

The reason I used C here is because even though I knew sh would be available (it must be available to run configure.sh), I didn't know how to do what I needed to do with POSIX utilities and sh.

Later, strgen.sh was contributed by Stefan Eßer of FreeBSD, showing that it could be done with sh and POSIX utilities.

However, strgen.c still exists because the versions generated by strgen.sh may technically hit an environmental limit. (See the draft C99 standard, page 21.) This is because strgen.sh generates string literals, and in C99, string literals can be limited to 4095 characters, and gen/lib2.bc is above that.

Fortunately, the limit for "objects," which include char arrays, is much bigger: 65535 bytes, so that's what strgen.c generates.
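
To make the difference concrete, the sketch below emits a file's bytes as a char array initializer rather than as a string literal, so the result counts against the object limit instead of the string literal limit. It is written in sh purely for illustration and is not how strgen.c or strgen.sh work; the function name `gen_char_array` and the output layout are assumptions.

```sh
#!/bin/sh

# Hypothetical helper: print the contents of FILE ($2) as a C char array
# named NAME ($1), one decimal byte value per element, followed by a
# terminating 0. Assumes the file holds plain ASCII text.
gen_char_array() {
	name="$1"
	file="$2"

	printf 'const char %s[] = {' "$name"
	# od prints each byte as an unsigned decimal number; the unquoted
	# command substitution splits that output on whitespace.
	for byte in $(od -A n -v -t u1 "$file"); do
		printf '%s,' "$byte"
	done
	printf '0};\n'
}
```

For a file containing the three bytes `hi` plus a newline, `gen_char_array greeting file` prints `const char greeting[] = {104,105,10,0};`.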

However, the existence of strgen.c does come with a cost: the build needs a C99 compiler that targets the host machine. For more information, see the "Cross Compiling" section of the build manual.

Read the comments in strgen.c for more detail about it, the arguments it takes, and how it works.

`strgen.sh`

An sh script that generates C strings using only POSIX utilities. This exists for those situations where a host C99 compiler is not available and the environmental limits mentioned above in strgen.c don't matter.

strgen.sh takes the same arguments as strgen.c, and the arguments mean the exact same things, so see the comments in strgen.c for more detail about that, and see the comments in strgen.sh for more details about it and how it works.

For more information about shell scripts, see POSIX Shell Scripts.

`include/`

A folder containing the headers.

The headers are not included among the source code because I like it better that way. Also there were folders within src/ at one point, and I did not want to see #include "../some_header.h" or things like that.

So all headers are here, even though only one (bcl.h) is meant for end users (to be installed in INCLUDEDIR).

`args.h`

This file is the API for processing command-line arguments.

`bc.h`

This header is the API for bc-only items. This includes the bc_main() function and the bc-specific lexing and parsing items.

The bc parser is perhaps the most sensitive part of the entire codebase. See the documentation in bc.h for more information.

The code associated with this header is in src/bc.c, src/bc_lex.c, and src/bc_parse.c.

`bcl.h`

This header is the API for the bcl library.

This header is meant for distribution to end users and contains the API that end users of bcl can use in their own software.

This header, because it's the public header, is also the root header. That means that it has platform-specific fixes for Windows. (If the fixes were not in this header, the build would fail on Windows.)

The code associated with this header is in src/library.c.

`dc.h`

This header is the API for dc-only items. This includes the dc_main() function and the dc-specific lexing and parsing items.

The code associated with this header is in src/dc.c, src/dc_lex.c, and src/dc_parse.c.

`file.h`

This header is for bc's internal buffered I/O API.

For more information about bc's error handling and custom buffered I/O, see Error Handling and Custom I/O, along with status.h and the notes about version 3.0.0 in the NEWS.

The code associated with this header is in src/file.c.

`history.h`

This header is for bc's implementation of command-line editing/history, which is based on a UTF-8-aware fork of linenoise.

For more information, see the Command-Line History section.

The code associated with this header is in src/history.c.

`lang.h`

This header defines the data structures and bytecode used for actual execution of bc and dc code.

Yes, it's misnamed; that's an accident of history where the first things I put into it all seemed related to the bc language.

The code associated with this header is in src/lang.c.

`lex.h`

This header defines the common items that both programs need for lexing.

The code associated with this header is in src/lex.c, src/bc_lex.c, and src/dc_lex.c.

`library.h`

This header defines the things needed for bcl that users should not have access to. In other words, bcl.h is the public header for the library, and this header is the private header for the library.

The code associated with this header is in src/library.c.

`num.h`

This header is the API for numbers and math.

The code associated with this header is in src/num.c.

`opt.h`

This header is the API for parsing command-line arguments.

It's different from args.h in that args.h is for the main code to process the command-line arguments into global data after they have already been parsed by opt.h into proper tokens. In other words, opt.h actually parses the command-line arguments, and args.h turns that parsed data into flags (bits), strings, and expressions that will be used later.

Why are they separate? Because originally, bc used getopt_long() for parsing, so args.h was the only one that existed. After it was discovered that getopt_long() has different behavior on different platforms, I adapted a public-domain option parsing library to do the job instead. And in doing so, I gave it its own header.

They could probably be combined, but I don't really care enough at this point.

The code associated with this header is in src/opt.c.

`parse.h`

This header defines the common items that both programs need for parsing.

Note that the parsers don't produce abstract syntax trees (AST's) or any intermediate representations. They produce bytecode directly. In other words, they don't have special data structures except what they need to do their job.

The code associated with this header is in src/parse.c, src/bc_parse.c, and src/dc_parse.c.

`program.h`

This header defines the items needed to manage the data structures in lang.h as well as any helper functions needed to generate bytecode or execute it.

The code associated with this header is in src/program.c.

`rand.h`

This header defines the API for the pseudo-random number generator (PRNG).

The PRNG only generates fixed-size integers. The magic of generating random numbers of arbitrary size is actually given to the code that does math (src/num.c).

The code associated with this header is in src/rand.c.

`read.h`

This header defines the API for reading from files and stdin.

Thus, file.h is really for buffered output, while this file is for input. There is no buffering needed for bc's inputs.

The code associated with this header is in src/read.c.

`status.h`

This header has several things:

A list of possible errors that internal bc code can use.
Compiler-specific fixes.
Platform-specific fixes.
Macros for bc's error handling.

There is no code associated with this header.

`vector.h`

This header defines the API for the vectors (resizable arrays) that are used for data structures.

Vectors are what do the heavy lifting in almost all of bc's data structures. Even the maps of identifiers and arrays use vectors.

The code associated with this header is in src/vector.c.

`version.h`

This header defines the version of bc.

There is no code associated with this header.

`vm.h`

This header defines the API for setting up and running bc and dc.

It is so named because I think of it as the "virtual machine" of bc, though that is probably not true as program.h is probably the "virtual machine" API. Thus, the name is more historical accident.

The code associated with this header is in src/vm.c.

`locales/`

This folder contains a bunch of .msg files and symlinks to the real .msg files. This is how locale support is implemented in bc.

The files are in the format required by the gencat POSIX utility. They all have the same messages, in the same order, with the same numbering, under the same groups. This is because the locale system expects those messages in that order.

The symlinks exist because many locales would otherwise contain exactly the same information. To prevent duplication, they are simply linked to a master copy.

The naming format for all files is:

<language_code>_<country_code>.<encoding>.msg

This naming format must be followed for all locale files.
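
As a quick way to check a name against this format, here is a minimal sketch that splits a file name into its three parts. The function name `parse_locale_name` is hypothetical, `en_US.UTF-8.msg` is used only as an example name, and the check is purely structural: it does not verify that the codes themselves are real.

```sh
#!/bin/sh

# Hypothetical helper: split a locale file name of the form
# <language_code>_<country_code>.<encoding>.msg and print the parts as
# "language country encoding". Returns 1 if the name does not fit.
parse_locale_name() {
	case "$1" in
		?*_?*.?*.msg) ;;
		*) return 1 ;;
	esac

	base="${1%.msg}"        # e.g. en_US.UTF-8
	lang="${base%%_*}"      # e.g. en
	rest="${base#*_}"       # e.g. US.UTF-8
	country="${rest%%.*}"   # e.g. US
	encoding="${rest#*.}"   # e.g. UTF-8

	printf '%s %s %s\n' "$lang" "$country" "$encoding"
}
```

With this, `parse_locale_name en_US.UTF-8.msg` prints `en US UTF-8`, while a name without an encoding part, such as `en_US.msg`, is rejected.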

`manuals/`

This folder contains the documentation for bc, dc, and bcl, along with a few other manuals.

`algorithms.md`

This file explains the mathematical algorithms that are used.

The hope is that this file will guide people in understanding how the math code works.

`bc.1.md.in`

This file is a template for the markdown version of the bc manual and manpages.

For more information about how the manpages and markdown manuals are generated, and for why, see scripts/manpage.sh and Manuals.

`bcl.3`

This is the manpage for the bcl library. It is generated from bcl.3.md using scripts/manpage.sh.

For the reason why I check generated data into the repo, see scripts/manpage.sh and Manuals.

`bcl.3.md`

This is the markdown manual for the bcl library. It is the source for the generated bcl.3 file.

`benchmarks.md`

This is a document that compares this bc to GNU bc in various benchmarks. It was last updated when version 3.0.0 was released.

It has very little documentation value, other than showing what compiler options are useful for performance.

`build.md`

This is the build manual.

This bc has a custom build system. The reason for this is portability.

If bc used an outside build system, that build system would be an external dependency. Thus, I had to write a build system for bc that used nothing but C99 and POSIX utilities, including barebones POSIX make.

For more information about the build system, see the build system section, the build manual, configure.sh, and Makefile.in.

`dc.1.md.in`

This file is a template for the markdown version of the dc manual and manpages.

For more information about how the manpages and markdown manuals are generated, and for why, see scripts/manpage.sh and Manuals.

`development.md`

The file you are reading right now.

`header_bcl.txt`

Used by scripts/manpage.sh to give the bcl.3 manpage a proper header.

For more information about generating manuals, see scripts/manpage.sh and Manuals.

`header_bc.txt`

Used by scripts/manpage.sh to give the generated bc manpages a proper header.

For more information about generating manuals, see scripts/manpage.sh and Manuals.

`header_dc.txt`

Used by scripts/manpage.sh to give the generated dc manpages a proper header.

For more information about generating manuals, see scripts/manpage.sh and Manuals.

`header.txt`

Used by scripts/manpage.sh to give all generated manpages a license header.

For more information about generating manuals, see scripts/manpage.sh and Manuals.

`release.md`

A checklist that I try to somewhat follow when making a release.

`bc/`

A folder containing the bc manuals.

Each bc manual corresponds to a build type. See that link for more details.

For each manual, there are two copies: the markdown version generated from the template, and the manpage generated from the markdown version.

`dc/`

A folder containing the dc manuals.

Each dc manual corresponds to a build type. See that link for more details.

For each manual, there are two copies: the markdown version generated from the template, and the manpage generated from the markdown version.

`scripts/`

This folder contains helper scripts. Most of them are written in pure POSIX sh, but three (afl.py, karatsuba.py, and randmath.py) are written in Python 3, and one (ministat.c) is written in C. ministat.c in particular is copied from elsewhere.

For more information about the shell scripts, see POSIX Shell Scripts.

`afl.py`

This script is meant to be used as part of the fuzzing workflow.

It does one of two things: checks for valid crashes, or runs bc and/or dc under all of the paths found by AFL++.

See Fuzzing for more information about fuzzing, including this script.

`alloc.sh`

This script is a quick and dirty script to test whether or not the garbage collection mechanism of the BcNum caching works. It has been little-used because it tests something that is not important to correctness.

`benchmark.sh`

A script making it easy to run benchmarks and to run the executable produced by ministat.c on them.

For more information, see the Benchmarks section.

`bitfuncgen.c`

A source file for an executable to generate tests for bc's bitwise functions in gen/lib2.bc. The executable is scripts/bitfuncgen, and it is built with make bitfuncgen. It produces the test on stdout and the expected results on stderr. This means that to generate tests, use the following invocation:

scripts/bitfuncgen > tests/bc/bitfuncs.txt 2> tests/bc/bitfuncs_results.txt

It calls abort() if it runs into an error.

`exec-install.sh`

This script is the magic behind making sure dc is installed properly if it's a symlink to bc. It checks to see if it is a link, and if so, it just creates a new symlink in the install directory. Of course, it also installs bc itself, or dc when it's alone.
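The link check described above can be sketched as follows. This is an illustration under assumptions, not the contents of exec-install.sh: the function name `install_exe` is hypothetical, a plain `cp` stands in for whatever the real script uses to copy files, and the link target is recovered by parsing `ls -l` because `readlink` is not a POSIX utility.

```shell
# Hypothetical sketch: install one executable into a directory.
# $1 = path to the built executable (may be a symlink), $2 = install directory.
install_exe() {
    _src="$1"
    _destdir="$2"
    _base=$(basename "$_src")
    if [ -L "$_src" ]; then
        # A symlink: recreate the link instead of copying the file.
        # `ls -l` prints "... name -> target"; keep only the target.
        # (Assumes neither name contains the string " -> ".)
        _listing=$(ls -l "$_src")
        _target=${_listing#* -> }
        ln -fs "$_target" "$_destdir/$_base"
    else
        # A regular file: copy it.
        cp "$_src" "$_destdir/$_base"
    fi
}
```

With bin/dc being a symlink to bc, `install_exe bin/bc "$dest"` followed by `install_exe bin/dc "$dest"` would leave a real bc and a dc symlink in `$dest`.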

`functions.sh`

This file is a bunch of common functions for most of the POSIX shell scripts. It is not supposed to be run; instead, it is sourced by other POSIX shell scripts, like so:

. "$scriptdir/functions.sh"

or the equivalent, depending on where the sourcing script is.

For more information about the shell scripts, see POSIX Shell Scripts.

`fuzz_prep.sh`

Fuzzing is a regular activity when I am preparing for a release.

This script handles all the options and such for building a fuzzable binary. Instead of having to remember a bunch of options, I just put them in this script and run the script when I want to fuzz.

For more information about fuzzing, see Fuzzing.

`karatsuba.py`

This script has at least one of two major differences from most of the other scripts:

It's written in Python 3.
It's meant for software packagers.

For example, scripts/afl.py and scripts/randmath.py are both in Python 3, but they are not meant for the end user or software packagers and are not included in source distributions. But this script is.

This script breaks my rule of only POSIX utilities necessary for package maintainers, but there's a very good reason for that: it's only meant to be run once when the package is created for the first time, and maybe not even then.

You see, this script does two things: it tests the Karatsuba implementation at various settings for KARATSUBA_LEN, and it figures out what the optimal KARATSUBA_LEN is for the machine that it is running on.

Package maintainers can use this script, when creating a package for this bc, to figure out what is optimal for their users. Then they don't have to run it ever again. So this script only has to run on the packager's machine.

I tried to write the script in sh, by the way, and I finally accepted the tradeoff of using Python 3 when it became too hard.

However, I also mentioned that it's for testing Karatsuba with various settings of KARATSUBA_LEN. Package maintainers will want to run the test suite, right?

Yes, but this script is not part of the test suite; it's used for testing in the scripts/release.sh script, which is maintainer use only.

However, there is one snare with karatsuba.py: I didn't want the user to have to install any Python libraries to run it. Keep that in mind if you change it.

`link.sh`

This script is the magic behind making dc a symlink of bc when both calculators are built.

`locale_install.sh`

This script does what its name says: it installs locales.

It turns out that this is complicated.

There is a magic environment variable, $NLSPATH, that tells you how and where you are supposed to install locales.

Yes, how. And where.

But now is not the place to rant about $NLSPATH. For more information on locales and $NLSPATH, see Locales.

`locale_uninstall.sh`

This script does what its name says: it uninstalls locales.

This is far less complicated than installing locales. I basically generate a wildcard path and then list all paths that fit that wildcard. Then I delete each one of those paths. Easy.
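The "wildcard, list, delete" approach can be sketched roughly as follows. This is an illustration, not the contents of locale_uninstall.sh: the function name `remove_matching` and the example pattern are hypothetical, and how the wildcard itself is generated is left out.

```shell
# Hypothetical sketch: delete every path that matches a wildcard.
# $1 is a glob pattern such as "/tmp/nls/*/bc.cat" (an assumed example).
remove_matching() {
    _pattern="$1"
    # $_pattern is left unquoted on purpose so the shell expands the glob.
    # (This simple version assumes the pattern contains no whitespace.)
    for _path in $_pattern; do
        # An unmatched glob stays as the literal pattern; skip it.
        if [ -e "$_path" ] || [ -L "$_path" ]; then
            rm -f "$_path"
        fi
    done
}
```

Files that do not match the pattern are left alone, and a pattern that matches nothing is a no-op.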

For more information on locales, see Locales.

`manpage.sh`

This script is the one that generates markdown manuals from a template and a manpage from a markdown manual.

For more information about generating manuals, see Manuals.

`ministat.c`

This is a file copied from FreeBSD that calculates the standard statistical numbers, such as average, median, and standard deviation, based on numbers obtained from a file.

For more information, see the FreeBSD ministat(1) manpage.

This file allows bc to build the scripts/ministat executable using the command make ministat, and this executable helps programmers evaluate the results of benchmarks more accurately.

`package.sh`

This script is what helps bc maintainers cut a release. It does the following:

Creates the appropriate git tag.
Pushes the git tag.
Copies the repo to a temp directory.
Removes files that should not be included in source distributions.
Creates the tarballs.
Signs the tarballs.
Zips and signs the Windows executables if they exist.
Calculates and outputs SHA512 and SHA256 sums for all of the files, including the signatures.

This script is for bc maintainers to use when cutting a release. It is not meant for outside use. This means that some non-POSIX utilities can be used, such as git and gpg.

In addition, before using this script, it expects that the folders that Windows generated when building bc, dc, and bcl, are in the parent directory of the repo, exactly as Windows generated them. If they are not there, then it will not zip and sign, nor calculate sums of, the Windows executables.

Because this script creates a tag and pushes it, it should only be run ONCE per release.

`radamsa.sh`

A script to test bc's command-line expression parsing code, which, while simple, strives to handle as much as possible.

This script uses the test cases in radamsa.txt as input to the Radamsa fuzzer.

For more information, see the Radamsa section.

`radamsa.txt`

Initial test cases for the radamsa.sh script.

`randmath.py`

This script generates random math problems and checks that bc's and dc's output matches the GNU bc and dc. (For this reason, it is necessary to have GNU bc and dc installed before using this script.)

One snare: be sure that this script is using the GNU bc and dc, not a previously-installed version of this bc and dc.

If you want to check for memory issues or failing asserts, you can build the bc using ./scripts/fuzz_prep.sh -a, and then run it under this script. Any errors or crashes should be caught by the script and given to the user as part of the "checklist" (see below).

The basic idea behind this script is that it generates as many math problems as it can, biasing towards situations that may be likely to have bugs, and testing each math problem against GNU bc or dc.

If GNU bc or dc fails, it just continues. If this bc or dc fails, it stores that problem. If the output mismatches, it also stores the problem.

Then, when the user sends a SIGINT, the script stops testing and goes into report mode. One-by-one, it will go through the "checklist," the list of failed problems, and present each problem to the user, as well as whether this bc or dc crashed, and its output versus GNU. Then the user can decide to add them as test cases, which it does automatically to the appropriate test file.

`release_settings.txt`

A text file of settings combinations that release.sh uses to ensure that bc and dc build and work with various default settings. release.sh simply reads it line by line and uses each line for one build.
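Reading a settings file line by line, with one build per line, can be sketched like this. The function name `for_each_setting` and the skipping of empty lines are assumptions for illustration; release.sh itself may differ.

```shell
# Hypothetical sketch: call a command once per line of a settings file.
# $1 is the settings file; the remaining arguments are the command to run,
# which receives the current line as its last argument.
for_each_setting() {
    _file="$1"
    shift
    # The "|| [ -n ... ]" part handles a last line without a newline.
    while IFS= read -r _line || [ -n "$_line" ]; do
        # Skip empty lines (an assumption, not necessarily what release.sh does).
        [ -n "$_line" ] || continue
        # Give the command /dev/null as stdin so it cannot eat the file.
        "$@" "$_line" < /dev/null
    done < "$_file"
}
```

For example, `for_each_setting scripts/release_settings.txt echo "would build with:"` prints one line per settings combo without building anything.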

`release.sh`

This script is for bc maintainers only. It runs bc, dc, and bcl through a gauntlet that is mostly meant to be used in preparation for a release.

It does the following:

Builds every build type, with every setting combo in release_settings.txt with both calculators, bc alone, and dc alone.
Builds every build type, with every setting combo in release_settings.txt with both calculators, bc alone, and dc alone for 32-bit.
Repeats the two steps above for Debug, Release, Release with Debug Info, and Min Size Release builds.
Runs the test suite on every build, if desired.
Runs the test suite under ASan, UBSan, and MSan for every build type/setting combo.
Runs scripts/karatsuba.py in test mode.
Runs the test suite for both calculators, bc alone, and dc alone under valgrind and errors if there are any memory bugs or memory leaks.

`safe-install.sh`

A script copied from musl to atomically install files.

`test_settings.sh`

A quick and dirty script to help automate rebuilding while manually testing the various default settings.

This script uses test_settings.txt to generate the various settings combos.

For more information about settings, see Settings in the build manual.

`test_settings.txt`

A list of the various settings combos to be used by test_settings.sh.

`src/`

This folder is, obviously, where the actual heart and soul of bc, the source code, is.

All of the source files are in one folder; this simplifies the build system immensely.

There are separate files for bc and dc specific code (bc.c, bc_lex.c, bc_parse.c, dc.c, dc_lex.c, and dc_parse.c) where possible because it is cleaner to exclude an entire source file from a build than to have #if/#endif preprocessor guards.

That said, it was easier in many cases to use preprocessor macros where both calculators used much of the same code and data structures, so there is a liberal sprinkling of them through the code.

`args.c`

Code for processing command-line arguments.

The header for this file is include/args.h.

`bc.c`

The code for the bc main function bc_main().

The header for this file is include/bc.h.

`bc_lex.c`

The code for lexing that only bc needs.

The headers for this file are include/lex.h and include/bc.h.

`bc_parse.c`

The code for parsing that only bc needs. This code is the most complex and subtle in the entire codebase.

The headers for this file are include/parse.h and include/bc.h.

`data.c`

Due to a historical accident, caused by a desire to get my bc into toybox, all of the constant data that bc needs is in one file. This is that file.

There is no code in this file, but a lot of the const data has a heavy influence on code, including the order of data in arrays because that order has to correspond to the order of other things elsewhere in the codebase. If you change the order of something in this file, run make test, and get errors, you changed something that depends on the order that you messed up.

Almost all headers have extern references to items in this file.

`dc.c`

The code for the dc main function dc_main().

The header for this file is include/dc.h.

`dc_lex.c`

The code for lexing that only dc needs.

The headers for this file are include/lex.h and include/dc.h.

`dc_parse.c`

The code for parsing that only dc needs.

The headers for this file are include/parse.h and include/dc.h.

`file.c`

The code for bc's implementation of buffered I/O. For more information about why I implemented my own buffered I/O, see include/file.h, Error Handling, and Custom I/O, along with status.h and the notes about version 3.0.0 in the NEWS.

The header for this file is include/file.h.

`history.c`

The code for bc's implementation of command-line editing/history, which is based on a UTF-8-aware fork of linenoise.

For more information, see the Command-Line History section.

The header for this file is include/history.h.

`lang.c`

The data structures used for actual execution of bc and dc code.

While execution is done in src/program.c, this file defines functions for initializing, copying, and freeing the data structures, which is somewhat orthogonal to actual execution.

Yes, it's misnamed; that's an accident of history where the first things I put into it all seemed related to the bc language.

The header for this file is include/lang.h.

`lex.c`

The code for the common things that both programs need for lexing.

The header for this file is include/lex.h.

`library.c`

The code to implement the public API of the bcl library.

The code in this file does a lot to ensure that clients do not have to worry about internal bc details, especially error handling with setjmp() and longjmp(). That and encapsulating the handling of numbers are the bulk of what the code in this file actually does because most of the library is still implemented in src/num.c.

The headers for this file are include/bcl.h and include/library.h.

`main.c`

The entry point for both programs; this is the main() function.

This file has no headers associated with it.

`num.c`

The code for all of the arbitrary-precision numbers and math in bc.

The header for this file is include/num.h.

`opt.c`

The code for parsing command-line options.

The header for this file is include/opt.h.

`parse.c`

The code for the common items that both programs need for parsing.

The header for this file is include/parse.h.

`program.c`

The code for the actual execution engine for bc and dc code.

The header for this file is include/program.h.

`rand.c`

The code for the pseudo-random number generator (PRNG) and the special stack handling it needs.

The PRNG only generates fixed-size integers. The magic of generating random numbers of arbitrary size is actually given to the code that does math (src/num.c).

The header for this file is include/rand.h.

`read.c`

The code for reading from files and stdin.

The header for this file is include/read.h.

`vector.c`

The code for vectors, maps, and slab vectors, along with slabs.

The header for this file is include/vector.h.

`vm.c`

The code for setting up and running bc and dc.

It is so named because I think of it as the "virtual machine" of bc, though that is probably not true, as src/program.c is probably the "virtual machine" code. Thus, the name is more of a historical accident.

The header for this file is include/vm.h.

`tests/`

This directory contains the entire test suite and its infrastructure.

`all.sh`

A convenience script for the make run_all_tests target (see the Group Tests section for more information).

`all.txt`

The file with the names of the calculators. This is to make it easier for the test scripts to know where the standard and other test directories are.

`bcl.c`

The test for the bcl API. For more information, see the bcl Test section.

`error.sh`

The script to run the file-based error tests in tests/<calculator>/errors/ for each calculator. For more information, see the Error Tests section.

This is a separate script so that each error file can be run separately and in parallel.

`errors.sh`

The script to run the line-based error tests in tests/<calculator>/errors.txt for each calculator. For more information, see the Error Tests section.

`extra_required.txt`

The file with the list of tests which both calculators have that need the Extra Math build option. This exists to make it easy for test scripts to skip those tests when the Extra Math build option is disabled.

`history.py`

The file with all of the history tests. For more information, see the History Tests section.

`history.sh`

The script to integrate history.py into the build system in a portable way, and to skip it if necessary.

This script also re-runs the test three times if it fails. This is because pexpect can be flaky at times.
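A retry wrapper of that kind can be sketched as below. This is not the code in history.sh: the function name `retry` is hypothetical, and the sketch assumes the reading "one initial run, then up to three re-runs if it keeps failing."

```shell
# Hypothetical sketch: run a command, and re-run it up to $1 more times
# if it fails. Returns 0 as soon as one run succeeds, 1 otherwise.
retry() {
    _retries="$1"
    shift
    _attempt=0
    while [ "$_attempt" -le "$_retries" ]; do
        if "$@"; then
            return 0
        fi
        _attempt=$((_attempt + 1))
    done
    return 1
}
```

With this sketch, `retry 3 some_flaky_test` runs `some_flaky_test` at most four times in total.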

`other.sh`

The script to run the "other" (miscellaneous) tests for each calculator. For more information, see the Other Tests section.

`read.sh`

The script to run the read tests for each calculator. For more information, see the read() Tests section.

`script.sed`

The sed script to edit the output of GNU bc when generating script tests. For more information, see the Script Tests section.

`script.sh`

The script for running one script test. For more information, see the Script Tests section.

`scripts.sh`

The script to help the make run_all_tests (see the Group Tests section) run all of the script tests.

`stdin.sh`

The script to run the stdin tests for each calculator. For more information, see the stdin Tests section.

`test.sh`

The script to run one standard test. For more information, see the Standard Tests section.

`bc/`

The standard tests directory for bc. For more information, see the Standard Tests section.

`all.txt`

The file to tell the build system and make run_all_tests (see the Group Tests section) what standard tests to run for bc, as well as in what order.

This file just lists the test names, one per line.

`errors.txt`

The initial error test file for bc. This file has one test per line. See the Error Tests section for more information.

`posix_errors.txt`

The file of tests for POSIX compatibility for bc. This file has one test per line. For more information, see the Error Tests section.

`timeconst.sh`

The script to run the bc tests that use the Linux timeconst.bc script. For more information, see the Linux timeconst.bc Script section.

`errors/`

The directory with error tests for bc, most discovered by AFL++ (see the Fuzzing section). There is one test per file. For more information, see the Error Tests section.

`scripts/`

The script tests directory for bc. For more information, see the Script Tests section.

`all.txt`

A file to tell the build system and make run_all_tests (see the Group Tests section) what script tests to run for bc, as well as in what order.

This file just lists the test names, one per line.

`dc/`

The standard tests directory for dc. For more information, see the Standard Tests section.

`all.txt`

The file to tell the build system and make run_all_tests (see the Group Tests section) what standard tests to run for dc, as well as in what order.

This file just lists the test names, one per line.

`errors.txt`

The initial error test file for dc. This file has one test per line. See the Error Tests section for more information.

`read_errors.txt`

The file of tests for errors with the ? command (read() in bc). This file has one test per line. See the Error Tests section for more information.

`errors/`

The directory with error tests for dc, most discovered by AFL++ (see the Fuzzing section). There is one test per file. For more information, see the Error Tests section.

`scripts/`

The script tests directory for dc. For more information, see the Script Tests section.

`all.txt`

The file to tell the build system and make run_all_tests (see the Group Tests section) what script tests to run for dc, as well as in what order.

This file just lists the test names, one per line.

`fuzzing/`

The directory containing the fuzzing infrastructure. For more information, see the Fuzzing section.

`bc_afl_continue.yaml`

The tmuxp config (for use with tmux) for easily restarting a fuzz run. For more information, see the Convenience subsection of the Fuzzing section.

`bc_afl.yaml`

The tmuxp config (for use with tmux) for easily starting a fuzz run. For more information, see the Convenience subsection of the Fuzzing section.

Be aware that this will delete all previous unsaved fuzzing tests in the output directories.

`bc_inputs1/`

The fuzzing input directory for the first third of inputs for bc. For more information, see the Corpuses subsection of the Fuzzing section.

`bc_inputs2/`

The fuzzing input directory for the second third of inputs for bc. For more information, see the Corpuses subsection of the Fuzzing section.

`bc_inputs3/`

The fuzzing input directory for the third third of inputs for bc. For more information, see the Corpuses subsection of the Fuzzing section.

`dc_inputs/`

The fuzzing input directory for the inputs for dc. For more information, see the Corpuses subsection of the Fuzzing section.

`vs/`

The directory containing all of the materials needed to build bc, dc, and bcl on Windows.

`bcl.sln`

A Visual Studio solution file for bcl. This, along with bcl.vcxproj and bcl.vcxproj.filters is what makes it possible to build bcl on Windows.

`bcl.vcxproj`

A Visual Studio project file for bcl. This, along with bcl.sln and bcl.vcxproj.filters is what makes it possible to build bcl on Windows.

`bcl.vcxproj.filters`

A Visual Studio filters file for bcl. This, along with bcl.sln and bcl.vcxproj is what makes it possible to build bcl on Windows.

`bc.sln`

A Visual Studio solution file for bc. This, along with bc.vcxproj and bc.vcxproj.filters is what makes it possible to build bc on Windows.

`bc.vcxproj`

A Visual Studio project file for bc. This, along with bc.sln and bc.vcxproj.filters is what makes it possible to build bc on Windows.

`bc.vcxproj.filters`

A Visual Studio filters file for bc. This, along with bc.sln and bc.vcxproj is what makes it possible to build bc on Windows.

`tests/`

A directory of files to run tests on Windows.

`tests_bc.bat`

A file to run basic bc tests on Windows. It expects that it will be run from the directory containing it, and it also expects a bc.exe in the same directory.

`tests_dc.bat`

A file to run basic dc tests on Windows. It expects that it will be run from the directory containing it, and it also expects a bc.exe in the same directory.

Build System

The build system is described in detail in the build manual, so maintainers should start there. This section, however, describes some parts of the build system that only maintainers will care about.

Clean Targets

bc has a default make clean target that cleans up the build files. However, because bc's build system can generate many different types of files, there are other clean targets that may be useful:

make clean_gen cleans the gen/strgen executable generated from gen/strgen.c. It has no prerequisites.
make clean cleans object files, *.cat files (see the Locales section), executables, and files generated from text files in gen/, including gen/strgen if it was built. So this has a prerequisite on make clean_gen in normal use.
make clean_benchmarks cleans benchmarks, including the ministat executable. It has no prerequisites.
make clean_config cleans the generated Makefile and the manuals that configure.sh copied in preparation for install. It also depends on make clean and make clean_benchmarks, so it cleans those items too. This is the target that configure.sh uses before it does its work.
make clean_coverage cleans the generated coverage files for the test suite's code coverage capabilities. It has no prerequisites. This is useful if the code coverage tools are giving errors.
make clean_tests cleans everything. It has prerequisites on all previous clean targets, but it also cleans all of the generated tests.

When adding more generated files, you may need to add them to one of these targets and/or add a target for them especially.

Preprocessor Macros

bc and dc use a lot of preprocessor macros to ensure that each build type:

builds,
works under the test suite, and
excludes as much code as possible from all builds.

This section will explain the preprocessor style of bc and dc, as well as provide an explanation of the macros used.

Style

The style of macro use in bc is pretty straightforward: I avoid depending on macro definitions and instead, I set defaults if the macro is not defined and then test the value of the macro with a plain #if.

(Some examples of setting defaults are in include/status.h, just above the definition of the BcStatus enum.)

In other words, I use #if instead of #ifndef or #ifdef, where possible.

There are a couple of cases where I went with standard stuff instead.

Standard Macros

BC_ENABLED

: This macro expands to 1 if bc is enabled, 0 if disabled.

DC_ENABLED

: This macro expands to 1 if dc is enabled, 0 if disabled.

BUILD_TYPE

: This macro expands to the build type, which is one of: A, E, H, N, EH, EN, HN, EHN. This build type is used in the help text to direct the user to the correct markdown manual on the git.gavinhoward.com website.

EXECPREFIX

: This macro expands to the prefix on the executable name. This is used to allow bc and dc to skip the prefix when finding out which calculator is executing.

BC_NUM_KARATSUBA_LEN

: This macro expands to an integer, which is the length of numbers below which the Karatsuba multiplication algorithm switches to brute-force multiplication.

BC_ENABLE_EXTRA_MATH

: This macro expands to 1 if the Extra Math build option is enabled, 0 if disabled.

BC_ENABLE_HISTORY

: This macro expands to 1 if the History build option is enabled, 0 if disabled.

BC_ENABLE_NLS

: This macro expands to 1 if the NLS build option (for locales) is enabled, 0 if disabled.

BC_ENABLE_LIBRARY

: This macro expands to 1 if the bcl library is enabled, 0 if disabled. If this is enabled, building the calculators themselves is disabled, but both BC_ENABLED and DC_ENABLED must be non-zero.

BC_ENABLE_MEMCHECK

: This macro expands to 1 if bc has been built for use with Valgrind's Memcheck, 0 otherwise. This ensures that fatal errors still free all of their memory when exiting. bc does not do that normally because what's the point?

BC_ENABLE_AFL

: This macro expands to 1 if bc has been built for fuzzing with AFL++, 0 otherwise. See the Fuzzing section for more information.

BC_DEFAULT_BANNER

: This macro expands to the default value for displaying the bc banner.

BC_DEFAULT_SIGINT_RESET

: The macro expands to the default value for whether or not bc should reset on SIGINT or quit.

BC_DEFAULT_TTY_MODE

: The macro expands to the default value for whether or not bc should use TTY mode when it is available.

BC_DEFAULT_PROMPT

: This macro expands to the default value for whether or not bc should use a prompt when TTY mode is available.

DC_DEFAULT_SIGINT_RESET

: The macro expands to the default value for whether or not dc should reset on SIGINT or quit.

DC_DEFAULT_TTY_MODE

: The macro expands to the default value for whether or not dc should use TTY mode when it is available.

DC_DEFAULT_PROMPT

: This macro expands to the default value for whether or not dc should use a prompt when TTY mode is available.

BC_DEBUG_CODE

: If this macro expands to a non-zero integer, then bc is built with a lot of extra debugging code. This is never set by the build system and must be set by the programmer manually. This should never be set in builds given to end users. For more information, see the Debugging section.

Test Suite

While the source code may be the heart and soul of bc, the test suite is the arms and legs: it gives bc the power to do anything it needs to do.

The test suite is what allowed bc to climb to such high heights of quality. This even goes for fuzzing because fuzzing depends on the test suite for its input corpuses. (See the Fuzzing section.)

Understanding how the test suite works should be, I think, the first thing that maintainers learn after learning what bc and dc should do. This is because the test suite, properly used, gives confidence that changes have not caused bugs or regressions.

That is why I spent the time to make the test suite as easy to use and as fast as possible.

To use the test suite (assuming bc and/or dc are already built), run the following command:

make test

That's it. That's all.

It will return an error code if the test suite failed. It will also print out information about the failure.

If you want the test suite to go fast, then run the following command:

make -j<cores> test

Where <cores> is the number of cores that your computer has. Of course, this requires a make implementation that supports that option, but most do. (And I will use this convention throughout the rest of this section.)

I have even tried, as much as possible, to put longer-running tests near the beginning of the run so that the entire suite runs as fast as possible.

However, if you want to be sure which test is failing, then running a bare make test is a great way to do that.

But enough about how you have no excuse not to use the test suite as much as possible; let's talk about how it works and what you can do with it.

Standard Tests

The heavy lifting of testing the math in bc, as well as basic scripting, is done by the "standard tests" for each calculator.

These tests use the files in the tests/bc/ and tests/dc/ directories (except for tests/bc/all.txt, tests/bc/errors.txt, tests/bc/posix_errors.txt, tests/bc/timeconst.sh, tests/dc/all.txt, tests/dc/errors.txt, and tests/dc/read_errors.txt), which are called the "standard test directories."

For every test, there is the test file and the results file. The test files have names of the form <test>.txt, where <test> is the name of the test, and the results files have names of the form <test>_results.txt.

If the test file exists but the results file does not, the results for that test are generated by a GNU-compatible bc or dc. See the Generated Tests section.

The all.txt file in each standard tests directory is what tells the test suite and build system what tests there are, and the tests are either run in that order, or in the case of parallel make, that is the order that the targets are listed as prerequisites of make test.

If the test exists in the all.txt file but does not actually exist, the test and its results are generated by a GNU-compatible bc or dc. See the Generated Tests section.
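The relationship between all.txt, the test files, and the results files can be sketched with a small classifier. The function name `classify_tests` and the output labels are hypothetical; the sketch only mirrors the rules stated above and is not how the build system itself decides.

```shell
# Hypothetical sketch: for each test named in <dir>/all.txt, report whether
# the test and its results are checked in or would have to be generated.
# $1 is a standard tests directory.
classify_tests() {
    _dir="$1"
    while IFS= read -r _t || [ -n "$_t" ]; do
        [ -n "$_t" ] || continue
        if [ ! -f "$_dir/$_t.txt" ]; then
            # Listed in all.txt, but no test file exists.
            printf '%s: test and results are generated\n' "$_t"
        elif [ ! -f "$_dir/${_t}_results.txt" ]; then
            # Test file exists, results file does not.
            printf '%s: results are generated\n' "$_t"
        else
            printf '%s: checked in\n' "$_t"
        fi
    done < "$_dir/all.txt"
}
```

Running something like `classify_tests tests/bc` would then give a quick overview of which tests depend on a GNU-compatible bc being available.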

To add a non-generated standard test, do the following:

Add the test file (<test>.txt in the standard tests directory).
Add the results file (<test>_results.txt in the standard tests directory). You can skip this step if just the results file needs to be generated. See the Generated Tests section for more information.
Add the name of the test to the all.txt file in the standard tests directory, putting it in the order it should be in. If possible, I would put longer tests near the beginning because they will start running earlier with parallel make. I always keep decimal first, though, as a smoke test.
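As a rough illustration, the three steps above might be scripted like this. The helper name `add_standard_test`, the test name, and the test contents are all made up, and the helper simply appends the name to the end of all.txt, whereas in practice the name should be placed by hand where it belongs in the order.

```shell
# Hypothetical sketch: add a non-generated standard test.
# $1 = standard tests directory, $2 = test name,
# $3 = test input, $4 = expected output.
add_standard_test() {
    _dir="$1"
    _name="$2"
    # Refuse to register the same test twice.
    if grep -qxF "$_name" "$_dir/all.txt" 2>/dev/null; then
        printf 'test already listed: %s\n' "$_name" >&2
        return 1
    fi
    printf '%s\n' "$3" > "$_dir/$_name.txt"            # step 1: test file
    printf '%s\n' "$4" > "$_dir/${_name}_results.txt"  # step 2: results file
    printf '%s\n' "$_name" >> "$_dir/all.txt"          # step 3: register it
}
```

A call such as `add_standard_test tests/bc mytest '1 + 1' '2'` would create tests/bc/mytest.txt and tests/bc/mytest_results.txt and list `mytest` last in tests/bc/all.txt.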

If you need to add a generated standard test, see the Generated Tests section for how to do that.

Some standard tests need to be skipped in certain cases. That is handled by the build system. See the Integration with the Build System section for more details.

In addition to all of the above, the standard test directory is not only the directory for the standard tests of each calculator, it is also the parent directory of all other test directories for each calculator.

bc Standard Tests

The list of current (27 February 2023) standard tests for bc is below:

decimal

: Tests decimal parsing and printing.

print

: Tests printing in every base from decimal. This is near the top for performance of parallel testing.

parse

: Tests parsing in any base and outputting in decimal. This is near the top for performance of parallel testing.

lib2

: Tests the extended math library. This is near the top for performance of parallel testing.

print2

: Tests printing at the extreme values of obase.

length

: Tests the length() builtin function.

scale

: Tests the scale() builtin function.

shift

: Tests the left (<<) and right (>>) shift operators.

add

: Tests addition.

subtract

: Tests subtraction.

multiply

: Tests multiplication.

divide

: Tests division.

modulus

: Tests modulus.

power

: Tests power (exponentiation).

sqrt

: Tests the sqrt() (square root) builtin function.

trunc

: Tests the truncation ($) operator.

places

: Tests the places (@) operator.

vars

: Tests some usage of variables. This one came from AFL++ I think.

boolean

: Tests boolean operators.

comp

: Tests comparison operators.

abs

: Tests the abs() builtin function.

assignments

: Tests assignment operators, including increment/decrement operators.

functions

: Tests functions, specifically function parameters being replaced before they themselves are used. See the comment in bc_program_call() about the last condition.

scientific

: Tests scientific notation.

engineering

: Tests engineering notation.

globals

: Tests that assigning to globals affects callers.

strings

: Tests strings.

strings2

: Tests string allocation in slabs, to ensure slabs work.

letters

: Tests single-letter and double-letter numbers to ensure they behave differently. Single-letter numbers are always set to the same value, regardless of ibase.

exponent

: Tests the e() function in the math library.

log

: Tests the l() function in the math library.

pi

: Tests that bc produces the right value of pi for numbers with varying scale values.

arctangent

: Tests the a() function in the math library.

sine

: Tests the s() function in the math library.

cosine

: Tests the c() function in the math library.

bessel

: Tests the j() function in the math library.

fib

: Tests the fib() Fibonacci function in the extended math library.

arrays

: Tests arrays.

misc

: Miscellaneous tests. I named it this because at the time, I struggled to classify them, but it's really testing multi-line numbers.

misc1