Writing a custom branch script

Jared Ketterer edited this page Sep 27, 2022 · 12 revisions

How bsp_tool loads .bsps

If you're using bsp_tool.load_bsp("reallycool.bsp") to load a .bsp, a few things happen behind the scenes to figure out the format
Since bsp_tool supports a range of .bsp variants, a single script to handle the rough format wasn't going to cut it
To narrow down the exact format of a bsp file, load_bsp looks at some key information in each file:

Developer variants

First, load_bsp tries to determine the developer behind the chosen .bsp
If the file extension is .d3dbsp, it's a Call of Duty 2 or 4 D3DBsp
Other bsps use the .bsp extension (Call of Duty 1 included)
The developer is identified from the "file-magic": the first four bytes of any .bsp, which will be one of:

  • b"IBSP" for IdTechBsp Id Software
  • b"IBSP" for InfinityWardBsp Infinity Ward
  • b"RBSP" for RavenBsp Raven Software
  • b"rBSP" for RespawnBsp Respawn Entertainment
  • b"VBSP" for ValveBsp Valve Software
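As a rough illustration of the lookup described above, here is a minimal sketch that reads the file-magic and lists the developers it could belong to. The DEVELOPERS_BY_MAGIC table and guess_developers function are made up for this example and are not part of bsp_tool; note that b"IBSP" alone cannot separate Id Software from Infinity Ward.

```python
import io

# Assumed lookup table, built from the list above (not bsp_tool's own code)
# b"IBSP" is shared by two developers, so it maps to two candidates
DEVELOPERS_BY_MAGIC = {
    b"IBSP": ["Id Software", "Infinity Ward"],
    b"RBSP": ["Raven Software"],
    b"rBSP": ["Respawn Entertainment"],
    b"VBSP": ["Valve Software"],
}


def guess_developers(stream):
    """Read the first four bytes of a .bsp & list the developers they could belong to."""
    file_magic = stream.read(4)
    return DEVELOPERS_BY_MAGIC.get(file_magic, [])


# fake files held in memory; only the first four bytes matter here
print(guess_developers(io.BytesIO(b"VBSP" + bytes(4))))
print(guess_developers(io.BytesIO(b"IBSP" + bytes(4))))
```

With a real file you would pass open("reallycool.bsp", "rb") instead of the io.BytesIO stand-ins.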

Most of the major differences between each developer's format are the number of lumps & bsp header
They also use some lumps which are unique to each developer's Quake based engine
More on those differences in an upcoming wiki page...

Game variants

Once load_bsp knows the developer, it has to work out which game a .bsp comes from

In the .bsp header there will always be a 4 byte int for the .bsp format version
Unfortunately this isn't totally unique from game to game, most Source Engine titles use version 20
This is where load_bsp's second (optional!) argument comes in, branch

branch can be either a string or a python script

>>> import bsp_tool

>>> bsp_tool.load_bsp("tests/maps/pl_upward.bsp", branch=bsp_tool.branches.valve.orange_box)
Loading pl_upward.bsp (VBSP version 20)...
Loaded  pl_upward.bsp
<ValveBsp pl_upward.bsp at 0x000001FB329F7640>

>>> bsp_tool.load_bsp("tests/maps/test2.bsp")
Loading test2.bsp (VBSP version 20)...
Loaded  test2.bsp
<ValveBsp test2.bsp at 0x000001FB329F7940>

In the above example bsp_tool.branches.valve.orange_box points to bsp_tool/branches/valve/orange_box.py
This branch script is used to initialise the Bsp subclass chosen when load_bsp works out the developer

When branch is a string, load_bsp uses branch as a key in the bsp_tool.branches.by_name dictionary to choose a script
Bsp classes take the branch script as their first argument and do not have defaults (except ValveBsp)

When branch is "unknown" (default) the bsp format version is used as a key in the bsp_tool.branches.by_version dictionary
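The three cases above (a script, a name, or "unknown") could be sketched roughly like this. Everything here is a stand-in: the by_name and by_version contents, the resolve_branch function and the SimpleNamespace "scripts" are assumptions for illustration, and the real dictionaries in bsp_tool.branches may be keyed differently.

```python
import types

# stand-ins for branch scripts; real branch scripts are python modules
orange_box = types.SimpleNamespace(BSP_VERSION=20)
quake3 = types.SimpleNamespace(BSP_VERSION=46)

# hypothetical contents, only meant to mirror the idea of the two dictionaries
by_name = {"orange_box": orange_box, "quake3": quake3}
by_version = {20: orange_box, 46: quake3}


def resolve_branch(branch, bsp_version):
    """Pick a branch script from a script, a name, or the bsp format version."""
    if isinstance(branch, str):
        if branch == "unknown":  # default: fall back on the format version
            return by_version[bsp_version]
        return by_name[branch]
    return branch  # already a branch script


print(resolve_branch("unknown", 20) is orange_box)
print(resolve_branch("quake3", 20) is quake3)
```

An explicit branch always wins in this sketch, which is why passing one is the safest option when several games share a version number.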

Branch scripts

Now that we know a branch script is needed to load a specific .bsp variant, why might we need to make one?
Well, bsp_tool doesn't cover a lot of formats, and those it does aren't mapped completely either!

But with branch scripts you can develop a rough map of a particular format while copying definitions from other scripts
nexon/vindictus.py for example, imports valve.orange_box and copies most of the format
This saves a lot of code! Especially since they only differ on the format of a handful of lumps and share a .bsp version
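To give a feel for that kind of reuse, here is a small sketch of one script borrowing another's definitions and replacing a single lump class. The parent namespace, the Edge and WideEdge classes and their formats are invented for the example; they are not copied from orange_box.py or vindictus.py.

```python
import types


class Area:  # shared by both formats
    _format = "2i"


class Edge:  # parent format: two 16-bit vertex indices
    _format = "2H"


# stand-in for an existing branch script (a real one would be an imported module)
parent = types.SimpleNamespace(
    BSP_VERSION=20,
    LUMP_CLASSES={"AREAS": Area, "EDGES": Edge})


# --- the new branch script would start here ---
BSP_VERSION = parent.BSP_VERSION


class WideEdge:  # hypothetical replacement: two 32-bit vertex indices
    _format = "2I"


LUMP_CLASSES = parent.LUMP_CLASSES.copy()  # copy, so the parent is left untouched
LUMP_CLASSES["EDGES"] = WideEdge

print({name: cls.__name__ for name, cls in LUMP_CLASSES.items()})
```

Copying the dictionary before editing it matters: changing the parent's dictionary in place would also change how the parent format loads.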

Overall structure

The branch scripts that come with bsp_tool have a common format to make reading them as consistent as possible

import enum
from .. import base
from .. import shared  # special lumps

BSP_VERSION = 20

class LUMP(enum.Enum):
    ENTITIES = 0
    AREAS = 20
    PAKFILE = 40


class LumpHeader(base.MappedArray):
    _mapping = ["offset", "length", "version", "fourCC"]
    _format = "4I"


# classes for each lump, in alphabetical order:
class Area(base.Struct):  # LUMP 20
    num_area_portals: int   # number of AreaPortals after first_area_portal in this Area
    first_area_portal: int  # index of first AreaPortal
    __slots__ = ["num_area_portals", "first_area_portal"]
    _format = "2i"

LUMP_CLASSES = {"AREAS": Area}

SPECIAL_LUMP_CLASSES = {"ENTITIES": shared.Entities,
                        "PAKFILE": shared.PakFile}

If you compare this with bsp_tool/branches/valve/orange_box.py you'll see I've left a lot out here, but this basic branch script is a great start for translating any .bsp variant

At the top we have the bsp format version, mostly as a note
Next comes the LUMP enum, which lists each lump in the order it appears in the bsp header

If you don't list a lump it won't be imported
Listing UNKNOWN_23 & UNUSED_63 in LUMP will save their contents as RAW_LUMPNAME (unless they're empty)

Attached to this we have lump_header_address, which connects each LUMP entry to the offset in the .bsp where its header begins
Then come the lump classes, which translate most lumps into python objects (more on them later)
We also have some special lump classes; these are loaded in a different way to other lumps, and some are shared across almost all bsp variants

The Bsp class reads the headers for each lump and holds the contents in Bsp.HEADERS
This dictionary of headers is keyed by the names given in the branch script's LUMP class
From there, a lump is saved either as Bsp.RAW_LUMPNAME (bytes) or, if the lump is listed in LUMP_CLASSES, as Bsp.LUMPNAME (List[LumpClass])
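The loading flow described above can be approximated with the sketch below, which builds a tiny fake file in memory and reads it back. It is not bsp_tool's loader: the load_lumps function, the two-entry LUMP enum, and the assumption that lump headers follow directly after a 4 byte magic and 4 byte version are all simplifications made for the example.

```python
import enum
import io
import struct


class LUMP(enum.Enum):
    ENTITIES = 0
    AREAS = 1


class Area:
    _format = "2i"

    def __init__(self, values):
        self.num_area_portals, self.first_area_portal = values


LUMP_CLASSES = {"AREAS": Area}
HEADER_FORMAT = "4I"  # offset, length, version, fourCC
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


def load_lumps(stream):
    """Return (headers, lumps); lumps holds RAW_NAME bytes or NAME lists."""
    headers, lumps = dict(), dict()
    for lump in LUMP:
        # assumption: lump headers start right after the magic & version (8 bytes)
        stream.seek(8 + lump.value * HEADER_SIZE)
        header = struct.unpack(HEADER_FORMAT, stream.read(HEADER_SIZE))
        headers[lump.name] = header
        offset, length = header[:2]
        if length == 0:
            continue  # empty lumps are skipped
        stream.seek(offset)
        raw_lump = stream.read(length)
        if lump.name in LUMP_CLASSES:
            LumpClass = LUMP_CLASSES[lump.name]
            tuples = struct.iter_unpack(LumpClass._format, raw_lump)
            lumps[lump.name] = [LumpClass(t) for t in tuples]
        else:
            lumps[f"RAW_{lump.name}"] = raw_lump
    return headers, lumps


# build the fake file: magic, version, 2 lump headers, then the lump data
entities = b'{"classname" "worldspawn"}\0'
areas = struct.pack("2i", 3, 0) + struct.pack("2i", 1, 3)
data_start = 8 + len(LUMP) * HEADER_SIZE
fake_bsp = b"VBSP" + struct.pack("I", 20)
fake_bsp += struct.pack(HEADER_FORMAT, data_start, len(entities), 0, 0)
fake_bsp += struct.pack(HEADER_FORMAT, data_start + len(entities), len(areas), 0, 0)
fake_bsp += entities + areas

headers, lumps = load_lumps(io.BytesIO(fake_bsp))
print(sorted(lumps))
```

ENTITIES has no entry in LUMP_CLASSES here, so it is kept as raw bytes, while AREAS becomes a list of Area objects.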

Lump classes

A majority of lumps are very simple, being a list of fixed length structs
bsp_tool loads these lumps with python's built in struct module
struct.iter_unpack takes a format specifier string and a stream of bytes
This stream of bytes must contain a whole number of these structures or an error will be raised
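A short demonstration of struct.iter_unpack from the standard library, using made-up lump data in the "2i" format of the Area class from the example script:

```python
import struct

# two structs of two ints each, packed back to back
raw_lump = struct.pack("2i", 3, 0) + struct.pack("2i", 1, 3)

for values in struct.iter_unpack("2i", raw_lump):
    print(values)  # one tuple per struct

try:
    # one byte short: no longer a whole number of structs
    list(struct.iter_unpack("2i", raw_lump[:-1]))
except struct.error as error:
    print("struct.error:", error)
```

The error case is worth remembering when mapping an unknown lump: a size that doesn't divide evenly is a strong hint that the guessed format is wrong.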

The lump class in the example is a subclass of bsp_tool.branches.base.Struct
base.Struct exists to make defining a lump's specific format quick using very little code

The definition usually has 3 parts:

class LumpClass(base.Struct):
    __slots__ = ["origin", "unknown", "flags"]
    _format = "3f3i"
    _arrays = {"origin": [*"xyz"], "unknown": 2}

__slots__ names the main attributes of the LumpClass
_format holds the format string for struct.iter_unpack
(I recommend also giving type hints for each attribute, so others don't have to work them out from _format)
_arrays is optional; it holds a dictionary for generating a base.MappedArray to help group attributes
For the most complex use of arrays (so far), see: branches.id_software.quake3.Face

So the above example would turn the C struct:

struct LumpClass {
    struct { float x, y, z; } origin;
    int  unknown[2];
    int  flags;
};

into:

LumpClass.origin.x
LumpClass.origin.y
LumpClass.origin.z
LumpClass.unknown[0]
LumpClass.unknown[1]
LumpClass.flags
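To show the grouping idea without depending on bsp_tool, here is a simplified stand-in for what base.Struct does with __slots__ and _arrays. The group_values function is hypothetical and far less capable than the real class (no nesting, no re-saving); it only illustrates how a flat tuple can be split into named attributes.

```python
import struct
import types


def group_values(values, slots, arrays):
    """Group a flat tuple of values into named attributes."""
    out = types.SimpleNamespace()
    i = 0
    for name in slots:
        mapping = arrays.get(name)
        if mapping is None:  # single value
            setattr(out, name, values[i])
            i += 1
        elif isinstance(mapping, int):  # plain array of N values
            setattr(out, name, list(values[i:i + mapping]))
            i += mapping
        else:  # list of child attribute names
            children = dict(zip(mapping, values[i:i + len(mapping)]))
            setattr(out, name, types.SimpleNamespace(**children))
            i += len(mapping)
    return out


slots = ["origin", "unknown", "flags"]
arrays = {"origin": [*"xyz"], "unknown": 2}
raw = struct.pack("3f3i", 1.0, 2.0, 3.0, 7, 8, 9)
entry = group_values(struct.unpack("3f3i", raw), slots, arrays)
print(entry.origin.z, entry.unknown, entry.flags)  # 3.0 [7, 8] 9
```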

Lump classes don't have to be subclasses of base.Struct though, the only requirement is the _format attribute
This is essential because each lump class is initialised with the tuple struct.iter_unpack returns for each struct
And to read these raw bytes Bsp.load_lumps uses something similar to struct.iter_unpack(LumpClass._format, RAW_LUMP)
If each tuple returned has a length of 1: Bsp.LUMP = list(map(LumpClass, [t[0] for t in tuples]))
Else: Bsp.LUMP = list(map(LumpClass, tuples))
To support re-saving LumpClasses, a .flat() method is required, which must return a tuple near identical to the one it was made from (same types)
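Below is a sketch of two lump classes that do not subclass base.Struct, along with simplified load & save helpers that follow the rules above. Plane, VertexIndex, load and save are all invented for this example; bsp_tool's own loading code will differ in the details.

```python
import struct


class Plane:  # hypothetical lump class, not taken from any branch script
    _format = "4f"

    def __init__(self, values):
        self.normal = tuple(values[:3])
        self.distance = values[3]

    def flat(self):
        return (*self.normal, self.distance)


class VertexIndex(int):  # one value per struct, so it is initialised with t[0]
    _format = "H"

    def flat(self):
        return (int(self),)


def load(LumpClass, raw_lump):
    tuples = list(struct.iter_unpack(LumpClass._format, raw_lump))
    if len(tuples) > 0 and len(tuples[0]) == 1:
        return [LumpClass(t[0]) for t in tuples]
    return [LumpClass(t) for t in tuples]


def save(lump):
    # .flat() must give back values of the same types, in the same order
    return b"".join(struct.pack(entry._format, *entry.flat()) for entry in lump)


raw_planes = struct.pack("4f", 0.0, 0.0, 1.0, 64.0)
raw_indices = struct.pack("3H", 0, 1, 2)
print(save(load(Plane, raw_planes)) == raw_planes)
print(save(load(VertexIndex, raw_indices)) == raw_indices)
```

Checking that a lump survives a load & save round trip unchanged is a cheap way to test a new lump class.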

Special lump classes

Not all lumps are as simple as a list of structs, and this is where special lump classes come in
Special lump classes are initialised from the raw bytes of a lump, turning them into python objects that are easier to work with
All that's really required is an __init__ method and an .as_bytes() method for re-saving

Here's branches.shared.TexDataStringData as an example of how basic a special lump class can be:

class TexDataStringData(list):
    def __init__(self, raw_texdata_string_data):
        super().__init__([t.decode("ascii", errors="ignore") for t in raw_texdata_string_data.split(b"\0")])

    def as_bytes(self):
        return b"\0".join([t.encode("ascii") for t in self]) + b"\0"

By inheriting list you can use all the features of python lists while still importing the data with __init__ & saving it back with .as_bytes()
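A quick usage sketch, with the class repeated so the example runs on its own. The texture names are made up.

```python
class TexDataStringData(list):  # same definition as above
    def __init__(self, raw_texdata_string_data):
        super().__init__([t.decode("ascii", errors="ignore")
                          for t in raw_texdata_string_data.split(b"\0")])

    def as_bytes(self):
        return b"\0".join([t.encode("ascii") for t in self]) + b"\0"


names = TexDataStringData(b"BRICK/BRICKWALL001\0TOOLS/TOOLSNODRAW")
names.append("CONCRETE/CONCRETEFLOOR001")  # list methods work as usual
print(names.index("TOOLS/TOOLSNODRAW"))
print(names.as_bytes())
```

The input here has no trailing terminator; with the class exactly as written, raw bytes ending in b"\0" would produce an extra empty string at the end of the list.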
You can of course make more complex classes, for example by adding methods (though they won't be connected to their parent Bsp)
Speaking of methods...

Methods

While not listed in the example branch script, you can add methods to a Bsp with a branch script!
The only requirement is that you have a list of functions named methods somewhere in the script

def areaportals_of_area(bsp, area_index):
    area = bsp.AREAS[area_index]
    # attribute names match the Area lump class: first_area_portal & num_area_portals
    return bsp.AREA_PORTALS[area.first_area_portal:area.first_area_portal + area.num_area_portals]


methods = [areaportals_of_area]

These methods are attached when the Bsp is initialised
The only requirement for these functions is that the first argument be bsp, since as a method the Bsp will pass itself to the function
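One way such attaching could work is sketched below with types.MethodType from the standard library. The Bsp class, lump_names function and branch_script namespace are minimal stand-ins written for this example, so treat it as an illustration of the idea rather than how bsp_tool actually does it.

```python
import types


class Bsp:  # minimal stand-in for a bsp_tool Bsp class
    def __init__(self, branch):
        self.branch = branch
        for function in getattr(branch, "methods", list()):
            # bind the function to this instance, so bsp is passed automatically
            setattr(self, function.__name__, types.MethodType(function, self))


def lump_names(bsp):
    """List the upper case attributes of a bsp (hypothetical helper)."""
    return sorted(name for name in vars(bsp) if name.isupper())


# stand-in for a branch script with a methods list
branch_script = types.SimpleNamespace(methods=[lump_names])

bsp = Bsp(branch_script)
bsp.AREAS = []
bsp.ENTITIES = []
print(bsp.lump_names())  # no argument needed, bsp is passed for us
```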