Releases: open-power/skiboot
skiboot 5.4.0 Release Candidate 1
skiboot-5.4.0-rc1
skiboot-5.4.0-rc1 was released on Monday October 17th 2016. It is the first release candidate of skiboot 5.4, which will become the new stable release of skiboot following the 5.3 release, first released August 2nd 2016.
skiboot-5.4.0-rc1 contains all bug fixes as of :ref:`skiboot-5.3.7` and :ref:`skiboot-5.1.18` (the currently maintained stable releases).
For how the skiboot stable releases work, see :ref:`stable-rules`.
The current plan is to release a new release candidate every week until we feel good about it. The aim is for skiboot-5.4.x to be in op-build v1.13, which is due by November 23rd 2016.
Over skiboot-5.3, we have the following changes:
New Features
Initial Trusted Boot support (see :ref:`stb-overview`). There are several limitations with this initial release:
CAPP partition is not measured correctly
Only Nuvoton TPM 2.0 is supported
Requires hardware rework on late revision Habanero or Firestone boards in order to install TPM.
Add i2c Nuvoton TPM 2.0 Driver
romcode driver for POWER8 secure ROM
See Device tree docs for tpm and ibm,secureboot nodes
See main secure and trusted boot documentation.
Fast reboot for P8
This makes reboot take an awful lot less time, somewhere between four and ten times faster than a full IPL. It is currently experimental and not enabled by default. You can enable the experimental support via nvram option:
# nvram -p ibm,skiboot --update-config experimental-fast-reset=feeling-lucky
WARNING: This has known bugs. For example, if you have used a device in CAPI mode, we will currently NOT reset it back to plain PCI. There are also some known issues in most simulators.
Support ibm,skiboot NVRAM partition with skiboot configuration options.
These should generally only be used if you either completely know what you are doing or need to work around a skiboot bug. They are not intended for end users.
Add support for supplying the kernel boot arguments from the bootargs configuration string in the ibm,skiboot NVRAM partition.
Enabling the experimental fast reset feature is done via this method.
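As a rough illustration of how options such as experimental-fast-reset or bootargs could be looked up, here is a minimal sketch. It assumes the partition body is a run of NUL-terminated key=value strings; the helper name skiboot_config_lookup and the sample values are hypothetical and are not taken from skiboot itself.

```python
def skiboot_config_lookup(partition_data: bytes, key: str):
    """Return the value stored for `key`, or None if it is not set.

    Hypothetical helper: assumes NUL-terminated "key=value" strings,
    with an empty string marking the end of the options.
    """
    for entry in partition_data.split(b"\0"):
        if not entry:
            break  # empty string: no more options
        name, sep, value = entry.decode("ascii").partition("=")
        if sep and name == key:
            return value
    return None


# Example partition contents (made up for this sketch).
data = b"experimental-fast-reset=feeling-lucky\0bootargs=console=hvc0 quiet\0\0"
print(skiboot_config_lookup(data, "experimental-fast-reset"))
print(skiboot_config_lookup(data, "bootargs"))
```

Only the first "=" separates key from value, so a bootargs string that itself contains "=" is returned intact.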
Add support for nap mode on P8 while in skiboot
While nap has been exposed to the Operating System since day 1, we have not utilized low power states when in skiboot itself, leading to higher power consumption during boot. We only enable the functionality after the 0x100 vector has been patched, and we disable it before transferring control to Linux.
libflash: add 128MB MX66L1G45G part
Pointer validation of OPAL API call arguments.
If the kernel called an OPAL API with a vmalloc'd address or any other address range in real mode, we would hit a problem with aliasing. Since the top 4 bits are ignored in real mode, pointers from 0xc.. and 0xd.. (and other ranges) could collide and lead to hard-to-solve bugs. This patch adds the infrastructure for pointer validation and a simple test case for testing the API.
The checks validate pointers sent in using opal_addr_valid()
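The aliasing problem can be shown with a few lines of arithmetic. This sketch only demonstrates why two distinct 64-bit pointers collide once the top 4 bits are dropped; it does not reproduce what opal_addr_valid() checks, and the function names and sample addresses are illustrative.

```python
# Real mode ignores the top 4 bits of a 64-bit address.
REAL_MODE_MASK = (1 << 60) - 1


def real_mode_address(addr: int) -> int:
    """Address as seen in real mode (top 4 bits dropped)."""
    return addr & REAL_MODE_MASK


def aliases_in_real_mode(a: int, b: int) -> bool:
    """True if two different pointers refer to the same real-mode address."""
    return a != b and real_mode_address(a) == real_mode_address(b)


linear_ptr = 0xC000000000001000   # example pointer in the 0xc.. range
vmalloc_ptr = 0xD000000000001000  # example pointer in the 0xd.. range
print(hex(real_mode_address(linear_ptr)))
print(aliases_in_real_mode(linear_ptr, vmalloc_ptr))
```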
Documentation
There have been a number of documentation fixes this release. Most prominent is the switch to Sphinx (from the Python project) and reStructuredText (RST) as the documentation format. RST and Sphinx enable production of pretty documentation in HTML and PDF formats while remaining readable in raw form to those with no knowledge of RST.
You can build an HTML site by doing the following:
cd doc/
make html
As always, documentation patches are very, very welcome as we attempt to document the OPAL API, the device tree bindings and important parts of OPAL internals.
We would like the Device Tree documentation to follow the style that can be included in the Device Tree Specification.
General
Make console-log time more readable: seconds rather than timebase
Log format is now [SECONDS.(tb%512000000),LEVEL]
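A small sketch of the conversion from a raw timebase value to the new prefix. The 512000000 divisor comes from the format above; zero-padding the remainder to nine digits is an assumption made here for readability, since the exact field widths are not given.

```python
TB_HZ = 512_000_000  # timebase ticks per second, per the log format above


def log_prefix(tb: int, level: int) -> str:
    """Build a [SECONDS.(tb % TB_HZ),LEVEL] prefix from a timebase value."""
    seconds, remainder = divmod(tb, TB_HZ)
    # Assumption: pad the remainder to 9 digits so entries line up.
    return f"[{seconds}.{remainder:09d},{level}]"


# An old-style stamp such as [5399576870,5] would be rendered as:
print(log_prefix(5399576870, 5))
```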
Flash (PNOR) code improvements
flash: Make size 64 bit safe
This makes the size of flash 64 bit safe so that we can have flash devices greater than 4GB. This is especially useful for mambo disks passed through to Linux.
core/flash.c: load actual partition size
We are downloading 0x20000 bytes from PNOR for CAPP, but currently the CAPP lid is only 40K.
flash: Rework error paths and messages for multiple flash controllers
Now that we have mambo bogusdisk flash, we can have many flash chips. This is resulting in some confusing output messages.
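To illustrate the 64 bit size change, the sketch below shows what happens to a flash size above 4GB when it is squeezed into a 32-bit field. The helper names and the 6GB example size are hypothetical; the masking simply mimics fixed-width unsigned integers.

```python
def as_u32(value: int) -> int:
    """Value as stored in a 32-bit unsigned field (high bits lost)."""
    return value & 0xFFFF_FFFF


def as_u64(value: int) -> int:
    """Value as stored in a 64-bit unsigned field."""
    return value & 0xFFFF_FFFF_FFFF_FFFF


flash_size = 6 * 1024**3  # hypothetical 6GB mambo disk
print(as_u32(flash_size) // 1024**3, "GB seen through a 32-bit size")
print(as_u64(flash_size) // 1024**3, "GB seen through a 64-bit size")
```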
core/init: Fix "failure of getting node in the free list" warning on boot.
slw: improve error message for SLW timer stuck
Centaur / XSCOM error handling
print message on disabling xscoms to centaur due to many errors
Mark centaur offline after 10 consecutive access errors
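The "offline after 10 consecutive access errors" behaviour can be sketched as a counter that resets on every successful access. The class and method names are hypothetical, and whether the real code trips on the 10th error or the one after it is an assumption here.

```python
MAX_CONSECUTIVE_ERRORS = 10  # threshold from the note above


class CentaurState:
    """Hypothetical per-chip bookkeeping for xscom access errors."""

    def __init__(self):
        self.error_count = 0
        self.online = True

    def record_access(self, ok: bool) -> None:
        if not self.online:
            return  # already disabled, nothing more to track
        if ok:
            self.error_count = 0  # a success breaks the run of errors
            return
        self.error_count += 1
        if self.error_count >= MAX_CONSECUTIVE_ERRORS:
            self.online = False
            print("centaur: disabling xscoms after too many errors")


chip = CentaurState()
for _ in range(10):
    chip.record_access(ok=False)
print(chip.online)
```

Resetting the counter on success means only an unbroken run of failures takes the chip offline, which matches the word "consecutive" in the note.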
XSCOM improvements
xscom: Map all HMER status codes to OPAL errors
xscom: Initialize the data to a known value in xscom_read
In case of error, don't leave the data random. It helps debugging when the user fails to check the error code. This happens due to a bug in the PRD wrapper app.
chip: Add a quirk for when core direct control XSCOMs are missing
p8-i2c: Don't crash if a centaur errored out
cpu: Make endian switch message more informative
cpu: Display number of started CPUs during boot
core/init: ensure that HRMOR is zero at boot
asm: Fix backtrace for unexpected exception
cpu: Remove pollers calling heuristics from cpu_wait_job
This will be handled by time_wait_ms(). Also remove a useless smt_medium(). Note that this introduces a difference in behaviour: time_wait will only call the pollers on the boot CPU while cpu_wait_job() could call them on any. However, I can't think of a case where this is a problem.
cpu: Remove global job queue
Instead, target a specific CPU for a global job at queuing time. This will allow us to wake up the target using an interrupt when implementing nap mode. The algorithm used is to look for idle primary threads first, then idle secondaries, and finally the least loaded thread. If nothing can be found, we fall back to a synchronous call.
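The target selection order described for the job queue change can be sketched as follows. The Thread type, its fields, and the idea that "idle" means "no queued jobs" are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Thread:
    pir: int            # hypothetical thread identifier
    is_primary: bool
    queued_jobs: int = 0


def pick_target(threads):
    """Pick a thread for a global job, or None to run it synchronously."""
    idle = [t for t in threads if t.queued_jobs == 0]
    # 1. idle primary threads, 2. idle secondary threads
    for want_primary in (True, False):
        for t in idle:
            if t.is_primary == want_primary:
                return t
    # 3. otherwise the least loaded thread, if there is one at all
    if threads:
        return min(threads, key=lambda t: t.queued_jobs)
    return None  # nothing found: caller falls back to a synchronous call


threads = [Thread(0, True, 2), Thread(1, False, 0), Thread(8, True, 0)]
print(pick_target(threads).pir)
```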
lpc: Log LPC SYNC errors as unrecoverable ones for manufacturing
lpc: Optimize SerIRQ dispatch based on which PSI IRQ fired
interrupts: Add new source ->attributes() callback
This allows a given source to provide per-interrupt attributes such as whether it targets OPAL or Linux and its estimated frequency.
The former allows us to get rid of the double set of ops used to decide which interrupts go where on some modules like the PHBs, and the latter will eventually be used to implement smart caching of the source lookups.
opal/hmi: Fix a TOD HMI failure during a race condition.
platform: Add BT to Generic platform
NVRAM
Support ibm,skiboot partition for skiboot specific configuration options
flash: Size NVRAM based on ECC for OpenPOWER platforms
If NVRAM has ECC (as per the ffs header), then the actual size of the partition is less than that reported by the ffs header in the PNOR.
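A sketch of the size adjustment, assuming a layout in which every 8 data bytes are followed by 1 ECC byte. That ratio is not stated in these notes and is an assumption of this example, as are the function name and the sample partition size.

```python
BYTES_PER_ECC = 8  # assumption: one ECC byte protects 8 data bytes


def size_minus_ecc(raw_size: int) -> int:
    """Usable bytes in a partition whose raw size includes ECC bytes."""
    return raw_size * BYTES_PER_ECC // (BYTES_PER_ECC + 1)


raw = 0x90000  # example size as reported by the ffs header
print(hex(size_minus_ecc(raw)))
```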
NVLink/NPU
Fix reserved PE#
NPU bdfn allocation bugfix
Fix bad PE number check
NPUs have 4 PEs which are zero indexed, so {0, 1, 2, 3}. A bad PE number check in npu_err_inject checks if the PE number is greater than 4 as a fail case, so it would wrongly perform operations on a non-existent PE 4.
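The off-by-one is easy to see side by side. The two predicate names are made up for this sketch; NPU_NUM_OF_PES is the constant these notes mention for the number of PEs.

```python
NPU_NUM_OF_PES = 4  # valid PE numbers are 0, 1, 2, 3


def pe_is_valid_buggy(pe: int) -> bool:
    """Old check: only rejects PE numbers greater than 4."""
    return not (pe > NPU_NUM_OF_PES)


def pe_is_valid_fixed(pe: int) -> bool:
    """Corrected check: rejects anything outside 0..3."""
    return 0 <= pe < NPU_NUM_OF_PES


for pe in range(6):
    print(pe, pe_is_valid_buggy(pe), pe_is_valid_fixed(pe))
```

Only PE 4 differs between the two columns, which is exactly the non-existent PE the old check let through.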
Use PCI virtual device
assert the NPU irq min is aligned.
program NPU BUID reg properly
npu: reword "error" to indicate it's actually a warning
Incorrect FWTS annotation. Without this patch, you get spurious FirmWare Test Suite (FWTS) warnings about NVLink not working on machines that aren't fully populated with GPUs.
external: NPU hardware procedure script
Performing NPU hardware procedures requires some config space magic. Put all that magic into a script, so you can just specify the target device and the procedure number.
PCI
Generic fixes
Claim surprise hotplug capability
Reserve PCI buses for RC's slot
Update PCI topology after power change
Return slot cached power state
Cache power state on slot without power control
Avoid hot resets at boot time
Fix initial PCIe slot power state
Print CRS retry times
It's useful to know the CRS retry times before the PCI device is detected successfully. In the PCI hot add case, it usually indicates the time consumed for the adapter's firmware to be partially ready (responsive PCI config space).
core/pci: Fix the power-off timeout in pci_slot_power_off()
The timeout should be 1000ms instead of 1000 ticks while powering off a PCI slot in pci_slot_power_off(). Otherwise, it's likely to hit a timeout powering off the PCI slot, as the skiboot log below reveals:
[5399576870,5] PHB#0005:02:11.0 Timeout powering off slot
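To see how far apart "1000 ticks" and "1000ms" are, the sketch below converts between the two, assuming the 512000000 ticks-per-second timebase implied by the console-log format earlier in these notes. The helper names are illustrative.

```python
TB_HZ = 512_000_000  # assumed timebase frequency in ticks per second


def msecs_to_tb(ms: int) -> int:
    """Milliseconds expressed in timebase ticks."""
    return ms * TB_HZ // 1000


def tb_to_usecs(ticks: int) -> float:
    """Timebase ticks expressed in microseconds."""
    return ticks * 1_000_000 / TB_HZ


print(msecs_to_tb(1000), "ticks in the intended 1000ms timeout")
print(tb_to_usecs(1000), "microseconds in the buggy 1000-tick timeout")
```

Under this assumption the buggy timeout was roughly two microseconds, far too short for a slot to power off.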
PHB3
Override root slot's prepare_link_change() with PHB's
Disable surprise link down event on PCI slots
Disable ECRC on Broadcom adapter behind PMC switch
astbmc platforms
Support dynamic PCI slot. We might insert a PCIe switch into a PHB direct slot, and the downstream ports of the PCIe switch support PCI hotplug.
CAPI
hw/phb3: Update capi initialization sequence
The capi initialization sequence was revised in a circumvention document when a 'link down' error was converted from fatal to Endpoint Recoverable. Other, non-capi, register setup was corrected even before the initial open-source release of skiboot, but a few capi-related registers were not updated then, so this patch fixes it.
IPMI
core/ipmi: Set interrupt-parent property
This allows ipmi-opal to properly use the OPAL irqchip rather than falling back to the event interface in Linux.
Mambo Simulator
Helpers for POWER9 Mambo.
mambo: Advertise available RADIX page sizes
mambo: Add section for kernel command line boot args
Users can set kernel command line boot arguments for Mambo in a tcl script.
mambo: add exception and qtrace helpers
external/mambo: Update skiboot.tcl to add p...
skiboot 5.3.7
skiboot-5.3.7
skiboot-5.3.7 was released on Wednesday October 12th, 2016.
This is the 8th stable release of skiboot 5.3, the new stable release of skiboot (first released with 5.3.0 on August 2nd, 2016).
Skiboot 5.3.7 replaces skiboot-5.3.6 as the current stable version. It contains a few bugfixes, including an important PCI bug fix that could cause some adapters to not be detected.
Over skiboot-5.3.6, the following fixes are included:
PCI:
pci: Avoid hot resets at boot time
In the PCI post-fundamental reset code, a hot reset is performed at the end. This is causing issues at boot time as a reset signal is being sent downstream before the links are up, which is causing issues on adapters behind switches. No errors result in skiboot, but the adapters are not usable in Linux as a result.
This patch fixes some adapters not being configurable in Linux on some systems. The issue was not present in skiboot 5.2.x.
core/pci: Fix the power-off timeout in pci_slot_power_off()
The timeout should be 1000ms instead of 1000 ticks while powering off a PCI slot in pci_slot_power_off(). Otherwise, it's likely to hit a timeout powering off the PCI slot, as the skiboot logs below reveal:
[47912590456,5] SkiBoot skiboot-5.3.6 starting...
(snip)
[5399532365,7] PHB#0005:02:11.0 Bus 0f..ff scanning...
[5399540804,7] PHB#0005:02:11.0 No card in slot
[5399576870,5] PHB#0005:02:11.0 Timeout powering off slot
[5401431782,3] FIRENZE-PCI: Wrong state 00000000 on slot 8000000002880005
PRD:
occ/prd/opal-prd: Queue OCC_RESET event message to host in OpenPOWER
During an OCC reset cycle the system is forced to Psafe pstate. When OCC becomes active, the system has to be restored to its last pstate as requested by host. So host needs to be notified of the OCC_RESET event, or else the system will continue to remain in Psafe state until host requests a new pstate after the OCC reset cycle.
opal-prd: Fix error code from scom_read & scom_write
Currently, we always return a zero value from scom_read & scom_write, so the HBRT implementation has no way of detecting errors during scom operations. This change uses the actual return value from the scom operation from the kernel instead.
opal-prd: Add get_interface_capabilities to host interfaces
We need a way to indicate behaviour changes & fixes in the prd interface, without requiring a major version bump.
This change introduces the get_interface_capabilities callback, returning a bitmask of capability flags, pertaining to 'sets' of capabilities. We currently return 0 for all.
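How a caller might consume such a bitmask can be sketched as below. The callback name comes from the note above; the set number, the flag constant, and the has_capability helper are hypothetical, and the callback body simply mirrors the "return 0 for all" behaviour described.

```python
HYPOTHETICAL_SET_0 = 0             # made-up capability set number
HYPOTHETICAL_CAP_EXAMPLE = 1 << 0  # made-up capability flag


def get_interface_capabilities(capability_set: int) -> int:
    """Bitmask of capability flags for the given set (currently none)."""
    return 0


def has_capability(capability_set: int, flag: int) -> bool:
    """True if the interface advertises `flag` within `capability_set`."""
    return bool(get_interface_capabilities(capability_set) & flag)


print(has_capability(HYPOTHETICAL_SET_0, HYPOTHETICAL_CAP_EXAMPLE))
```

Testing a flag with a bitwise AND lets new capabilities be added later without a major version bump, which is the motivation the note gives.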
IBM FSP Platforms:
platforms/firenze: Fix clock frequency dt property
platforms/firenze: HDAT: Fix typo in nest-frequency property
NVLink:
hw/npu.c: Fix reserved PE#
Currently the reserved PE is set to NPU_NUM_OF_PES, which is one greater than the maximum PE resulting in the following kernel errors at boot:
[ 0.000000] pnv_ioda_reserve_pe: Invalid PE 4 on PHB#4
[ 0.000000] pnv_ioda_reserve_pe: Invalid PE 4 on PHB#5
Due to a HW erratum, PE#0 is already reserved in the kernel, so update the opal-reserved-pe device-tree property to match this.
skiboot 5.3.6
skiboot-5.3.6
skiboot-5.3.6 was released on Saturday September 17th, 2016.
This is the 7th stable release of skiboot 5.3, the new stable release of skiboot (first released with 5.3.0 on August 2nd, 2016).
Skiboot 5.3.6 replaces skiboot-5.3.5 as the current stable version. It contains one minor bug fix.
Over skiboot-5.3.5, the following fixes are included:
SLW: Actually print the register dump only to memory
A fix in 5.3.5 was only partially correct; we still had the log priority incorrect for dumping of the SLW registers.
skiboot-5.3.5
skiboot-5.3.5 was released on Wednesday September 14th, 2016.
This is the 6th stable release of skiboot 5.3, the new stable release of
skiboot (first released with 5.3.0 on August 2nd, 2016).
Skiboot 5.3.5 replaces skiboot-5.3.4 as the current stable version. It contains
a couple of minor bug fixes: simply clarifying two error messages.
Over skiboot-5.3.4, the following fixes are included:
- centaur: print message on disabling xscoms to centaur due to many errors
- slw: improve error message for SLW timer stuck
We still dump the registers, but only to the in-memory console buffer by default.
skiboot-5.3.4
skiboot-5.3.4 was released on Tuesday September 13th, 2016.
This is the 5th stable release of skiboot 5.3, the new stable release of
skiboot (first released with 5.3.0 on August 2nd, 2016).
Skiboot 5.3.4 replaces skiboot-5.3.3 as the current stable version. It contains
a couple of bug fixes, specifically around failing XSCOMs.
Over skiboot-5.3.3, the following fixes are included:
- xscom: Initialize the data to a known value in xscom_read
In case of error, don't leave the data random. It helps debugging when
the user fails to check the error code. This happens due to a bug in the
PRD wrapper app.
- xscom: Map all HMER status codes to OPAL errors
- centaur: Mark centaur offline after 10 consecutive access errors
This avoids spamming the logs when the centaur is dead and PRD
constantly tries to access it
- nvlink: Fix bad PE number check in error inject code path (<= rather than <)
skiboot-5.3.3
skiboot-5.3.3 was released on Friday September 2nd, 2016.
This is the 4th stable release of skiboot 5.3, the new stable release of
skiboot (first released with 5.3.0 on August 2nd, 2016).
Skiboot 5.3.3 replaces skiboot-5.3.2 as the current stable version. It contains
two bug fixes for machines utilizing the NPU (i.e. Garrison).
Over skiboot-5.3.2, the following fixes are included:
- hw/npu: assert the NPU irq min is aligned.
- hw/npu: program NPU BUID reg properly
The NPU BUID register was incorrectly programmed resulting in npu
interrupt level 0 causing a PB_CENT_CRESP_ADDR_ERROR checkstop,
and irqs from npus in odd chips being aliased to and processed
as the interrupts from the corresponding npu on the even chips.
skiboot-5.3.2
skiboot-5.3.2 was released on Friday August 26th, 2016.
This is the 3rd stable release of skiboot 5.3, the new stable release of
skiboot (first released with 5.3.0 on August 2nd, 2016).
Skiboot 5.3.2 replaces skiboot-5.3.1 as the current stable version. It contains
a few minor bug fixes.
Over skiboot-5.3.1, the following fixes are included:
- opal/hmi: Fix a TOD HMI failure during a race condition.
Rare race condition which meant we wouldn't recover from TOD error
- lpc: Log LPC SYNC errors as unrecoverable ones for manufacturing
Only affects systems in manufacturing mode.
No behaviour change when not in manufacturing mode.
- hw/phb3: Update capi initialization sequence
The capi initialization sequence was revised in a circumvention
document when a 'link down' error was converted from fatal to Endpoint
Recoverable. Other, non-capi, register setup was corrected even before
the initial open-source release of skiboot, but a few capi-related
registers were not updated then, so this patch fixes it.
The point is that a link-down error detected by the UTL logic will
lead to an AIB fence, so that the CAPP unit can detect the error.
skiboot-5.1.18
skiboot-5.1.18 was released on Friday 26th August 2016.
skiboot-5.1.18 is the 19th stable release of 5.1; it follows skiboot-5.1.17
(which was released July 21st, 2016).
This release contains a few minor bug fixes.
Changes are:
All platforms:
- opal/hmi: Fix a TOD HMI failure during a race condition.
Rare race condition which meant we wouldn't recover from TOD error
- hw/phb3: Update capi initialization sequence
The capi initialization sequence was revised in a circumvention
document when a 'link down' error was converted from fatal to Endpoint
Recoverable. Other, non-capi, register setup was corrected even before
the initial open-source release of skiboot, but a few capi-related
registers were not updated then, so this patch fixes it.
The point is that a link-down error detected by the UTL logic will
lead to an AIB fence, so that the CAPP unit can detect the error.
FSP platforms:
- FSP/ELOG: Fix OPAL generated elog resend logic
- FSP/ELOG: Fix possible event notifier hangs
- FSP/ELOG: Disable event notification if list is not consistent
- FSP/ELOG: Fix OPAL generated elog event notification
- FSP/ELOG: Disable event notification during kexec
skiboot-5.3.1
skiboot-5.3.1 was released on Wednesday August 10th, 2016.
This is the 2nd stable release of skiboot 5.3, the new stable release of
skiboot (first released with 5.3.0 on August 2nd, 2016).
Skiboot 5.3.1 replaces skiboot-5.3.0 as the current stable version. It contains
a few minor bug fixes.
This release follows the Skiboot stable rules, see doc/stable-skiboot-rules.txt.
Over skiboot-5.3.0, the following fixes are included:
FSP systems:
- FSP/ELOG: elog_enable flag should be false by default
This is a corner case related to a recent change that went upstream.
It is only observed at the petitboot prompt, where we see only one
error log instead of all error logs in
/sys/firmware/opal/elog.
NVLink systems (i.e. Garrison):
- npu: reword "error" to indicate it's actually a warning
Without this patch, you get spurious FirmWare Test Suite (FWTS) warnings
about NVLink not working on machines that aren't fully populated with
GPUs.
- hmi: Clean up NPU FIR debug messages
With the skiboot log set to debug, the FIR (and related registers) were
logged all in the same message. It was too much for one line, didn't
clarify if the numbers were in hex, and didn't show leading zeroes.
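The kind of formatting this cleanup aims for can be sketched as one register per line, with an explicit 0x prefix and leading zeroes kept. The register names, the sample values, and the 64-bit width are assumptions for illustration and are not the actual skiboot messages.

```python
def format_register(name: str, value: int) -> str:
    """One register per line: explicit hex prefix, 16 zero-padded digits."""
    return f"{name}: 0x{value:016x}"


# Hypothetical register names and values.
for name, value in (("FIR", 0x1A), ("FIR mask", 0xFFFF000000000000)):
    print(format_register(name, value))
```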
General:
- asm: Fix backtrace for unexpected exception
- correct the log level from PR_ERROR down to PR_INFO for some skiboot
log messages.
skiboot-5.3.0
skiboot-5.3.0 was released on Tuesday August 2nd, 2016.
skiboot-5.3.0 is the first stable release of skiboot 5.3, the new stable
release of skiboot, which will take over from the 5.2.x series which was
first released Wednesday March 16th, 2016.
skiboot-5.3.0 contains all bug fixes as of skiboot-5.1.17 and skiboot-5.2.5.
Changes over skiboot-5.3.0-rc2:
- Adopt libtool rules for soname versioning for libflash
See skiboot-5.3.0-rc2 and skiboot-5.3.0-rc1 release notes for a complete
list of changes from skiboot-5.2.0.