Fault Injection

When testing software, it is often necessary to exercise its error handling. Different devices support different types of error injection.

Typically, fault injection needs to be done using a TEMU plugin. In principle, the

This section discusses some approaches to, and pitfalls of, fault injection in TEMU.

Ensuring Determinism

When implementing a custom fault injection module, pay attention to:

  • Ensuring determinism if using random number generation. It must be possible to snapshot the random number generator state (seeds, etc.).

  • Implementing a fault recorder in the injector, so that any external fault injections can be replayed.
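
The two points above can be sketched as follows. This is a standalone Python model, not TEMU plugin code: the FaultInjector class and all of its method names are hypothetical and only illustrate keeping the generator state and the recorded external faults together in the snapshot.

```python
import random


class FaultInjector:
    """Hypothetical injector: a private seeded RNG plus a log of external faults."""

    def __init__(self, seed):
        self.rng = random.Random(seed)  # private generator, never the global one
        self.log = []                   # (time, address, bit) of external injections

    def random_fault(self, time, mem_size):
        # Internally generated fault: fully determined by the RNG state.
        return (time, self.rng.randrange(mem_size), self.rng.randrange(8))

    def external_fault(self, time, address, bit):
        # Externally requested fault: record it so that a later run can replay it.
        self.log.append((time, address, bit))
        return (time, address, bit)

    def snapshot(self):
        # Both the RNG state and the fault log belong in the saved state.
        return {"rng": self.rng.getstate(), "log": list(self.log)}

    def restore(self, snap):
        self.rng.setstate(snap["rng"])
        self.log = list(snap["log"])


injector = FaultInjector(seed=1234)
injector.random_fault(10, 4096)
injector.external_fault(15, 0x40, 3)
saved = injector.snapshot()

first = [injector.random_fault(t, 4096) for t in (20, 30, 40)]
injector.restore(saved)
second = [injector.random_fault(t, 4096) for t in (20, 30, 40)]
# After restoring, the same sequence of random faults is produced again.
```

Using a private generator instance rather than a shared global one keeps other code from disturbing the sequence between snapshot and restore.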

Device Faults

Some devices report errors using registers and interrupts. It is possible to modify the register contents and then raise interrupts, using either the command line or the API.

It is often necessary to ensure that multiple registers and interrupts are consistent with each other. This can be challenging in itself.

Modifying Registers

Injecting faults by manipulating registers is possible, either manually using TEMU commands or using the API.

The main mechanism is to issue a set operation on the register's property. Do not execute a write operation, since a write may trigger the register's semantic side effects.
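
The difference between a set and a write can be illustrated with a small standalone model. The StatusRegister class below is hypothetical and assumes write-one-to-clear semantics as an example of a register side effect; it does not use the TEMU property API.

```python
class StatusRegister:
    """Hypothetical status register with write-one-to-clear (W1C) semantics."""

    def __init__(self):
        self.value = 0

    def write(self, data):
        # Bus write: every bit written as 1 is cleared (the side effect).
        self.value &= ~data & 0xFFFFFFFF

    def set(self, data):
        # Property set: the raw value is stored without side effects.
        self.value = data & 0xFFFFFFFF


PARITY_ERROR = 1 << 3  # hypothetical error bit

reg = StatusRegister()
reg.write(PARITY_ERROR)   # clears the bit instead of setting it
after_write = reg.value

reg.set(PARITY_ERROR)     # the error bit is now visible to the software
after_set = reg.value
```

In this model the write leaves the register at zero, so only the set operation actually injects the error.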

Raising Interrupts

Interrupts can be raised using the interrupt controller’s commands. Another way is to connect a custom fault injection device to the interrupt controller’s interrupt interface.
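
A custom fault injection device of this kind can be sketched as below. The model is standalone Python with hypothetical names (IrqController, raise_interrupt, Device, FaultInjectionDevice); a real TEMU injector would call the interrupt controller's actual interrupt interface instead. The sketch updates the status register before raising the interrupt so that the two stay consistent.

```python
class IrqController:
    """Hypothetical interrupt controller with a pending bit mask."""

    def __init__(self):
        self.pending = 0

    def raise_interrupt(self, irq):
        self.pending |= 1 << irq


class Device:
    """Hypothetical device that reports errors in a status register."""

    def __init__(self):
        self.status = 0


class FaultInjectionDevice:
    """Hypothetical injector connected to the controller's interrupt interface."""

    def __init__(self, irq_ctrl, device, irq):
        self.irq_ctrl = irq_ctrl
        self.device = device
        self.irq = irq

    def inject(self, error_bits):
        # Update the register state first, then raise the interrupt, so the
        # handler finds a status that explains why it was called.
        self.device.status |= error_bits
        self.irq_ctrl.raise_interrupt(self.irq)


ctrl = IrqController()
uart = Device()
injector = FaultInjectionDevice(ctrl, uart, irq=5)
injector.inject(0x4)
```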

Memory Faults

Modifying Memory

Memory contents can be modified (corrupted) by simply writing to the memory.

Note that these writes do not inject ECC errors per se; they simply modify the content.

Correctable Memory Errors

A correctable memory error is set in the memory space subsystem, using the upset memory attribute.

Some memory controller models support the handling of these attributes natively.

If the memory controller lacks that support, it is possible to attach a custom handler to the memory space’s upsetHandlers interface reference.
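
The idea of a custom upset handler can be sketched with a standalone model. Everything here is an assumption made for illustration: MemorySpace, its upset set and upset_handler attribute, and the EdacModel class are hypothetical and do not reflect the actual signature of the upsetHandlers interface.

```python
class MemorySpace:
    """Hypothetical memory space tracking an 'upset' attribute per address."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.upset = set()          # addresses marked with the upset attribute
        self.upset_handler = None   # stands in for the upset handler reference

    def read(self, addr):
        if addr in self.upset and self.upset_handler is not None:
            self.upset_handler(self, addr)
        # A correctable error still delivers correct data to the reader.
        return self.data[addr]


class EdacModel:
    """Hypothetical handler that counts corrected errors, like an EDAC unit."""

    def __init__(self):
        self.corrected_count = 0
        self.last_address = None

    def __call__(self, space, addr):
        self.corrected_count += 1
        self.last_address = addr


space = MemorySpace(256)
edac = EdacModel()
space.upset_handler = edac

space.data[0x10] = 0x5A
space.upset.add(0x10)       # inject a correctable error at 0x10

value = space.read(0x10)    # data is intact, the handler is notified
other = space.read(0x11)    # no upset here, the handler is not called
```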

Uncorrectable Memory Errors

Uncorrectable memory errors can be injected using the faulty memory attribute. They are uncorrectable in the ECC sense only; that is, they are cleared by writing to the location.

It is possible to add permanently uncorrectable errors by connecting a custom memory access interface to the memory space’s faultyHandlers.

Some memory controller models support the handling of these attributes natively. In some cases, the handling of uncorrectable errors has to be done manually. This is done in the same way as forcing stuck bits, by connecting a custom interface to faultyHandlers.
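
The difference between a faulty location that is cleared by a write and a permanently faulty one can be sketched as follows. The model is standalone Python; MemorySpace, MemoryFault and the faulty and permanent sets are hypothetical stand-ins for the faulty attribute and for a custom handler, not the TEMU interfaces.

```python
class MemoryFault(Exception):
    """Hypothetical signal for an uncorrectable error on read."""


class MemorySpace:
    def __init__(self, size):
        self.data = bytearray(size)
        self.faulty = set()      # models the faulty attribute (cleared by writes)
        self.permanent = set()   # models faults kept alive by a custom handler

    def read(self, addr):
        if addr in self.faulty or addr in self.permanent:
            raise MemoryFault(addr)
        return self.data[addr]

    def write(self, addr, value):
        self.data[addr] = value & 0xFF
        # Rewriting the location regenerates the check bits, so an ECC-level
        # fault disappears. Permanent faults are not affected.
        self.faulty.discard(addr)


def read_fails(space, addr):
    try:
        space.read(addr)
    except MemoryFault:
        return True
    return False


space = MemorySpace(64)
space.faulty.add(8)
space.permanent.add(9)

before = (read_fails(space, 8), read_fails(space, 9))
space.write(8, 0x11)
space.write(9, 0x22)
after = (read_fails(space, 8), read_fails(space, 9))
```

After the writes, only the permanently faulty location still fails on read.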

Persistent Memory Errors

Stuck bits can be injected using the preTransaction and postTransaction properties in the memory space.

Another way is to add an intermediate device between the failed device (RAM or ROM) and the memory space. This approach is more efficient than using the preTransaction/postTransaction handlers, because it does not add overhead to accesses to other devices.
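
An intermediate device that forces stuck bits can be sketched as below. This is a standalone Python model: Ram and StuckBitFilter are hypothetical classes with a simple read/write interface, and stand in for a device implementing the memory access interface in front of the real RAM or ROM model.

```python
class Ram:
    """Hypothetical byte-addressed RAM model."""

    def __init__(self, size):
        self.data = bytearray(size)

    def read(self, addr):
        return self.data[addr]

    def write(self, addr, value):
        self.data[addr] = value & 0xFF


class StuckBitFilter:
    """Hypothetical device placed between the memory space and the RAM."""

    def __init__(self, target):
        self.target = target
        self.stuck = {}  # addr -> (mask of stuck bits, their forced values)

    def stick(self, addr, bit, level):
        mask, value = self.stuck.get(addr, (0, 0))
        mask |= 1 << bit
        if level:
            value |= 1 << bit
        else:
            value &= ~(1 << bit)
        self.stuck[addr] = (mask, value)

    def read(self, addr):
        raw = self.target.read(addr)
        mask, value = self.stuck.get(addr, (0, 0))
        # Stuck bits override whatever the underlying memory holds.
        return (raw & ~mask & 0xFF) | (value & mask)

    def write(self, addr, value):
        # Writes pass through; the stuck bits are applied again on every read.
        self.target.write(addr, value)


ram = Ram(16)
faulty_ram = StuckBitFilter(ram)
faulty_ram.stick(4, bit=0, level=0)   # bit 0 of address 4 stuck at 0
faulty_ram.stick(5, bit=7, level=1)   # bit 7 of address 5 stuck at 1

faulty_ram.write(4, 0xFF)
faulty_ram.write(5, 0x00)
faulty_ram.write(6, 0xAB)
```

Only accesses routed through the filter pay for the extra lookup, which mirrors the efficiency argument above.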

Diagram

For more complex use cases, nested memory spaces can be used.

Network Interception

Network traffic is a special topic. In principle, bus models do not model hardware-level integrity checks. For example, parity bits are not modelled in the UART or 1553 models. Only software-visible effects are modelled. Several bus models therefore provide a way of signalling bus errors, such as parity or CRC failures, using error bits in the frame and packet structures.

The way these features are implemented is specific to the bus model.

Network traffic (including buses such as Ethernet, 1553, SpaceWire and serial) can be intercepted by tapping the traffic.

Point to Point Buses

For point to point buses such as serial and SpaceWire, a tap device must be implemented.

Diagram

This device will normally pass through interface invocations in both directions. To drop a bus message, simply do not forward it to the destination. To inject extra messages, one can schedule events that send pseudo-random data.
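
A tap device along these lines can be sketched as below. The model is standalone Python with hypothetical names (Endpoint, SerialTap, from_a, from_b, inject_to_b); a real tap would implement the bus-specific TEMU interfaces on both sides and would inject from a scheduled event rather than from a direct call.

```python
import random


class Endpoint:
    """Hypothetical bus endpoint that records what it receives."""

    def __init__(self):
        self.received = []

    def receive(self, data):
        self.received.append(data)


class SerialTap:
    """Hypothetical tap between endpoints a and b on a point-to-point link."""

    def __init__(self, a, b, drop=lambda direction, data: False):
        self.a = a
        self.b = b
        self.drop = drop  # predicate deciding which messages are lost

    def from_a(self, data):
        if not self.drop("a->b", data):
            self.b.receive(data)   # normal pass-through

    def from_b(self, data):
        if not self.drop("b->a", data):
            self.a.receive(data)

    def inject_to_b(self, data):
        # Extra message that endpoint a never sent.
        self.b.receive(data)


a, b = Endpoint(), Endpoint()
# Drop every message from a to b whose first byte is 0xFF.
tap = SerialTap(a, b, drop=lambda d, data: d == "a->b" and data[0] == 0xFF)

tap.from_a(b"\x01hello")
tap.from_a(b"\xffdropped")
tap.from_b(b"\x02reply")

rng = random.Random(42)  # seeded, so the injected noise is reproducible
noise = bytes(rng.randrange(256) for _ in range(4))
tap.inject_to_b(noise)
```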

Multi-Point Buses

For multi-point buses such as CAN, Ethernet and 1553, error injection on the frame or packet level can be done using the TEMU notification mechanism.

Diagram

The notification will typically receive a pointer to the transmitted packet. The send notifications are emitted before the frame is delivered, thus allowing the modification of the traffic in-flight.

Some bus interfaces allow symbolic injection of certain bus-specific errors (e.g. CRC errors or parity errors) by setting an error bit in the packet structure.
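
Modifying a frame from a send notification can be sketched as below. This is a standalone Python model: Bus, Frame, Node, send_listeners and the CRC_ERROR flag are hypothetical and only mimic the behaviour described above, where listeners see the frame before it is delivered. The real notification names and frame layout depend on the bus model.

```python
CRC_ERROR = 1 << 0  # hypothetical error bit in the frame flags


class Frame:
    def __init__(self, payload):
        self.payload = bytearray(payload)
        self.flags = 0


class Node:
    """Hypothetical bus device that records received frames."""

    def __init__(self):
        self.received = []

    def receive(self, frame):
        self.received.append((bytes(frame.payload), frame.flags))


class Bus:
    """Hypothetical multi-point bus that notifies listeners before delivery."""

    def __init__(self):
        self.devices = []
        self.send_listeners = []

    def send(self, sender, frame):
        for listener in self.send_listeners:
            listener(frame)            # listeners get the frame itself, not a copy
        for device in self.devices:
            if device is not sender:
                device.receive(frame)


def corrupt_second_frame():
    count = 0

    def on_send(frame):
        nonlocal count
        count += 1
        if count == 2:
            frame.payload[0] ^= 0x01   # flip one bit in flight
            frame.flags |= CRC_ERROR   # symbolic CRC error for the receivers

    return on_send


bus = Bus()
n1, n2 = Node(), Node()
bus.devices = [n1, n2]
bus.send_listeners.append(corrupt_second_frame())

bus.send(n1, Frame(b"\x10\x20"))
bus.send(n1, Frame(b"\x10\x20"))
```

Counting frames rather than drawing random numbers keeps the injection deterministic without any extra generator state.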

In order to drop packets on a multi-point bus, a tap device should be placed between the bus model and the receiving or sending device.