Time

There are two types of time in a simulator like TEMU: simulated real-time (SRT) and wall-clock time (WCT).

SRT

Simulated Real-Time is the simulated time. Simulated time can be frozen by stopping / pausing the simulator. It only advances when the simulator is running.

WCT

Wall-Clock Time is the physical time which progress independent of whether the simulator is paused or not. It thus represents the time as would be shown on a wall-clock.

TEMU SRT is maintained in steps. A step is the execution of one instruction.

This is combined with a conversion parameter and its inverse: Cycles per Instruction (CPI) and Instructions per Cycle (IPC).

This differs form TEMU 2 and 3, where simulated time was maintained in cycles. The changes of time has been introduced for several reasons, primarily to better support super-scalar and multi-core processors. TEMU 2 and 3 modelled individual instruction timing, but this is no longer done in TEMU 4.

Simulated time is provided by a time source, and all models have a designated primary time source.

Time sources may have parent time sources for posting synchronized events.

Time sources are special models that currently cannot be developed by the end user of TEMU, these currently include the following models:

  • Processors

  • Clocks

  • Machine

  • Scheduler

Time Base

TEMU utilizes several notions of time that can be converted between each other.

Diagram

Steps

A step is the execution of an individual instruction. Internally, the processor keeps track of time in steps.

In addition to advancing a step when running an instruction, when the processor is idle, TEMU advances the step counter as much as is needed to trigger the next timed event.

This differs from TEMU 2 and 3, where the step counter in idle mode was advanced by one when advancing the cycle counter to the next event. The change may be noticeable when stepping a processor in idle mode.

It is important to understand, that the TEMU event queues are driven using steps. Hence, the order of execution for events posted e.g. using different cycle or nanosecond offsets, is not defined if they convert to the same step offset.

Cycles

While steps is a notion of how many steps a processor has been executed, cycles is a direct measurement of time. Cycles is related to normal time (nanoseconds, seconds etc) using the time source’s clock frequency.

The relation between steps and cycles is expressed using the cycles per instruction (CPI) and instructions per cycle (IPC) properties.

The CPI and IPC is currently expressed internally as 8:8 fixed point numbers. Though the exposure to the user is via double-properties. By default, TEMU 4 use a 1 cycle per instruction model.

This means that it is possible to adjust the cycles per instructions to any value in the range of [0, 256).

Nanoseconds

Nanoseconds is an integer value in TEMU. That means that sub-nanosecond time intervals cannot be expressed in TEMU nanoseconds.

Sub nanosecond times must be expressed in seconds or cycles (with a higher clock frequency).

Seconds

Seconds in TEMU is a double precision floating point number. It is related to the cycle time using the clock frequency.

Temporal Decoupling

Processors in TEMU run temporally decoupled. That is, each processor maintain and advance its own time individually. In order to avoid running too much ahead of other processors, time is synchronized at regular intervals. This interval is known as the time quantum.

Time Quantum

The Scheduler and Machine are configured using a time quantum. When setting the quanta sizes, one need to consider emulator performance, simulated software behavior and other factors.

In general, a large quantum means that less overhead for managing time is spent. But on the other hand, large quanta may have an effect on the delivery of inter-processor interrupts. This in turn may impact the time of certain software operations.

The Machine model is configured using a quantum in nanoseconds. The Scheduler model is configured using a quantum in steps.

There is no general "best quantum", it has to be experimentally determined. A good rule of thumb and strategy is to start with a quantum of 10000 steps and adjust it up and down for the highest performance using a binary search.