Analysing Execution History

As part of hang detection, the execution history of a running program is analyzed. A critical part is loop detection. Once loops are detected, program execution can be summarized into Run Blocks. These Run Summaries can be further analyzed and summarized into Meta-Run Summaries.

Loop detection

Programs can contain one or more loops. Inside a loop, the same set of instructions is repeatedly executed. During search, whenever a 2LBB program finishes execution of a program block, a loop detection algorithm is performed.

When the program is not yet executing a (detected) loop, the detection algorithm checks if it executed the same sequence of program blocks twice. This can be checked for in lineair time by using the Z-Function. If a repetition is found, the state is updated to reflect that the program is looping.

Example: Assume the program just finished executed the following sequence of program blocks ABACDECDE. At that moment, it detects it is executing a loop, consisting of three program blocks: C, D and E.

When the program is executing a loop, the detection algorithm checks after each program block if it is still inside the loop. The moment the next program block deviates from the expected one, a loop exit is detected.

Example: Assume the program just finished executed the following sequence of program blocks ABCABCAD. Here execution of Program Block D marks the end of the loop.

Run Summary

To facilitate hang detection, the execution of a program summarized into a Run Summary. A Run Summary consists of a sequence of Run Blocks, each consisting of one or more Program Blocks. There are two types of Run Blocks:

  • Loops represent the repeated execution of a sequence of Program Blocks. They are characterized by this repeated sequence and the number of Program Blocks that were executed in this instance.
  • Transitions are sequences that are not part of loop, nor contain any loops. They are characterized by the sequence of Program Block that were executed.

Example: Assume the following sequence of Program Blocks was executed: ABCBABCBCBABADBABABAA. This is then summarized into four Run Blocks:

SequenceType
ABCBATransition
[BC]BCBLoop
ABADTransition
[BA]BABALoop
A?
The trailing program block A is pending classification. It could be the start of either a Transition or a Loop.

Short-loop detection

For recognizing certain more complex hang behaviors it is useful to also detect short loops. These are loops that executed less than two full iterations (and are not detected by Z-function analysis). Instead, potential short-loops are detected by considering the run block that precedes it. When that run block was previously followed by a loop, it is checked if the program blocks that follow it match this loop. If so, these program blocks are considered as a short-running instance of this loop.

Example: Assume the following sequence of Program Blocks was executed: ABCDDDDEFGGGABCDEFGGG. The run summary with and without short-loop detection becomes respectively:

  • ABC [D]DDD EF [G]GG ABCDEF [G]GG
  • ABC [D]DDD EF [G]GG ABC [D] EF [G]GG
With short-loop detection the isolated Program Block D is recognized as a single-iteration loop execution. Doing so better reflects the overall structure (and results in more accurate meta-run summaries).

Meta-run summaries

More advanced programs can contain nested loops. These can be detected using the same loop detection algorithm, but with different input. Instead of using Program Blocks as input, the meta-loop detection algorithm takes Run Blocks as input and groups these into Meta-Run Blocks.

Example: Assume the following sequence of Program Blocks was executed: ABCBACACBCBACACACBCBACACACAC. This is then summarized into the following Run Blocks:

SequenceTypeRun Block
ABCBTransitionP
[AC]ACLoopQ*2.0
BCBTransitionR
[AC]ACACLoopQ*3.0
BCBTransitionR
[AC]ACACACLoopQ*4.0
Here each Run Block is given an identifier. Run Blocks that are the same are assigned the same identifier. For loops to be considered the same their repeating sequence must match. The number of loop iterations may differ. It is indicated by two numbers separated by a period: m.n. Here m is the number of full iterations and n is the number of program blocks executed in the last loop iteration when the loop aborted midway, or zero otherwise.

Ignoring the difference in loop iterations, this results in the following sequence of Run Blocks: PQRQRQ, which can be summarized into two Meta-Run Blocks:

SequenceTypeMeta-Run Block
PTransitionX
[QR]QRQLoopY*2.1
So this program is not trapped in a loop at the lowest level. It executes Loop Q repeatedly, but always exits this loop. At a meta-level, however, it may be trapped in Loop Y. This program may hang.

Identifying larger loops

The algorithm as described eagerly identifies and collapses loops. As soon as a loop is detected, it is replaced by a run block that represents this loop. This is not always desired for meta-run summaries, as there can be a larger, over-arching loop that you would like to detect instead. For example, consider the following run summary:


A*2.1 #B C*3.0 D A*3.1 #B C*5.0 D A*0.2 E
A*3.1 #B C*4.0 D A*4.1 #B C*6.0 D A*0.2 E
A*4.1 #B C*5.0 D A*5.1 #B C*7.0 D A*0.2 E

The repetition of the ABCD run blocks is detected as a loop. This then results in the following meta-run summary:


P*2.1 Q
P*2.1 Q
P*2.1 Q

with: P = ABCD
      Q = E

However, there's a larger loop. It results in the following meta-run summary:


R*3.0

with: R = ABCDABCDE

This summary is preferred, as it shows that at this level the program appears to be stuck in a single loop. Therefore, when creating meta-run summaries, whenever a loop is detected, the algorithm inspects the meta-run summary history. If it is possible to replace multiple run blocks by one over-arching loop run block, it will do so.

Figure 1 shows the program that generates the run summary that was used as an example here. This program gets stuck in a sweep hang. Only once every two sweep iterations does it extend the sweep at the left, which corresponds to Run Block E. The eager loop detection recognizes the repeat of the left and right sweep. This runs for two iterations, after which this extension breaks this loop. Extending the analysis as described lets it detect the larger periodic patterns: Every two full sweeps, the sweep is extended at the left.

Figure 1. Program where eager loop detection fails to detect main loop