

# Realization of a state machine based detection for Track Segments in the Trigger System of the Belle II Experiment\*

# Kai Lukas Unger<sup>†</sup>

Karlsruhe Institute of Technology (KIT) E-mail: kai.unger@kit.edu

# Steffen Bähr

Karlsruhe Institute of Technology (KIT)

# Jürgen Becker

Karlsruhe Institute of Technology (KIT)

# Yoshihito Iwasaki

High Energy Accelerator Research Organization (KEK)

# KyungTae Kim

Korea University (KU)

# Yun-Tsung Lai

High Energy Accelerator Research Organization (KEK)

The Belle II experiment relies on an online Level 1 trigger system to reduce the background and achieve the targeted trigger rate of 30 kHz. Track segment finding is the basis for all trigger decisions that use data from the Central Drift Chamber. To improve both efficiency and maintainability, we restructured the original combinatorial approach into finite state machines. The new implementation saves about 20% of FPGA slices. To achieve high test coverage, an automated test framework was developed for design-time validation. Operational correctness is verified by integration in cosmic ray tests.

Topical Workshop on Electronics for Particle Physics TWEPP2019 2-6 September 2019 Santiago de Compostela - Spain

© Copyright owned by the author(s) under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (CC BY-NC-ND 4.0).

<sup>\*</sup>Sponsored by the German Federal Ministry of Education and Research. <sup>†</sup>Speaker.

## 1. Introduction

The Belle II experiment aims at detecting B meson decays produced in collisions of electrons (7 GeV) with positrons (4 GeV). A record luminosity of  $\mathcal{L} = 8 \times 10^{35}\,\mathrm{cm}^{-2}\,\mathrm{s}^{-1}$  is expected [1]. The event rate reaches 130 kHz, which is too high to record all the data. Therefore, the trigger system reduces the event rate to 30 kHz and ensures that only relevant data is stored [5]. The Central Drift Chamber (CDC) is the main detector for track reconstruction. The L1 trigger chain of the CDC is shown in Figure 1. The basis of all processing is the Track Segment Finder (TSF). Its reduced data is passed on to the 2D Finder and the 3D Finder, which estimate the track parameters. Furthermore, the Event Time Finder determines the collision time, and the Z-Vertex Track Trigger [3] uses the information to estimate the origin and angle of the collision. All this information is gathered in the Global Decision Logic, which evaluates whether the data is relevant or not [4].



Figure 1: Overview of the L1 Trigger Chain for the Central Drift Chamber

### 1.1 Track Segment Finding

The TSF identifies Track Segments in the data of the CDC. To achieve this, five layers of wires are combined into a Super Layer, and the hits are compared with predefined geometric shapes, called track segments. In Super Layer 0 the track segments have a pyramid-like shape. Track segments in Super Layers 1 to 8 have an hourglass-like shape. This arrangement is shown in Figure 2 and is intended to limit the possible origin angle to around  $30^\circ$  of the collision origin.



**Figure 2:** Shape of the Track Segments (red). The dark green line is the First Priority and the brighter green is the Second Priority



In each track segment, there are three special cells. Each cell carries a priority status: first priority, or second priority left or right. All first priority cells across one Super Layer are assigned a unique ID that is used to identify one specific segment. The Track Segment ID contains the number of the First Priority cell and the Super Layer. If there is only a Second Priority Hit, hits in a total of four layers within a time window of  $\sim$  500 ns are required for an output. In addition, the L/R solution and the hit position are added to the output. The intention behind the L/R solution is shown in Figure 3. It states whether the particle passed to the left, to the right or in an unknown direction. This passage is determined using a predefined lookup table, since online computation is too complex and would exceed the available latency budget. The lookup table is implemented in the FPGA's BRAM.

## 2. Concept

### 2.1 State Machine





**Figure 4:** States and conditions of the TSF State machine

**Figure 5:** Possible Track Passage and the two associated output Track Segments

The State Machine is implemented as a Mealy machine. Figure 4 gives an overview of its states, in which events in the CDC data are detected. The FSM starts in the "wait" state. When a First Priority Hit occurs, the Track Segment is output and the FSM transitions to the state "first", where a counter lasting  $\sim$  500 ns is started. During this time, any new hits are suppressed. This delay is required for the ionized gas to return to its resting state. Any additional hit within this time window is assumed to belong to the same track. When the counter reaches zero, the state changes back to "wait" and the logic resumes observing the CDC. If there is a Second Priority Hit, the state changes to "second" and the counter is set to  $\sim$  500 ns. If a First Priority Hit occurs during this period, the FSM transitions from "second" to "first", which works as described above. In this case the new Track Segment is output and the counter is reset to  $\sim$  500 ns.
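The transitions described above can be modelled in a few lines of Python. This is a minimal software sketch, not the firmware: the class and method names are hypothetical, the window of 16 cycles assumes that $\sim$ 500 ns is counted in cycles of the 31.75 MHz clock, and it is assumed that the `second_hit` input already includes the four-layer condition and that a Second Priority output is sent when "second" is entered.

```python
from enum import Enum

# Assumption: ~500 ns is about 16 cycles of the 31.75 MHz clock.
WINDOW_CYCLES = 16


class State(Enum):
    WAIT = "wait"
    FIRST = "first"
    SECOND = "second"


class TrackSegmentFSM:
    """Hypothetical software model of the TSF state machine (not the firmware)."""

    def __init__(self, window=WINDOW_CYCLES):
        self.window = window
        self.state = State.WAIT
        self.counter = 0

    def step(self, first_hit, second_hit):
        """Advance one clock cycle; return 'first', 'second' or None as output."""
        output = None
        if self.state is State.WAIT:
            if first_hit:
                self.state, self.counter, output = State.FIRST, self.window, "first"
            elif second_hit:
                self.state, self.counter, output = State.SECOND, self.window, "second"
        elif self.state is State.SECOND and first_hit:
            # A First Priority Hit replaces a pending Second Priority Hit
            # and restarts the counter.
            self.state, self.counter, output = State.FIRST, self.window, "first"
        else:
            # Hits inside the window are suppressed; only the counter runs.
            self.counter -= 1
            if self.counter == 0:
                self.state = State.WAIT
        return output
```

Because the output depends on both the current state and the current inputs, the model follows the Mealy style named in the text. Calling `step(True, False)` on a fresh instance returns `"first"`, and further hits return `None` until the counter has expired.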

### 2.2 Neighbor Hit Suppression

Since only 15 Track Segments can be transmitted to the following trigger systems per 31.75 MHz clock cycle, it is important to avoid redundant information. The Neighbor Hit Suppression algorithm reduces the data by not transmitting Second Priority Hits when a First Priority Hit has occurred in a neighboring Track Segment.

Figure 5 shows a possible double output of a First Priority Hit and a Second Priority Hit resulting from the same particle track. In the particle trajectory shown in the example, the Second Priority Hit from cell 37 generates an additional output to the First Priority Hit from cell 38. The Neighbor Hit Suppression algorithm suppresses any Second Priority Hit from a neighboring cell for  $\sim 125$ ns, so that only the First Priority Hit is transmitted. The algorithm is implemented as a State Machine with two distinct states ("no\_neighbor\_hit" and "neighbor\_hit"), where "no\_neighbor\_hit" is the initial state. If a First Priority Hit occurs in a neighboring Track Segment, the state changes from "no\_neighbor\_hit" to "neighbor\_hit" and a counter of 125 ns is set, counting down to zero. Within the 125 ns window no Second Priority Hit is allowed, but a First Priority Hit still generates a First Priority output. When the counter reaches zero, the state changes back to the initial state "no\_neighbor\_hit".
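The two-state suppression logic can be sketched in the same way. The Python model below is illustrative only: the class name, the method signature and the window of 4 cycles (assuming $\sim$ 125 ns counted on the 31.75 MHz clock) are assumptions, as is the choice that suppression already applies in the cycle in which the neighboring First Priority Hit is seen and that further neighbor hits do not restart the counter.

```python
# Assumption: ~125 ns is about 4 cycles of the 31.75 MHz clock.
SUPPRESS_CYCLES = 4


class NeighborHitSuppression:
    """Hypothetical model of the suppression logic for one Track Segment."""

    def __init__(self, window=SUPPRESS_CYCLES):
        self.window = window
        self.state = "no_neighbor_hit"
        self.counter = 0

    def step(self, neighbor_first_hit, own_first_hit, own_second_hit):
        """Advance one clock cycle; return the hit to transmit, or None."""
        if neighbor_first_hit and self.state == "no_neighbor_hit":
            self.state, self.counter = "neighbor_hit", self.window

        if own_first_hit:
            output = "first"  # First Priority Hits are never suppressed
        elif own_second_hit and self.state == "no_neighbor_hit":
            output = "second"
        else:
            output = None  # nothing to send, or Second Priority suppressed

        if self.state == "neighbor_hit":
            self.counter -= 1
            if self.counter == 0:
                self.state = "no_neighbor_hit"
        return output
```

In the example of Figure 5, the instance for the segment of cell 37 would receive `neighbor_first_hit=True` from cell 38, so its own Second Priority Hit yields `None` while the window is open.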

### 2.3 Dual Port BRAM use

The Xilinx Virtex 6 FPGA used in the experiment provides dual-port Block RAMs (BRAMs). Because the L/R lookup table is read-only, the second read port can also be used for the L/R determination. Within one Super Layer the L/R determination is normally the same for every Track Segment, so one table can be shared. If a faulty wire is present within a Track Segment, however, a different lookup table has to be used. To handle this, a configuration has been introduced that assigns the appropriate BRAM to each track segment. Using the dual-port feature reduces the BRAM resource consumption, and a bad wire does not impact the L/R ambiguity decision.
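The effect of sharing read ports can be illustrated with a small counting example. The sketch below is not taken from the firmware: the function name, the table names and the six-segment configuration are hypothetical, and it simply assumes that each Track Segment occupies one read port of a BRAM holding its lookup table.

```python
from collections import Counter
from math import ceil


def assign_brams(segment_tables, ports_per_bram=2):
    """Count the BRAMs needed when segments with the same table share ports.

    segment_tables maps a Track Segment ID to the name of its lookup table.
    """
    per_table = Counter(segment_tables.values())
    return {table: ceil(n / ports_per_bram) for table, n in per_table.items()}


# Hypothetical configuration: six segments, one of them with a faulty wire.
config = {0: "default", 1: "default", 2: "bad_wire_a",
          3: "default", 4: "default", 5: "default"}

single_port = sum(assign_brams(config, ports_per_bram=1).values())
dual_port = sum(assign_brams(config, ports_per_bram=2).values())
print(single_port, dual_port)
```

In this toy configuration the count drops from 6 BRAMs with one port each to 4 with two ports, and the segment with the faulty wire keeps its own table.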

## 3. Evaluation

### 3.1 FPGA Resources

The state machine implementation runs on the Xilinx XC6VHX565T FPGA of the Universal Trigger Board 3 and is an integral part of the Belle II experiment. The highest resource consumption occurs for the outermost Super Layer, layer 8, because it processes more data than the inner Super Layers, with a $24 \times 256$ bit parallel input in every cycle of the 31.75 MHz clock. Table 1 compares the resource consumption of the old implementation with that of the new state machine version. Over all Super Layers, the use of Slice Registers and LUTs is reduced by about 20%, and the LUT flip-flop pairs (LUT-FF) are reduced by about 30%. The larger LUT-FF reduction at Super Layer 8 results from a special adjustment for this Super Layer. By using both ports of the dual-port BRAMs, the occupancy of the B36E1 BRAMs has decreased. The BRAM usage for determining the L/R ambiguity is reduced by 50%. The overall reduction is less than 50% because some BRAMs are used in other parts of the TSF. This reduction is important for the final integration, as BRAM buffers are limited.

|                    | Slice Registers | Slice LUTs | LUT-FF | BRAM |
|--------------------|-----------------|------------|--------|------|
| Old implementation | 295116          | 223432     | 163883 | 430  |
| State machine      | 214788          | 174714     | 69198  | 252  |

Table 1: Super Layer 8 resource usage of the old implementation and the new state machine on the Xilinx XC6VHX565T FPGA
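The relative savings for Super Layer 8 follow directly from the numbers in Table 1. The short Python calculation below reproduces them; the variable and function names are chosen for illustration only.

```python
# Values copied from Table 1 (Super Layer 8).
old = {"Slice Registers": 295116, "Slice LUTs": 223432, "LUT-FF": 163883, "BRAM": 430}
new = {"Slice Registers": 214788, "Slice LUTs": 174714, "LUT-FF": 69198, "BRAM": 252}


def reduction_percent(before, after):
    """Relative saving of the new implementation in percent."""
    return 100.0 * (before - after) / before


reductions = {name: reduction_percent(old[name], new[name]) for name in old}
for name, value in reductions.items():
    print(f"{name}: {value:.1f}%")
```

For Super Layer 8 this gives savings of about 27% in Slice Registers, 22% in Slice LUTs, 58% in LUT-FF and 41% in BRAM, in line with the larger LUT-FF reduction and the overall BRAM reduction below 50% described in the text.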

### 3.2 Validation

The implementation was validated across all Super Layers in simulation and integrated into the experiment's trigger system. The integrated system was validated with both cosmic rays and collisions. In all cases, it shows the expected behavior and achieves high efficiency in finding both First and Second Priority Hits. According to preliminary analyses, the efficiency is close to the theoretical potential of the system. More precise analyses are still in progress.

## 4. Summary

The upgraded SuperKEKB collider is designed to achieve a luminosity of  $\mathcal{L} = 8 \times 10^{35}\,\mathrm{cm}^{-2}\,\mathrm{s}^{-1}$ . As a result, a higher machine background is expected in the attached Belle II experiment compared to its predecessor Belle. The estimated event rates exceed the targeted trigger rate of 30 kHz. One of the major trigger systems employed to achieve this rate is the Level 1 trigger system of the central drift chamber. The basis for all present trigger algorithms there is the Track Segment Finder. Its task is to detect regions of active wires in the detector and combine them into track segments, arrangements of neighboring wires that conform to a predefined geometrical shape. The track segment finder is implemented on XC6VHX565T Virtex 6 FPGAs and integrated close to the detector readout. It has to be efficient and operate with low latency to fulfill the requirements of the entire Level 1 trigger system. A combinatorial approach was in operation for the first phases of the experiment. To increase efficiency and maintainability, we present a restructured implementation based on finite state machines. The advantages of the new state machine implementation are easier expandability and lower resource consumption. The reduced resource usage makes it easier to achieve timing closure in further iterations of the firmware. Additionally, we added a mechanism that suppresses neighboring hits, which reduces the transmission of redundant data and thus improves operation. The overall design was reworked to provide a higher degree of flexibility at design time. We incorporated an approach to adjust track segment finding for predefined regions of the central drift chamber by allowing partial loading of special track finding configurations. This allows easy adjustment for geometrical regions with broken wires. As this finding mechanism is based on BRAM, we developed an automated inference of dual- and single-port BRAMs to keep resource consumption to a minimum. On average, the new implementation reduces the number of used Slice LUTs by about 20%. To validate the new implementation, we developed a test framework that supports either high test coverage or dedicated detector data patterns for investigating special cases. The trigger system is validated using cosmic rays.

## References

- [1] Z. Doležal and S. Uno, Belle II Collaboration, "Belle II Technical Design Report", KEK Report 2010, October 2010. Available: https://arxiv.org/abs/1011.0352
- [2] N. Taniguchi et al., "All-in-one readout electronics for the Belle-II Central Drift Chamber". doi:10.1016/j.nima.2013.06.096
- [3] S. Baehr et al., "A neural network on FPGAs for the z-vertex track trigger in Belle II", Journal of Instrumentation, 12, Art. Nr.: C03065, 2017. doi:10.1088/1748-0221/12/03/C03065
- [4] Y. Iwasaki et al., "Level 1 Trigger System for the Belle II Experiment", IEEE Transactions on Nuclear Science, vol. 58, no. 4, pp. 1807-1815, Aug. 2011. doi:10.1109/TNS.2011.2119329
- [5] E. Kou et al., "The Belle II Physics Book", 2018. (hal-01902963)