Using the High Level Trigger to compress TPC data

Thorsten Kollegger

The ALICE detectors can read out data at a much higher rate than the data acquisition and storage systems can handle. The High Level Trigger (HLT) has been designed to make use of this gap in data bandwidth and thereby maximize the physics output of ALICE.

This goal can be achieved with two different operating modes: triggering and compression. When run as a trigger, the HLT selects “interesting” events and discards the others. To identify these events, the HLT performs a fast online reconstruction of entire events, including full reconstruction of the TPC data. In this mode, the HLT enhances the statistics only for those physics signals for which a trigger has been implemented.


Event display of one of the first heavy-ion collisions captured at ALICE, reconstructed by the HLT.

However, the online reconstruction results can also be used to enhance the statistics of the full data set; this is the second operating mode of the HLT, called compression. In this mode, selected results of the HLT online reconstruction are saved to tape, while part of the detector raw data is no longer stored. The HLT results are then used as input to the normal offline reconstruction of the events.

For the 2011 Heavy-Ion run, it is planned to operate the HLT in compression mode for the first time. The focus this year has been the TPC, which is by far the largest data source in the ALICE system, producing up to 80 MByte for a central Pb+Pb event. The first step of the TPC reconstruction is calculating cluster (or hit) positions from the charge deposited by the tracks. The HLT calculates these clusters using the FPGAs on the Read-Out Receiver Cards (H-RORC). The data volume of these clusters is 30% smaller than that of the original raw data from which they were calculated. By saving the HLT clusters instead of the raw data, one can thus increase the event statistics on tape by a factor of ~1.5.
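The gain in event statistics follows directly from the quoted sizes. A minimal sketch of this arithmetic, using the 80 MByte event size and the 30% reduction stated above (the variable names are illustrative, not taken from the HLT software):

```python
# Back-of-the-envelope arithmetic from the figures quoted in the article.
raw_event_mb = 80.0       # TPC raw data for a central Pb+Pb event
cluster_fraction = 0.70   # HLT clusters are 30% smaller than the raw data

cluster_event_mb = raw_event_mb * cluster_fraction
# Tape capacity is fixed, so smaller events mean proportionally more of them.
events_gain = raw_event_mb / cluster_event_mb

print(f"cluster size per event: {cluster_event_mb:.0f} MB")
print(f"event statistics gain:  ~{events_gain:.2f}x")
```

The ~1.43x gain is what the article rounds to a factor of ~1.5.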

An even larger reduction of the data volume is possible by optimizing the data format further, e.g. by reducing the precision of the stored variables to match the detector precision. More can be gained by applying Huffman coding, an entropy encoding algorithm for lossless data compression. This lossless compression works best when the entropy of the individual cluster variable distributions is small. The HLT compression algorithm therefore uses additional information to optimize the individual cluster variables. The most advanced feature implemented is to store not the cluster positions themselves, but rather the distances to the closest TPC track. Since most clusters are assigned to a track, this distribution is peaked at small distances and thus has a small entropy compared to the cluster positions, which are evenly distributed. Combining all these steps, a total data reduction by a factor of 4-5 compared to the raw data is possible, which means a factor of 4-5 more events for physics analysis.
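The benefit of storing peaked track residuals rather than evenly distributed positions can be illustrated with a toy Huffman coder. The symbol statistics below are invented for illustration (a sharply peaked residual distribution versus a flat position distribution); only the Huffman construction itself follows the standard algorithm:

```python
import heapq
from collections import Counter

def huffman_code_lengths(freqs):
    """Return {symbol: code length in bits} for a Huffman code built from freqs."""
    heap = [(f, i, {s: 0}) for i, (s, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)  # unique key so the heap never compares the dicts
    while len(heap) > 1:
        f1, _, a = heapq.heappop(heap)
        f2, _, b = heapq.heappop(heap)
        # Merging two subtrees pushes every leaf in them one level deeper.
        merged = {s: depth + 1 for s, depth in {**a, **b}.items()}
        heapq.heappush(heap, (f1 + f2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

def mean_bits(freqs):
    """Average Huffman code length, weighted by symbol frequency."""
    total = sum(freqs.values())
    lengths = huffman_code_lengths(freqs)
    return sum(freqs[s] * lengths[s] for s in freqs) / total

# Invented toy statistics: residuals to the nearest track peak sharply at 0,
# while raw cluster positions are spread evenly over the same range.
residuals = Counter({d: 2 ** (8 - abs(d)) for d in range(-8, 9)})
positions = Counter({p: 1 for p in range(17)})

print(f"peaked residuals:  {mean_bits(residuals):.2f} bits/symbol")
print(f"uniform positions: {mean_bits(positions):.2f} bits/symbol")
```

The peaked distribution needs clearly fewer bits per symbol than the flat one, which is exactly why encoding track residuals instead of absolute positions pays off before Huffman coding is applied.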

To handle the expected data rate of the TPC, the HLT has been upgraded significantly during the September Technical Stop. More Graphics Processing Units (GPUs), which are used for the track finding in the TPC, have been added to the computer cluster and are currently being commissioned. With these improvements, the HLT is able to handle most-central Pb+Pb events at rates of up to 200 Hz. The stability of the HLT system has also been improved and has been above 95% for the last few months. In the same period, the performance of the cluster finding and compression algorithms has been significantly improved, and the offline reconstruction software has been adapted to use the HLT clusters instead of the raw data.

The HLT will operate in compression mode in the upcoming 2011 Heavy-Ion run, reducing the data rate by a factor of 4-5 and thus significantly increasing the event statistics for all physics analyses.

ALICE Matters