DigiSim
A digitization simulation package
for the International Linear Collider
v01.00
Dhiman Chakraborty, Guilherme Lima,
Vishnu Zutshi
Northern Illinois Center for
Accelerator and Detector Development
Physics Department
Northern Illinois University
Introduction
The purpose of the DigiSim package is to do detector digitization, for
the CALICE test beam as a first goal, and ultimately for the full ILC
detector. It is currently implemented as a standalone module,
readily available for download and use.
The package reads the LCIO files produced by Geant4 applications (like
LCDG4, Mokka or SLIC) and appends the raw hits produced to the output
events. Most of the DigiSim (re)configuration can be done at run time by
editing an ASCII steering file; no recompilation is necessary.
The existing
modifiers (digitization classes) are easy to set up and
configure, and new functionality can be added easily.
DigiSim is thus powerful and simple to use and
extend, and it is well suited to the simulation of CALICE test beam
digitization.
DigiSim was originally developed in C++, and then ported to Java.
Since then, further development has been performed in the Java version,
including modifiers for crosstalk and random noise modeling. The
Java version has been reasonably stable since July 2005. We
expect these new developments to be ported back into the original C++
version in the coming weeks, and then to keep both versions up to date
with each other.
Design requirements and development
choices
The following requirements were considered when designing the software
package:
- To be initially based on the C++ programming language, like most of the
software of the
CALICE collaboration. A Java version has also been developed, as
suggested by the American LC community
- Object-orientated, for easier development and maintenance of the
source code
- Based on the LCIO event model, which is becoming the de-facto standard for CALICE and
ILC simulations
- Used as a test-bed for the development of a digitization
simulation software for the full ILC detector
We chose to use Marlin as the C++ framework on which DigiSim was
developed. The Java version of DigiSim was developed later,
and it currently has more features than the original C++
version. As the Java version has been reasonably stable since
July 2005, we expect to synchronize the C++ version with it,
and then to keep the two versions reasonably in sync.
Package dependencies
The C++ version of DigiSim has the following dependencies:
The Java version of DigiSim is part of the org.lcsim framework
(currently at version 0.8). The org.lcsim framework itself has the following
dependencies:
Downloading, building and running DigiSim
These instructions are significantly different for the C++ and Java
versions, although the configuration of the DigiSim package itself is
basically identical. The instructions presented below have been
tested within Fedora Linux environments, using g++ version 3.3
or Sun's Java version 1.5. If you try to use DigiSim in
other environments, please tell us about your experience, good or bad.
Instructions for the C++ version
The source code for DigiSim can be checked out from the official
CALICE CVS repository (see access instructions).
Instructions for
building under Linux are given below. These instructions have
been tested within Fedora Linux
environments, using g++ version 3.3, but DigiSim will probably build without
problems with other Linux distributions and g++ versions as well.
- # download the source code (see access instructions link
above)
- export CVS_RSH=ccvssh
- export CVSROOT=:ext:yourUserName@cvssrv.ifh.de:/calice
- ccvssh login
- cvs co -d digisim calice_sim/digitization/digisim
- # building
- cd digisim
- gmake
- # Running
- ln -sf /path/to/some/data.slcio inputfile
- ./bin/digisim digi.steer
An alternative to the next-to-last step is to edit the steering
file digi.steer and insert the explicit name(s) of the input data
file(s) to be digitized.
Several parameters of the digitizer can be configured, as explained
later, by editing the file digi.steer.
Instructions for the Java version
The Java version of DigiSim is part of the org.lcsim
framework, so the download and build instructions are basically the
ones provided on the LCSim website.
Only a quick summary is presented here:
- # download and install Sun's Java Development Kit 1.5 or later,
see java website for details
- # download and build Maven 1.0.2, see maven website for details.
More recent versions are significantly different from 1.0.2.
- # download GeomConverter
- export CVSROOT=:pserver:anonymous@cvs.freehep.org:/cvs/lcd
- cvs login (use your e-mail as the password)
- cvs checkout GeomConverter
- # build GeomConverter
- cd GeomConverter
- maven jar:install
- cd ..
- # download and build LCSim
- cvs checkout lcsim
- cd lcsim
- maven jar:install
- cd ..
- # building the API documentation using Javadoc
- cd GeomConverter
- maven site
- cd ../lcsim
- maven site
- cd ..
Please note that all maven commands are issued from the top directory
of each package, where the project configuration files project.xml and
project.properties are located. The API documents can then be
consulted by pointing your browser to the local file
target/docs/apidocs/index.html.
Running DigiSim / Java
There are two distinct ways of running the Java version of DigiSim:
(1) Running in standalone mode, saving the output file with raw hits
and digitized hits for further processing; and
(2) Running DigiSim as a driver, from inside JAS3, as a preprocessor to
your favourite analysis or reconstruction drivers.
Each of these running modes has its own pros and cons. For
instance, the JAS3 GUI is intuitive and friendly, its event browser
and event display features are very helpful, and drivers can be used as
plugins to build complicated reconstruction chains. However, making sure one
is using the right jars and source code is not always obvious to the
uninitiated. Running in standalone mode is more convenient for
running remotely over slow connections, and it gives the user more
control over a special environment, by tuning the CLASSPATH of a
single session without changing the overall
setup. Moreover, the standalone steps can be saved in a
script for a faster startup. I personally prefer running long
jobs outside of the graphical environment, and use JAS only to look at
the plots and produce nice figures.
Running DigiSim/Java in Standalone mode
After building the lcsim jar file
following the instructions above, one can run DigiSim standalone by
typing:
- source addjars.sh ~/.maven/repository # once per session, defines the CLASSPATH to enable the use of the LCSim framework
- ln -sf /path/to/some/data.slcio inputfile
- java org.lcsim.digisim.DigiSimMain
As an alternative to the next-to-last step, one can edit the DigiSimMain.java
source code and replace "inputfile" with a specific file name. I
find the use of symbolic links very convenient here. An output
file, digisim.slcio, contains all the raw hits and digitized hits
collections, according to the configuration file used. Note that
by default, DigiSim uses a configuration file based on the detector
name, so that data files based on e.g. the SDJan03 geometry will use the
configuration file SDJan03.steer.
Note: "inputfile" is currently hardwired in the source
code of the DigiSimMain class, despite the line "LCIOInputFiles inputfile"
present in the configuration file. That line affects only
the C++ version of DigiSim, not the Java version.
Running DigiSim/Java from inside JAS3
DigiSim can be run from inside
JAS3, using this driver: DigiSimExample.java.
Open this file in JAS3, compile it and load
it. You may want to load your favourite analysis or
reconstruction drivers here as well. Then open an input LCIO
data file to be digitized, and run some events one by one.
You may want to open the LCSim event browser and look at some raw data
(RawCalorimeterHit class) or some digitized data (CalorimeterHit
class). Then rewind the data source and run over all events.
Note: Be sure to select the org.lcsim plugin when you open the input data
file. If no dialog box opens at this point, make sure you have
the lcsim.jar file loaded, by checking that the LCSim event browser is
available from the View menu.
How DigiSim works
The DigiSim package works by using a chain of "modifiers", which will
apply
successive transformations to the input simulated hits. The
resulting raw
hits are then simply appended to the LCEvent, and get automatically
written to the output LCIO file. Fig.1 shows the DigiSim class
diagram, which is helpful to understand how DigiSim works. Note
that the ellipses represent the Marlin/C++ and the LCSim/Java
frameworks. Right below the frameworks are the framework-specific
liaison classes DigiSimProcessor and DigiSimDriver, whose interfaces
are imposed by the frameworks. All other classes have basically
the same interfaces and have the same functionality in the Java and C++
versions.
Figure 1: Class diagram for the
digitization simulation package DigiSim. Please note the
inheritance relationships represented by the solid arrows and the
containment (usage) relationships represented by the solid (dashed)
line and open arrows.
AbstractCalHitModifier, RandomNoise and FunctionModifier are abstract
classes, defining the
interfaces to be followed by their subclasses.
The frameworks, namely Marlin in C++ or LCSim in Java, take care of all
the I/O, and call specific DigiSim hooks for initialization, event
processing and
finalization. The hooks are actually defined by each framework,
hence DigiSimProcessor (DigiSimDriver) is the
only DigiSim class which knows about the Marlin (LCSim) framework, and
so abides by the interface
imposed for all Marlin Processors
(LCSim Drivers).
These classes instantiate one digitizer per subdetector to be digitized.
The Digitizer class is responsible for managing the whole digitization
processing for its subdetector. During its initialization, all
the requested modifiers
are
instantiated and configured, according to the DigiSim configuration
file, or steering file.
The processing which takes place during the event loop is better
understood by analysing Fig.2. The modifiers act on
transient copies of the
calorimeter hits (class TempCalHit),
which are used as
both input and output to the modifiers' event processing method.
The abstract class CalHitModifier
defines the interface to be inherited by the modifiers.
Figure 2 - Diagram illustrating the
event processing loop.
During the event loop, events are
passed to the Digitizer,
which extracts the simulated hits (SimCalorimeterHits)
from an LCCollection (SimHitsLCCollection).
Simulated hits are converted into transient hits (class TempCalHit) and passed through
a chain of modifiers. Each modifier modifies the input TempCalHits by applying its
own transformation. After all the modifiers have been processed,
the final TempCalHit
objects are converted into RawCalorimeterHits (including
some double to integer conversions), which are then stored into
a new LCCollection
(RawHitsLCCollection) and appended to the event. The
framework provides default processors/drivers
which take care of writing the modified events into the output file.
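The event loop described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not DigiSim code: SketchHit, SketchModifier and ChainSketch are hypothetical stand-ins for TempCalHit, CalHitModifier and the Digitizer, the final rounding stands in for the double to integer conversion, and Java 8 lambdas are used for brevity.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the transient hit; only the fields needed here. */
final class SketchHit {
    double energy;
    double time;

    SketchHit(double energy, double time) {
        this.energy = energy;
        this.time = time;
    }
}

/** Hypothetical stand-in for the modifier interface: transforms the hit list in place. */
interface SketchModifier {
    void processHits(List<SketchHit> hits);
}

final class ChainSketch {
    /** Runs the modifiers in order, then rounds the energies to integer "raw" amplitudes. */
    static int[] digitize(List<SketchHit> hits, List<SketchModifier> chain) {
        for (SketchModifier modifier : chain) {
            modifier.processHits(hits);
        }
        int[] raw = new int[hits.size()];
        for (int i = 0; i < raw.length; i++) {
            raw[i] = (int) Math.round(hits.get(i).energy);
        }
        return raw;
    }

    public static void main(String[] args) {
        List<SketchHit> hits = new ArrayList<>();
        hits.add(new SketchHit(0.5, 10.0));
        hits.add(new SketchHit(3.0, 12.0));

        List<SketchModifier> chain = new ArrayList<>();
        // First modifier: fixed gain of 4, no smearing.
        chain.add(hs -> { for (SketchHit h : hs) h.energy *= 4.0; });
        // Second modifier: discard hits below a threshold of 5.
        chain.add(hs -> hs.removeIf(h -> h.energy < 5.0));

        for (int amplitude : digitize(hits, chain)) {
            System.out.println(amplitude); // prints 12 (the first hit is discarded)
        }
    }
}
```

The order of the chain matters: swapping the two modifiers in this example would discard both hits before the gain is applied.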
Configuring DigiSim and its modifiers
An arbitrary number of modifiers can be defined and used within any
DigiSim run, including any number of
modifiers of any single existing modifier type. DigiSimProcessor
is a Marlin processor, therefore it can receive any number of
parameters from the Marlin steering file. The modifiers can then
be configured on-the-fly, using parameters from the steering file (see this simple example).
In the Java version, the DigiSim configuration is very similar.
Note that by design, the very same Marlin steering file can
be used in the Java version as well, and this simplifies the maintenance
of the configuration files in the long term.
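The configuration line examples given in this document for each modifier share one shape: a modifier name, a modifier type, and a list of numeric parameters. The Java sketch below splits such a line into those three parts. It is an assumption-laden illustration only: ModifierConfigSketch is a hypothetical class, and it is neither the Marlin steering file parser nor the DigiSim one.

```java
import java.util.Arrays;

/** Hypothetical holder for one modifier configuration line: name, type, parameters. */
final class ModifierConfigSketch {
    final String name;
    final String type;
    final double[] params;

    ModifierConfigSketch(String name, String type, double[] params) {
        this.name = name;
        this.type = type;
        this.params = params;
    }

    /** Splits a whitespace-separated line into name, type and numeric parameters. */
    static ModifierConfigSketch parse(String line) {
        String[] tokens = line.trim().split("\\s+");
        if (tokens.length < 2) {
            throw new IllegalArgumentException("expected '<name> <type> [params...]': " + line);
        }
        double[] params = new double[tokens.length - 2];
        for (int i = 0; i < params.length; i++) {
            params[i] = Double.parseDouble(tokens[i + 2]);
        }
        return new ModifierConfigSketch(tokens[0], tokens[1], params);
    }

    public static void main(String[] args) {
        ModifierConfigSketch cfg = parse("HBlightCollEff GainDiscrimination 0.0111 0.0029 1 0");
        System.out.println(cfg.name + " -> " + cfg.type + " " + Arrays.toString(cfg.params));
    }
}
```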
Existing modifiers
There are currently a few general modifiers implemented and ready for
use. Many of the existing modifiers implement a smeared linear transformation.
See Fig. 3 for a graphical representation of what we mean by a smeared
linear
transformation, but please note that the SmearedLinear modifier has been
deprecated, and replaced by the simpler modifiers SmearedGain and
SmearedTiming.
Figure 3
- Illustration of the hit
smearing procedure implemented by a typical modifier,
and an explanation of some of the
existing modifiers.
This is an alphabetical listing of the existing
modifiers, with a brief description:
- AbsValueDiscrimination
Configuration line example:
# Two parameters: (1) threshold, and (2) width of smearing on threshold
HBdiscrim AbsValueDiscrimination 8 1
A modifier for basic discrimination on the absolute value of
energy. This means that hits with energies in the range
[-threshold;+threshold] are discarded.
Negative contributions are due to random noise, and large negative
values of random noise may be kept in an attempt to cancel large
positive noise, thus avoiding a positive biasing of
the total average energy deposition due to random noise.
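A minimal Java sketch of discrimination on the absolute value follows. It assumes that the threshold smearing is Gaussian and drawn independently for each hit, which this document does not state; the class and method names are hypothetical.

```java
import java.util.Random;

final class AbsDiscriminationSketch {
    /** Returns true if the hit survives, i.e. |energy| exceeds the smeared threshold. */
    static boolean survives(double energy, double threshold, double width, Random rng) {
        // Assumption: Gaussian smearing of the threshold, one draw per hit.
        double smearedThreshold = threshold + width * rng.nextGaussian();
        return Math.abs(energy) > smearedThreshold;
    }

    public static void main(String[] args) {
        Random rng = new Random(1L);
        double[] energies = {-9.5, -3.0, 2.0, 12.0};
        for (double e : energies) {
            // Zero width gives a fixed threshold of 8, as in the first parameter above.
            System.out.println(e + " survives: " + survives(e, 8.0, 0.0, rng));
        }
    }
}
```

With a fixed threshold of 8, the hits at -9.5 and 12.0 are kept and the hits at -3.0 and 2.0 are discarded, so large negative noise fluctuations survive alongside large positive ones.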
- Crosstalk
Configuration line example:
# Two parameters: (1) mean value of crosstalk to first neighbors; and (2) width of smearing on the first parameter
HBcrosstalk Crosstalk 0.020 0.005
This modifier models the light crosstalk between scintillator cells, so it
uses the Segmentation.getNeighbourIDs() method to find the
neighbors of each cell. Only first neighbors are assigned crosstalk
contributions.
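The idea can be sketched in Java on a hypothetical one-dimensional strip of cells. Everything here is an assumption made for illustration: the neighbours() helper replaces Segmentation.getNeighbourIDs(), the crosstalk fraction is smeared with a Gaussian drawn per neighbor, and the energy of the source cell is left unchanged. The actual Crosstalk modifier may differ on each of these points.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

final class CrosstalkSketch {
    /** Hypothetical first-neighbour lookup for a 1D strip of nCells cells. */
    static int[] neighbours(int cell, int nCells) {
        if (nCells < 2) return new int[0];
        if (cell == 0) return new int[] {1};
        if (cell == nCells - 1) return new int[] {nCells - 2};
        return new int[] {cell - 1, cell + 1};
    }

    /** Adds a smeared fraction of each input hit's energy to each of its first neighbours. */
    static Map<Integer, Double> apply(Map<Integer, Double> hits, int nCells,
                                      double mean, double width, Random rng) {
        Map<Integer, Double> out = new HashMap<>(hits);
        for (Map.Entry<Integer, Double> hit : hits.entrySet()) {
            for (int neighbour : neighbours(hit.getKey(), nCells)) {
                // Assumption: Gaussian smearing, drawn independently per neighbour.
                double fraction = mean + width * rng.nextGaussian();
                out.merge(neighbour, fraction * hit.getValue(), Double::sum);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<Integer, Double> hits = new HashMap<>();
        hits.put(5, 100.0);
        // Mean and width taken from the configuration line example above.
        System.out.println(apply(hits, 10, 0.020, 0.005, new Random(1L)));
    }
}
```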
- DeadCell
# Five parameters: cellID components of a specific cell:
# (1) system, (2) barrel/endcap flag, (3) layer index, (4) theta index, (5) phi index
HBDeadCell DeadCell 3 0 12 34 56
This modifier always removes any hit for the specified cellID.
Please note that there is no consistency check on the validity of the
cellID provided. If a bad ID is provided, no hit will ever get
removed. One modifier has to be provided for each dead cell.
- ExponentialNoise
# First five parameters from RandomNoise: (1) system, (2) barrel/endcap flag, (3) noise-only threshold
# (4) nominal time and (5) sigma of time smearing
# One additional parameter: (6) µ = mean of the exponential distribution
HBExpoNoise ExponentialNoise 3 0 7 100 100 0.6
This modifier inherits from RandomNoise, and defines an exponential
noise distribution, with probability=0 for negative amplitudes.
The implementation uses the exponential distribution from Apache's
commons-math library, see http://jakarta.apache.org/commons/math/userguide/distribution.html.
Please read RandomNoise documentation for more details.
- FunctionModifier (abstract)
An abstract function-based modifier. Its subclasses
must implement the following abstract function:
virtual double transformEnergy(const TempCalHit& hit) const = 0;
The values returned from this function will overwrite the ADC counts of
the transient hits.
- GainDiscrimination
# Four parameters: (1) nominal gain, (2) gain width, (3) nominal threshold, and (4) threshold width
HBlightCollEff GainDiscrimination 0.0111 0.0029 1 0
A simple modifier inheriting directly from
CalHitModifier. In the example above, a smeared factor of
(0.0111 +/- 0.0029) is applied to each hit independently, and then hits
with the "energy" field below 1 get removed. Please note that the field
is called "energy", but its interpretation may be different, like
"number of photons collected" in the example above. A
threshold of 1 therefore reflects the fact that it does not make sense
to have a fraction of a photon collected. A fixed gain or threshold is
applied if the corresponding width (parameter 2 or 4) is set to
zero.
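The gain-then-threshold logic can be sketched in Java as follows. The sketch assumes Gaussian smearing of both the gain and the threshold, drawn independently for each hit; the class name and the use of OptionalDouble to signal a removed hit are choices made for this illustration only.

```java
import java.util.OptionalDouble;
import java.util.Random;

final class GainDiscriminationSketch {
    /** Applies a smeared gain, then removes the hit if it falls below a smeared threshold. */
    static OptionalDouble apply(double energy, double gain, double gainWidth,
                                double threshold, double thresholdWidth, Random rng) {
        // Assumption: independent Gaussian draws per hit for gain and threshold.
        double smearedGain = gain + gainWidth * rng.nextGaussian();
        double smearedThreshold = threshold + thresholdWidth * rng.nextGaussian();
        double out = energy * smearedGain;
        return out < smearedThreshold ? OptionalDouble.empty() : OptionalDouble.of(out);
    }

    public static void main(String[] args) {
        Random rng = new Random(7L);
        // Parameters from the configuration line example above.
        for (double energy : new double[] {50.0, 500.0, 5000.0}) {
            System.out.println(energy + " -> " + apply(energy, 0.0111, 0.0029, 1.0, 0.0, rng));
        }
    }
}
```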
- GaussianNoise
# First five parameters from RandomNoise: (1) system, (2) barrel/endcap flag, (3) noise-only threshold
# (4) nominal time and (5) sigma of time smearing
# Two additional parameters: (6) mean, and (7) width of the gaussian distribution.
# width<0 means that noise-only threshold acts on absolute values only, thus both negative and positive tails are used
HBGaussNoise GaussianNoise 3 0 7 100 100 0.0 -1.6
This modifier inherits from RandomNoise, and defines a gaussian
noise distribution. The
implementation uses the gaussian distribution from Apache's
commons-math library, see http://jakarta.apache.org/commons/math/userguide/distribution.html.
Please read RandomNoise documentation for more details.
- HotCell
# Modifier parameters:
# (1) amplitude mean and (2) sigma of the smearing on the amplitude around the mean
# (3) timing mean and (4) sigma of the smearing on the timing around the mean
# (5-9) are the cellID components of a specific cell: (5) system, (6) barrel/endcap flag,
# (7) layer index, (8) theta index, (9) phi index
HBHotCell HotCell 252525 0 101010 0 3 0 12 123 345
A simplistic modifier for hot cells. While RandomNoise selects
random cells for noise assignment, HotCell picks a specific cell, and
randomly draws the amplitude and timing of the noise to be assigned to
that cell. In the configuration line above, the cell (12,123,345)
will get fixed (non-random, sigma=0) amplitudes and timestamps
according to the values provided above. These values were present
in the final hits. Please note that, as for the DeadCell
modifier described above, no check is made on the validity of the last
five parameters. In particular, system=3, barrelEndcapFlag=0
needs to be compatible with an HB (HcalBarrel) cell, and the other cell
indices must also be valid.
- RandomNoise
# Five common parameters: (1) system, (2) barrel/endcap flag, (3) noise-only threshold
# (4) nominal time and (5) sigma of time smearing
An abstract modifier that provides common behavior for noise-modeling
modifiers. Its
subclasses define the specific noise distribution to be used for noise
modeling. The common behavior implemented in the RandomNoise modifier
assumes that noise should be added to all cells, but that the noise
assigned to most cells is actually small, and will not
survive discrimination. Considering the huge number of cells in
typical ILC hadron calorimeters, and for performance reasons, the
random noise is added to randomly selected cells, in a two-stage procedure, as
explained below.
In the first step, the full noise distribution is used to add noise to
all existing hits, as the additional noise may contribute to make some
hits survive discrimination. All hits, therefore, will have some
noise contribution after this first step.
The purpose of the second step is to add noise-only cells, where the noise is
large enough that it may survive discrimination. An operational noise-only threshold is
defined in the configuration line, which is used to calculate the
probability of any cell to receive noise with amplitude larger than the
noise-only threshold. The mean
number (Nmean) of cells above the noise-only threshold is then
given by [the total number of cells in a given component (say Hcal
barrel)] times [the probability of any cell being above the noise-only
threshold]. Nmean is calculated during initialization of the
modifier, and then for each event, the number of cells above threshold
(Nabove) is drawn from a Poisson distribution with mean Nmean.
Then a total of Nabove cells are randomly drawn to receive random
noise. The noise amplitude is then forced to be above the
noise-only threshold, but still following the distribution
provided. CellSelector is the class responsible for randomly
drawing valid cells from a given subdetector.
In order to be usable in the context of the two-step processing
described above, the RandomNoise subclasses have to define the noise
distribution, and implement the following methods based on that
distribution:
abstract public double drawRandomNoise();
abstract public double getProbabilityAboveThreshold();
abstract public double drawRandomNoiseAboveThreshold();
After reading about the two-step processing in the RandomNoise modifier,
the purpose of these methods should be clear. Method
drawRandomNoise() must return noise amplitudes according to the full
distribution to be used in the first step, while
drawRandomNoiseAboveThreshold() returns noise amplitudes forced to be
above the noise-only threshold, to be used in the second step.
Method getProbabilityAboveThreshold() is used during the initialization
of the modifier, to determine the Nmean parameter.
This modifier currently needs to know the system and barrel/endcap
flags, which are used to encode the cellID/rawID of the newly added
noise-only cells. For some examples, please take a look at the
ExponentialNoise and GaussianNoise modifiers.
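For an exponential noise distribution, the three methods and the second processing step can be sketched in Java as below. This is a self-contained illustration, not the DigiSim implementation: ExpNoiseSketch is a hypothetical class, it uses java.util.Random with inverse-CDF sampling instead of the commons-math distribution, the Poisson draw uses Knuth's simple multiplication method, the total number of cells is an invented figure, and cells are picked by plain index where DigiSim uses CellSelector.

```java
import java.util.Random;

/** Hypothetical exponential-noise helper with mean mu and a noise-only threshold. */
final class ExpNoiseSketch {
    private final double mu;
    private final double threshold;
    private final Random rng;

    ExpNoiseSketch(double mu, double threshold, Random rng) {
        this.mu = mu;
        this.threshold = threshold;
        this.rng = rng;
    }

    /** Step 1: amplitude from the full exponential distribution (inverse-CDF sampling). */
    double drawRandomNoise() {
        return -mu * Math.log(1.0 - rng.nextDouble());
    }

    /** For an exponential distribution, P(noise > threshold) = exp(-threshold / mu). */
    double getProbabilityAboveThreshold() {
        return Math.exp(-threshold / mu);
    }

    /** Step 2: the exponential is memoryless, so threshold + a fresh draw follows the tail. */
    double drawRandomNoiseAboveThreshold() {
        return threshold + drawRandomNoise();
    }

    /** Poisson draw using Knuth's multiplication method; adequate for small means. */
    int drawPoisson(double mean) {
        double limit = Math.exp(-mean);
        double product = rng.nextDouble();
        int n = 0;
        while (product > limit) {
            product *= rng.nextDouble();
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        Random rng = new Random(42L);
        // Threshold 7 and mu 0.6 are taken from the ExponentialNoise example line.
        ExpNoiseSketch noise = new ExpNoiseSketch(0.6, 7.0, rng);
        int nCells = 100000; // hypothetical number of cells in the subdetector
        double nMean = nCells * noise.getProbabilityAboveThreshold(); // done once, at init
        int nAbove = noise.drawPoisson(nMean);                        // done once per event
        for (int i = 0; i < nAbove; i++) {
            int cell = rng.nextInt(nCells); // a real selector must return valid cellIDs
            System.out.println(cell + " " + noise.drawRandomNoiseAboveThreshold());
        }
    }
}
```

The design point is that only Nabove random numbers are drawn per event for the noise-only cells, instead of one draw for every cell in the subdetector.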
- SiPMSaturation
# Two parameters: (1) nominal gain for the linear regime, and (2) lower limit of the saturation region
HBSiPMSaturat SiPMSaturation 1 2200
A "quick-and-dirty" implementation of photosensor saturation
effects. This modifier models a linear regime for amplitudes
below the limit, and a constant output for amplitudes above the
limit. Considering the numbers above, a maximum output of 2200
would be provided, for a total number of 2200 PEs or more. This
modifier inherits from FunctionModifier.
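A Java sketch of this piecewise response follows. The value of the constant output is an assumption: the sketch uses gain times limit so that the response is continuous, which coincides with the 2200 quoted above when the gain is 1. The class name is hypothetical.

```java
final class SaturationSketch {
    /** Linear response below the limit, constant output at and above it. */
    static double transformEnergy(double amplitude, double gain, double limit) {
        // Assumption: the plateau is gain * limit, keeping the response continuous.
        return amplitude < limit ? gain * amplitude : gain * limit;
    }

    public static void main(String[] args) {
        for (double pe : new double[] {100.0, 2200.0, 5000.0}) {
            System.out.println(pe + " -> " + transformEnergy(pe, 1.0, 2200.0));
        }
    }
}
```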
- SmearedGain
# Two parameters: (1) nominal gain, and (2) width of the gain smearing
HBlightYield SmearedGain 10000000 0
The simplest non-trivial function-based modifier.
It models a smeared linear transformation on energy. If the nominal gain is set to 1 without smearing (zero width),
SmearedGain does not alter the
input values, thus acting like an identity transformation.
- SmearedTiming
# Two parameters: (1) nominal factor on timing, and (2) width of the smearing on the first parameter
HBtimeSmear SmearedTiming 1000000 10000
Very similar to the SmearedGain modifier,
but it applies the smeared linear transformation on the timing
instead. If the nominal factor on timing is set to 1 without
smearing (zero width), SmearedTiming does not alter the
input values, thus acting like an identity transformation on the hit
timing.
Real-life digitization: creating new
modifiers
In order to be properly controlled by the Digitizer, all modifiers
should inherit the interface from the abstract class CalHitModifier,
implementing the init(), processEvent(), print() and newInstance()
methods, see figure below.
Figure 4 - Creating new modifiers. The member functions in red
are the ones which need to be implemented by the new modifiers.
Please note that this
figure is somewhat obsolete: transformTime(hit) has been deprecated,
and a current subclass of FunctionModifier is SmearedGain.
In spite of their simplicity, several of the typical effects from the
digitization processes can be represented quite well by appropriate
configuration of one of these very simple-minded modifiers.
Examples are uniform inefficiencies of say
(97.8 +/- 0.5)%, or zero-suppression say below (100 +/- 2) ADC
counts. Both cases can be represented well by an instance of the
existing GainDiscrimination
modifier. The first one can also be modeled by the simpler
SmearedGain modifier.
At the next step of increasing complexity, one anticipates that some
other effects, like charge saturation or signal integration, can be
realized with function calls, like:
double smearedEnergy = transformEnergy(aHit);
Such modifiers can be created very easily, e.g. by
inheriting from FunctionModifier and implementing the transformEnergy() function
shown in red in Fig.4, or by copying the code
from the SmearedTiming modifier,
changing the class name appropriately and modifying the transformTime() method.
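As an illustration of the function-based approach, the Java sketch below defines a small abstract base class and one subclass that models a smooth saturation curve. The names FunctionModifierSketch and SmoothSaturationSketch are hypothetical, the base class works on plain doubles instead of TempCalHit objects to stay self-contained, and the saturation formula is a common textbook form chosen for this example, not something taken from DigiSim.

```java
/** Hypothetical stand-in for a function-based modifier: one energy in, one energy out. */
abstract class FunctionModifierSketch {
    abstract double transformEnergy(double energy);

    /** Overwrites each amplitude with its transformed value. */
    final void processHits(double[] energies) {
        for (int i = 0; i < energies.length; i++) {
            energies[i] = transformEnergy(energies[i]);
        }
    }
}

/** Illustrative smooth saturation: out = nPixels * (1 - exp(-in / nPixels)). */
final class SmoothSaturationSketch extends FunctionModifierSketch {
    private final double nPixels;

    SmoothSaturationSketch(double nPixels) {
        this.nPixels = nPixels;
    }

    @Override
    double transformEnergy(double energy) {
        return nPixels * (1.0 - Math.exp(-energy / nPixels));
    }

    public static void main(String[] args) {
        double[] energies = {10.0, 1000.0, 100000.0};
        new SmoothSaturationSketch(2200.0).processHits(energies);
        for (double e : energies) {
            System.out.println(e);
        }
    }
}
```

Only transformEnergy() changes from one such modifier to the next; the loop over hits stays in the base class.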
Another set of effects would fall in the next level of complexity, like
cell ganging, random noise or crosstalk. These effects
typically require external
information, like cell neighborhood, which is not readily available in
the hit itself. Geometry-dependent modifiers have been developed to model
crosstalk, exponential and gaussian noise, while keeping
the geometry-dependent processing isolated into reusable geometry-aware
classes like CellSelector.
In general, such modifiers should inherit directly from CalHitModifier, and implement
its virtual methods init(),
processHits(), print() and newInstance(). See the
Crosstalk modifier for a good example of a modifier which depends on
external classes.
Analysis (Java)
Some simple Java analysis code is available in the subdirectory java of the C++ version.
It was used to
demonstrate that the modifiers are doing what they are supposed to
do. Please find usage instructions in the README file.
References
- LCIO web page: http://lcio.desy.de
- Marlin web page: http://ilcsoft.desy.de/marlin
- LCSim web page: http://www.lcsim.org
- Java web page: http://java.sun.com
- Maven web page: http://maven.apache.org
- Apache commons-math web page: http://jakarta.apache.org/commons/math