Higgs to W electrons studies

Saturday, September 29, 2007

The recipe vanilla (160)

This is the recipe to run the H->WW analysis in CMSSW_1_6_0.

Test Release based on: CMSSW_1_6_0
Base Release in: /afs/cern.ch/cms/sw/slc4_ia32_gcc345/cms/cmssw/CMSSW_1_6_0
Your Test release in: /afs/cern.ch/user/e/emanuele/scratch0/higgs160
--- Tag --- -------- Package --------
V00-01-01 CondCore/EgammaPlugins
V02-06-02 CondFormats/DataRecord
V00-01-00 CondFormats/EgammaObjects
V00-05-00 DataFormats/EgammaReco
V00-00-04 EgammaAnalysis/ElectronIDAlgos
V00-00-01 EgammaAnalysis/ElectronIDESSources
NoTag HiggsAnalysis/HiggsToWW2e
shr-8aug07 PhysicsTools/StatPatternRecognition
V00-05-07 RecoEcal/EgammaClusterProducers
V00-03-00 RecoEcal/EgammaCoreTools
---------------------------------------
total packages: 10

The SQLite DB file is only available in releases later than 170_pre5, so use a local file instead. In EgammaAnalysis/ElectronIDESSources/data/likelihoodPdfsDB.cfi, change:

replace CondDBCommon.connect = "sqlite_file:/afs/cern.ch/user/e/emanuele/scratch0/vanilla131HLT6/src/EgammaAnalysis/ElectronIDAlgos/test/electronlikelihood.db"
replace CondDBCommon.catalog = "file:/afs/cern.ch/user/e/emanuele/scratch0/vanilla131HLT6/src/EgammaAnalysis/ElectronIDAlgos/test/mycatalog.xml"

to:

replace CondDBCommon.connect = "sqlite_file:/afs/cern.ch/user/e/emanuele/public/electronLikelihoodDb160/electronIdLikelihood.db"
replace CondDBCommon.catalog = "file:/afs/cern.ch/user/e/emanuele/public/electronLikelihoodDb160/likelihoodcatalog.xml"

Refer to the previous post for how to run in batch and with CRAB.

emanuele

Thursday, August 9, 2007

The recipe vanilla (131HLT6)

# to get it working in 131_HLT6 #
scramv1 p -n vanilla131_HLT6 CMSSW CMSSW_1_3_1_HLT6
cd vanilla131_HLT6/src
eval `scramv1 ru -csh`
project CMSSW

# the code: check out the following packages:
--- Tag --- -------- Package --------
V00-01-00 CondCore/EgammaPlugins
V00-01-10 CondFormats/BTauObjects
V02-06-02 CondFormats/DataRecord
V00-01-00 CondFormats/EgammaObjects
V00-05-00 DataFormats/EgammaReco
V00-00-04 EgammaAnalysis/ElectronIDAlgos
V00-00-01 EgammaAnalysis/ElectronIDESSources
V00-00-01 HiggsAnalysis/HiggsToWW2e
shr-8aug07 PhysicsTools/StatPatternRecognition
V00-03-04 RecoEcal/EgammaClusterAlgos
V00-05-07 RecoEcal/EgammaClusterProducers
V00-03-00 RecoEcal/EgammaCoreTools
V00-05-06 SimCalorimetry/CaloSimAlgos
---------------------------------------
cvs co -r V00-05-01 RecoEcal/EgammaClusterProducers/data/geometryForClustering.cff

# Claude's bug fixes for 131:
cp /afs/cern.ch/user/c/charlot/scratch0/CMSSW_1_3_1_bugfix/src/Egamma131Fixes.tar.gz .
tar xzvf Egamma131Fixes.tar.gz

# Chiara's fix for NAN in particle time of flight
addpkg SimCalorimetry/CaloSimAlgos
edit SimCalorimetry/CaloSimAlgos/src/CaloHitResponse.cc
and add the following check in 'run' (see the HEAD version for where to put it, if needed):
// check the hit time makes sense
if ( isnan(((*hitItr).time())) ) { continue; }

# building
scramv1 b

# running interactively
cd HiggsAnalysis/HiggsToWW2e/test/
cmsRun HWW2eAnalysis.cfg

# running in batch
HiggsAnalysis/HiggsToWW2e/scripts/run.pl -h

# running with CRAB
see howto here:
http://welectrons.blogspot.com/2007/06/per-usare-crab.html

Saturday, August 4, 2007

The strange case of the crash in HLTX

Hi all,

the old recipes for setting up the analysis code crashed at runtime with a segmentation fault
in 131_HLTX releases, because the versions of DataFormats and RecoEcal/EgammaCoreTools differ from those used in 131.
I have now updated the recipes.
The new recipe for 131_HLTX releases, including Claude's bug fixes for electrons, is:

# set up the release
scramv1 project -n HtoWW131_HLT6 CMSSW CMSSW_1_3_1_HLT6
cd HtoWW131_HLT6/src
project CMSSW

# download the core code
cvs co HiggsAnalysis/HiggsToWW2e

# fix for NAN in particle time of flight
cvs co -r CMSSW_1_3_1_HLT6 SimCalorimetry/CaloSimAlgos

edit SimCalorimetry/CaloSimAlgos/src/CaloHitResponse.cc
and add the following check in 'run' (see the HEAD version for where to put it, if needed):

// check the hit time makes sense
if ( isnan(((*hitItr).time())) ) { continue; }

# to use new variables for shower shape:
source cvs_setup_pietro
cvs co -r edm-HLT6 DataFormats/EgammaReco
cvs co -r edm-HLT6 RecoEcal/EgammaCoreTools
cvs co -r edm-HLT6 PhysicsTools

# this is necessary in HLTX releases
project CMSSW
cvs co -r CMSSW_1_3_1_HLT6 RecoEcal/EgammaClusterProducers

# and finally download the likelihood code:
source cvs_setup_pietro
cvs co HtoWWElectrons/HtoWWLHAlgo
cvs co HtoWWElectrons/HtoWWLHESSource
cvs co HtoWWElectrons/HtoWWLHRecord

1) edit HiggsAnalysis/HiggsToWW2e/src/CmsEleIDTreeFiller.cc
uncomment
// #include "HtoWWElectrons/HtoWWLHRecord/interface/PidLHElectronRecord.h"
// #include "HtoWWElectrons/HtoWWLHAlgo/interface/PidLHElectronAlgo.h"
// #include "HtoWWElectrons/HtoWWLHESSource/interface/PidLHElectronESSource.h"
and uncomment
// edm::ESHandle likelihood;
// iSetup.getData( likelihood );
// privateData_->eleLik->push_back(likelihood->getLHRatio (electron,iEvent,iSetup) );
and comment out
privateData_->eleLik->push_back( -1. );

2) edit HiggsAnalysis/HiggsToWW2e/src/CmsTreeFiller.cc
and uncomment
//privateData_->lat->push_back(sClShape->lat());
and comment out
privateData_->lat->push_back(-1.);

uncomment
// privateData_->a00->push_back(sClShape->zernike00());
// privateData_->a11->push_back(sClShape->zernike11());
// privateData_->a20->push_back(sClShape->zernike20());
// privateData_->a22->push_back(sClShape->zernike22());
// privateData_->a42->push_back(sClShape->zernike42());
and comment out
privateData_->a00->push_back(-1.);
privateData_->a11->push_back(-1.);
privateData_->a20->push_back(-1.);
privateData_->a22->push_back(-1.);
privateData_->a42->push_back(-1.);

3) edit HiggsAnalysis/HiggsToWW2e/test/HWW2eAnalysis.cfg
and uncomment
# include "HtoWWElectrons/HtoWWLHESSource/data/configuration.cfi"

4) add the likelihood libraries to HiggsAnalysis/HiggsToWW2e/BuildFile:
cp ~emanuele/public/BuildFile HiggsAnalysis/HiggsToWW2e/BuildFile

# other bug fixes in 131 (by Claude)
cp /afs/cern.ch/user/c/charlot/scratch0/CMSSW_1_3_1_bugfix/src/Egamma131Fixes.tar.gz .
tar xzvf Egamma131Fixes.tar.gz

Sunday, July 22, 2007

LAT continue

Saturday, July 21, 2007

etaLAT test

Dear All,

after running on a sample of electrons at pt 35 as signal and on QCD samples as background (pt_20-30 + pt_30-50, weighted by their cross sections), I tried to calculate the best possible selection on the normalized distributions, for the lateral moment LAT and for the eta and phi lateral moments.
Results are summarized as follows:

variable	selection	signal/sqrt(bkg)	eff signal	eff bkg
LAT	0.285	1.38391	0.673745	0.237015
eta LAT	0.145	1.52133	0.759894	0.249493
phi LAT	0.185	1.14436	0.63236	0.305356
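
The signal/sqrt(bkg) column in the table equals eff signal divided by sqrt(eff bkg), as a quick check of the numbers shows. A scan of this kind could be sketched as below. This is only a minimal sketch: the function names are made up for illustration, and it assumes that the selection keeps candidates with a value below the cut, which is not stated in the post.

```python
import math

def efficiency(values, cut):
    """Fraction of entries passing the (assumed) requirement value < cut."""
    return sum(1 for v in values if v < cut) / len(values)

def figure_of_merit(eff_sig, eff_bkg):
    """signal/sqrt(bkg) computed from efficiencies on normalized distributions."""
    return eff_sig / math.sqrt(eff_bkg)

def best_cut(signal, background, cuts):
    """Scan candidate cuts and return (cut, fom, eff_sig, eff_bkg) for the largest fom."""
    best = None
    for cut in cuts:
        eff_s = efficiency(signal, cut)
        eff_b = efficiency(background, cut)
        if eff_b == 0.0:
            continue  # figure of merit is undefined when no background passes
        fom = figure_of_merit(eff_s, eff_b)
        if best is None or fom > best[1]:
            best = (cut, fom, eff_s, eff_b)
    return best

# Cross-check of the LAT row of the table: eff signal and eff bkg in, figure of merit out
print(figure_of_merit(0.673745, 0.237015))
```

Here `signal` and `background` would be plain lists of the variable (LAT, eta LAT or phi LAT) for each sample; cross-section weights for the QCD samples are left out to keep the sketch short.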

official CMSSW repository filling

Dear All,

Emanuele and I committed part of the analysis code to the CMSSW HiggsAnalysis/HiggsToWW2e repository.
The likelihood implementation is missing, since it is going to be integrated into the EgammaAnalysis package; nevertheless, the dumper already contains some commented-out lines waiting for the likelihood information.
The code compiles and has been tested in the CMSSW_1_3_1_HLT6 release; for now it only lives in the HEAD of the repository.
The need for a custom ObjectSelector has been temporarily solved by adding it to the plugins folder.

Emanuele and Pietro

Wednesday, July 11, 2007

bug fix for hits NAN: correction

Ciao,
in the post with the list of things to do to set up the machinery I wrote

# fix for NAN in particle time of flight
cvs co -r CMSSW_1_3_1 SimCalorimetry/CaloSimAlgos
edit SimCalorimetry/CaloSimAlgos/src/CaloHitResponse.cc
and add the following check in 'run' (see the HEAD version for where to put it, if needed):
// check the hit time makes sense
if ( isnan(((*hitItr).time())) ) { continue; }

This is OK, but the HEAD version of the package cannot be used, otherwise there are problems with the calo digitization. The checkout has to be done from HLT5; the HEAD can only be taken as a reference to see where the fix has to be put.

Chiara

Tuesday, July 10, 2007

Tag and Probe with Z/W as a control sample for likelihood

Hi Carole and Ozana,

this is just a brief message to outline the tag and probe strategy and its specific use for electron identification.

We have developed an algorithm which, given an object reconstructed as an "electron" in the CMS detector, returns the probability for it to be a real electron or a fake (typically pions or kaons inside a QCD jet).
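
To make the idea concrete, here is a minimal sketch of a likelihood-ratio discriminant. It assumes that the likelihood of each class is a product of one-dimensional PDFs of the discriminating variables and that both classes get the same prior weight. The function name and the PDFs passed in are hypothetical and are not taken from the actual implementation.

```python
import math

def electron_likelihood_ratio(values, electron_pdfs, fake_pdfs):
    """Return L_ele / (L_ele + L_fake) for one candidate.

    values        : measured discriminating variables of the candidate
    electron_pdfs : one callable per variable, PDF under the electron hypothesis
    fake_pdfs     : one callable per variable, PDF under the fake hypothesis
    Each likelihood is taken as the product of the per-variable PDFs
    (an assumption that ignores correlations between the variables).
    """
    l_ele = math.prod(pdf(x) for pdf, x in zip(electron_pdfs, values))
    l_fake = math.prod(pdf(x) for pdf, x in zip(fake_pdfs, values))
    total = l_ele + l_fake
    # If both likelihoods vanish the candidate carries no information
    return l_ele / total if total > 0.0 else 0.5
```

A value close to 1 means the candidate looks like an electron, a value close to 0 means it looks like a fake.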

This algorithm has been tuned and studied on MC samples of "pure" electrons and of "pure" QCD jets used as background.
We need a working strategy to check the setup and the performance of this algorithm with data (when we have them). So you need to find what is called a "control sample", i.e. a high-purity data sample made of electrons (for the signal category) or pions (for the background category), on which you check that the output of your algorithm is what you expect from the Monte Carlo simulation.

While it is quite easy to get a control sample for QCD jets (you have a lot of them from minimum bias events, which contain a negligible fraction of leptons), it is more difficult to find a pure sample of electrons.
A method called tag & probe has been studied for this.
It is based on the fact that a lot of Z bosons decaying to e+e- will be produced, which gives a lot of electrons even in the early stage of the experiment.
Since you have two electrons, you define your signal (an electron) as follows:

a) you look for a well-reconstructed electron which satisfies a number of quality requirements.
You define this electron as the "tag".
b) then you simply look for a cluster in the electromagnetic calorimeter, and this is your
"probe" electron.
You don't apply any electron identification requirement to the probe, because it is what you want to test the likelihood algorithm on.
The only requirement on the probe is uncorrelated with the "electron properties" of the e.m. cluster: you require that its invariant mass, when combined with the "tag" electron, is consistent with the Z mass.
This should reduce the background a lot.
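
The invariant mass requirement of step b) could be sketched as follows. This is only an illustration: the function names, the (pt, eta, phi) tuples and the 10 GeV half-width of the mass window are assumptions, not the selection used in the analysis. Both objects are treated as massless, which is a good approximation for electrons at these energies.

```python
import math

Z_MASS = 91.19  # GeV, approximate nominal Z boson mass

def invariant_mass(pt1, eta1, phi1, pt2, eta2, phi2):
    """Invariant mass of two massless objects from their pt, eta and phi."""
    m2 = 2.0 * pt1 * pt2 * (math.cosh(eta1 - eta2) - math.cos(phi1 - phi2))
    return math.sqrt(max(m2, 0.0))  # guard against tiny negative rounding errors

def select_probes(tag, clusters, window=10.0):
    """Keep the clusters whose mass with the tag lies within `window` GeV of the Z mass.

    tag and each cluster are (pt, eta, phi) tuples; `window` is an assumed value.
    """
    return [c for c in clusters
            if abs(invariant_mass(*tag, *c) - Z_MASS) < window]

# Example: a tag and two clusters, only the first one is compatible with a Z decay
tag = (45.6, 0.0, 0.0)
clusters = [(45.6, 0.0, math.pi), (10.0, 0.0, math.pi)]
probes = select_probes(tag, clusters)
```

No electron identification variable enters the probe selection, which is the point of the method.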

What you could do is apply the recently developed likelihood algorithm to the "probe" candidates and look at its performance.
This is a fundamental test because it is done on data, so it does not rely on the Monte Carlo simulation, which might not reproduce the data perfectly.

There are people already working on this, and they have defined the criteria to select the "tag" and "probe" electron candidates.
Take a look at this presentation:

http://indico.cern.ch/getFile.py/access?contribId=5&resId=1&materialId=slides&confId=12396

You could try to reproduce their selections to define the "tag" & "probe" samples.

A further step could then be the following.
After selecting the "probe" sample, you hopefully have a very high purity. But there will still be background events which smear the distributions of the discriminating variables used as inputs to the likelihood algorithm.

One idea could be to do a background subtraction based on the di-electron invariant mass.
This is a statistical technique, and it could be done in different ways. A smart way could be to fit the di-electron invariant mass with a model for the signal and a model for the combinatorial background. This provides an event-by-event probability for the event to be "signal". You could then weight each event by its probability to be signal, which automatically gives a "background-subtracted" distribution for the discriminating variables.
A reference for this is:

http://arxiv.org/abs/physics/0402083
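
The per-event weighting described above could be sketched like this. It is a toy under stated assumptions: a Gaussian signal model, a flat combinatorial background, and yields n_sig and n_bkg that would come from a fit of the mass spectrum (the fit itself is not shown). All names and default parameter values are illustrative. The reference above may define the weights in a more refined way; this sketch only implements the simple signal-probability weight.

```python
import math

def gaussian(m, mean, sigma):
    """Normalized Gaussian, used here as a toy signal mass model."""
    return math.exp(-0.5 * ((m - mean) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def flat(m, m_min, m_max):
    """Normalized flat distribution, used here as a toy combinatorial background model."""
    return 1.0 / (m_max - m_min) if m_min <= m <= m_max else 0.0

def signal_weight(m, n_sig, n_bkg, mean=91.19, sigma=3.0, m_min=60.0, m_max=120.0):
    """Probability that an event with di-electron mass m is signal."""
    s = n_sig * gaussian(m, mean, sigma)
    b = n_bkg * flat(m, m_min, m_max)
    return s / (s + b) if (s + b) > 0.0 else 0.0

def weighted_histogram(values, weights, edges):
    """Histogram of a discriminating variable, each event entering with its weight."""
    counts = [0.0] * (len(edges) - 1)
    for v, w in zip(values, weights):
        for i in range(len(edges) - 1):
            if edges[i] <= v < edges[i + 1]:
                counts[i] += w
                break
    return counts
```

Filling the histogram of a discriminating variable with these weights gives the "background-subtracted" shape, provided the variable is not correlated with the di-electron mass.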

One target for us could be to set up an automatic tool to do this. It is complicated by the fact that
we have to do things for the different classes, for barrel and endcap, for different pt's, etc.
For spin reasons, for example, you find very few Z decay events at eta~0 and low pt,
so there are a number of challenges to face.
You might enjoy them ;)

We will talk about this when we are all together at CERN (in the next few days).
Ciao!

emanuele

ElectronID and Likelihood

A preliminary version of the likelihood algorithm, consistent with the electron ID framework, is in my public folder. The algorithmic part is in

~govoni/public/EgammaAnalysis_V7.tar.gz

and the plugins that use it are in

~govoni/public/HtoWWElectrons_V7.tar.gz

Together with Emanuele, we added the trait needed to read the ESSource directly with the ObjectSelector.

Monday, July 9, 2007

How to set up the analysis machinery

Hi all!

together with Emanuele, we tried to summarize the steps needed to set up the machinery for the analysis. Here is a (hopefully) working recipe:

scramv1 project CMSSW CMSSW_1_3_1_HLT5
cd CMSSW_1_3_1_HLT5/src
eval `scramv1 ru -csh`
project CMSSW

# extra tags for the trigger
cvs co -r V00-01-44 HLTrigger/Configuration
cvs co -r V00-00-20-09 L1Trigger/RegionalCaloTrigger
cvs co -r V01-00-14 L1Trigger/L1ExtraFromDigis
cvs co -r V00-00-38 RecoEgamma/EgammaHLTProducers
cvs co -r V00-01-10 Utilities/ReleaseScripts
cvs co -r V00-05-02-02 RecoMuon/L2MuonProducer
cvs co -r V00-00-53 HLTrigger/Egamma
cvs co -r V00-01-50 HLTrigger/Muon
cvs co -r V00-00-87 HLTrigger/btau
cvs co -r V00-00-49 HLTrigger/xchannel
cvs co -r V00-00-07-18 HLTrigger/JetMET
cvs co -r V01-03-26 HLTrigger/HLTcore
cvs co -r V00-01-44 HLTrigger/Configuration
cvs co -r V04-01-00-01 CalibTracker/SiStripConnectivity
cvs co -r V01-00-00-02 CommonTools/SiStripClusterization
cvs co -r V03-04-02 DataFormats/SiStripCluster
cvs co -r V03-05-02-01 DataFormats/SiStripCommon
cvs co -r V01-02-05-00 DataFormats/TrackerRecHit2D
cvs co -r V02-00-00-05 EventFilter/SiStripRawToDigi
cvs co -r V01-04-04-01 RecoLocalTracker/SiStripRecHitConverter
cvs co -r V05-00-40-01 RecoTracker/MeasurementDet
cvs co -r V01-02-01-00 RecoTracker/TransientTrackingRecHit

# bug fix in 131 for electron reconstruction:
cvs co -r CMSSW_1_3_1 DataFormats/EgammaCandidates
edit DataFormats/EgammaCandidates/src/PixelMatchGsfElectron.cc
and replace line 196:
hadOverEm_*=newEnergy/superClusterEnergy_;
with:
hadOverEm_*=superClusterEnergy_/newEnergy;
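
The direction of the ratio can be checked with a small numeric sketch. It assumes that, at that line, superClusterEnergy_ still holds the old energy and that the hadronic energy H itself does not change, so that H/E_new = (H/E_old) * (E_old/E_new). The function name is made up for illustration.

```python
def rescale_had_over_em(had_over_em, old_energy, new_energy):
    """Rescale H/E when the e.m. energy in the denominator changes.

    H is unchanged, so H/E_new = (H/E_old) * (E_old / E_new).
    """
    return had_over_em * old_energy / new_energy

# H = 2 GeV with E_old = 50 GeV gives H/E = 0.04; after correcting the energy
# to E_new = 40 GeV the ratio must become 2/40
corrected = rescale_had_over_em(2.0 / 50.0, 50.0, 40.0)
```

With the original factor newEnergy/superClusterEnergy_ the ratio would move in the wrong direction when the energy changes.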

# fix for NAN in particle time of flight
cvs co -r CMSSW_1_3_1 SimCalorimetry/CaloSimAlgos

edit SimCalorimetry/CaloSimAlgos/src/CaloHitResponse.cc
and add the following check in 'run' (see the HEAD version for where to put it, if needed):

// check the hit time makes sense
if ( isnan(((*hitItr).time())) ) { continue; }

# to use new variables for shower shape:
source cvs_setup_pietro
cvs co -r V01-01 DataFormats/EgammaReco
cvs co -r V01-01 RecoEcal/EgammaCoreTools
cvs co -r V131_HLT5 PhysicsTools

# and finally download the code:
cvs co HtoWWElectrons

Ciao!
Chiara