CERN Studentship 2016

Summer Student Report

TMVA Development Report
Feature | Description
Restructured MethodBase class | Removed static variables.
Restructured Factory class | Removed static variables and added a new constructor that does not require a global file to save results. (See example below)
New mode ModelPersistence | Added a new option to TMVA::Factory to avoid saving trained methods in XML files.
DataLoader Copy | Added a new function TMVA::DataLoaderCopy that copies a DataLoader; a copy can also be made by calling the method MakeCopy. (See example below)
VariableTransform | Created a header VariableTransform.h and a function CreateVariableTransforms; this can be used to implement a method in DataLoader (GSoC student).
Restructured TMVA::RootFinder class | It can now find the root of the virtual method GetValueForRoot; this is a better design that removes the static variable in MethodBase.

Classes serialized to support DataLoader Serialization

DataLoader Serialization
Required class | Status | Note
TMVA::MsgLogger | DONE | -
TMVA::DataInputHandler | DONE | -
TMVA::Results | DONE | Abstract base class (needed for the next three classes); *fHistAlias still needs to be reviewed.
TMVA::ResultsClassification | DONE | -
TMVA::ResultsMulticlass | DONE | -
TMVA::ResultsRegression | DONE | -
TMVA::DataSet | DONE | Results is ignored in the serialization.
TMVA::VariableInfo | DONE | The void pointer is ignored.
TMVA::ClassInfo | DONE | -
TMVA::DataInputHandler | DONE | -
TMVA::DataSetFactory | DONE | -
TMVA::DataSetManager | DONE | -
TMVA::DataSetInfo | DONE | *fTargetsForMulticlass still needs to be reviewed.
TMVA::OptionBase | DONE | Abstract base class (needed for the next class).

Example code that uses TMVA::Factory without a global output file and without model persistence.

#include <iostream>

#include "TCut.h"
#include "TFile.h"
#include "TTree.h"
#include "TMVA/DataLoader.h"
#include "TMVA/Factory.h"
#include "TMVA/Types.h"

int tmva( )
{
   // Path of the input ROOT file containing TreeS and TreeB (left empty here)
   TFile *input = TFile::Open( "" );
   if (!input || input->IsZombie()) {
      std::cerr << "--- TMVAClassification       : Could not open input file" << std::endl;
      return 1;
   }
   std::cout << "--- TMVAClassification       : Using input file: " << input->GetName() << std::endl;
   TTree *signal     = (TTree*)input->Get("TreeS");
   TTree *background = (TTree*)input->Get("TreeB");

   // No output TFile is passed to the constructor; !ModelPersistence
   // disables saving the trained methods to XML files
   TMVA::Factory *factory = new TMVA::Factory( "TMVAClassification",
      "!V:!ModelPersistence:!Silent:Color:DrawProgressBar:AnalysisType=Classification" );

   TMVA::DataLoader *dataloader = new TMVA::DataLoader("dataset");

   // Input variables used for the training
   dataloader->AddVariable( "myvar1 := var1+var2", 'F' );
   dataloader->AddVariable( "myvar2 := var1-var2", "Expression 2", "", 'F' );
   dataloader->AddVariable( "var3",                "Variable 3", "units", 'F' );
   dataloader->AddVariable( "var4",                "Variable 4", "units", 'F' );

   // Spectator variables are not used in the training
   dataloader->AddSpectator( "spec1 := var1*2",  "Spectator 1", "units", 'F' );
   dataloader->AddSpectator( "spec2 := var1*3",  "Spectator 2", "units", 'F' );

   // Global event weights per tree
   Double_t signalWeight     = 1.0;
   Double_t backgroundWeight = 1.0;
   dataloader->AddSignalTree    ( signal,     signalWeight     );
   dataloader->AddBackgroundTree( background, backgroundWeight );
   dataloader->SetBackgroundWeightExpression( "weight" );

   // Optional selection cuts on signal and background
   TCut mycuts = ""; // for example: TCut mycuts = "abs(var1)<0.5 && abs(var2-0.5)<1";
   TCut mycutb = ""; // for example: TCut mycutb = "abs(var1)<0.5";

   dataloader->PrepareTrainingAndTestTree( mycuts, mycutb,
      "nTrain_Signal=1000:nTrain_Background=1000:SplitMode=Random:NormMode=NumEvents:!V" );

   factory->BookMethod( dataloader, TMVA::Types::kBDT, "BDT",
      "!H:!V:NTrees=850:MinNodeSize=2.5%:MaxDepth=3:BoostType=AdaBoost:AdaBoostBeta=0.5:UseBaggedBoost:BaggedSampleFraction=0.5:SeparationType=GiniIndex:nCuts=20" );

   // Train MVAs using the set of training events
   factory->TrainAllMethods();

   // ---- Evaluate all MVAs using the set of test events
   factory->TestAllMethods();

   // ----- Evaluate and compare performance of all configured MVAs
   factory->EvaluateAllMethods();

   // --------------------------------------------------------------
   delete factory;
   delete dataloader;

   return 0;
}