CERN Studentship 2016
Summer Student Report
https://cds.cern.ch/record/2211157
TMVA Development Report | |
Feature | Description |
Restructured MethodBase class | Removed static variables |
Restructured Factory class | Removed static variables and added a new constructor that does not require a global file to save results. (See example below) |
New mode ModelPersistence | Added a new option to TMVA::Factory to avoid saving trained methods in XML files. |
DataLoader Copy | Added a new function TMVA::DataLoaderCopy to copy a DataLoader; a copy can also be made by calling the method MakeCopy. (See example below) |
VariableTransform | Created a header VariableTransform.h and a function CreateVariableTransforms, so that a method can be implemented in DataLoader (GSoC student). |
Restructured TMVA::RootFinder class | It now finds the root of the virtual method GetValueForRoot, a better design that removes a static variable in MethodBase. |
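The RootFinder change listed above can be illustrated with a minimal, self-contained sketch. It is not the TMVA source: the names RootTarget, SimpleRootFinder and ShiftedCube are hypothetical, and plain bisection is used only for brevity, so it is not necessarily the algorithm TMVA uses. The point is only the design: the finder calls a virtual GetValueForRoot on an object, so all state lives in that object and no static variable is needed.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical interface: a class whose root is searched exposes this virtual method.
class RootTarget {
public:
    virtual ~RootTarget() = default;
    virtual double GetValueForRoot(double x) = 0;
};

// Hypothetical stand-in for a root finder. It uses plain bisection,
// which is an assumption of this sketch, not a statement about TMVA.
class SimpleRootFinder {
public:
    SimpleRootFinder(RootTarget* target, double lo, double hi,
                     int maxIter = 200, double absTol = 1e-10)
        : fTarget(target), fLo(lo), fHi(hi), fMaxIter(maxIter), fAbsTol(absTol)
    {
        assert(target != nullptr);
    }

    // Returns x in [lo, hi] such that GetValueForRoot(x) equals refValue
    // up to the bracketing tolerance.
    double Root(double refValue) const
    {
        double a = fLo, b = fHi;
        double fa = fTarget->GetValueForRoot(a) - refValue;
        double fb = fTarget->GetValueForRoot(b) - refValue;
        if (fa == 0.0) return a;
        if (fb == 0.0) return b;
        if (fa * fb > 0.0) throw std::runtime_error("root is not bracketed");

        for (int i = 0; i < fMaxIter && (b - a) > fAbsTol; ++i) {
            const double m  = 0.5 * (a + b);
            const double fm = fTarget->GetValueForRoot(m) - refValue;
            if (fm == 0.0) return m;
            if (fa * fm < 0.0) { b = m; fb = fm; }   // root lies in [a, m]
            else               { a = m; fa = fm; }   // root lies in [m, b]
        }
        return 0.5 * (a + b);
    }

private:
    RootTarget* fTarget;  // not owned
    double fLo, fHi;
    int    fMaxIter;
    double fAbsTol;
};

// Example target: the parameter is a data member, not a static variable,
// so several instances can be searched independently.
class ShiftedCube : public RootTarget {
public:
    explicit ShiftedCube(double shift) : fShift(shift) {}
    double GetValueForRoot(double x) override { return x * x * x - fShift; }

private:
    double fShift;
};
```

With this layout, something like `ShiftedCube cube(8.0); SimpleRootFinder finder(&cube, 0.0, 5.0); finder.Root(0.0);` searches for the point where the cube's value crosses zero, and a second target with a different shift can be searched at the same time without the two interfering.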
Classes serialized to support DataLoader Serialization
DataLoader Serialization | |
Classes Required | Status |
TMVA::MsgLogger | DONE |
TMVA::DataInputHandler | DONE |
TMVA::Event | DONE |
TMVA::Results | DONE NOTE: this is an abstract base class (needed for the next three classes); *fHistAlias still needs a look |
TMVA::ResultsClassification | DONE |
TMVA::ResultsMulticlass | DONE |
TMVA::ResultsRegression | DONE |
TMVA::DataSet | DONE NOTE: Results is ignored in the serialization |
TMVA::TreeInfo | DONE |
TMVA::VariableInfo | DONE NOTE: ignored void pointer |
TMVA::ClassInfo | DONE |
TMVA::DataSetFactory | DONE |
TMVA::DataSetManager | DONE |
TMVA::DataSetInfo | DONE NOTE: *fTargetsForMulticlass still needs a look |
TMVA::OptionBase | DONE NOTE: this is an abstract base class (needed for the next class) |
TMVA::Option | DONE |
Example code using TMVA::Factory without a global output file and without model persistence:
// Classification example
#include <iostream>

#include "TCut.h"
#include "TFile.h"
#include "TTree.h"
#include "TMVA/DataLoader.h"
#include "TMVA/Factory.h"
#include "TMVA/Tools.h"
#include "TMVA/Types.h"

int tmva()
{
   TMVA::Tools::Instance();

   // Open the input file and stop early if it cannot be read
   TFile *input = TFile::Open("http://root.cern.ch/files/tmva_class_example.root");
   if (!input || input->IsZombie()) {
      std::cerr << "--- TMVAClassification : Could not open input file" << std::endl;
      return 1;
   }
   std::cout << "--- TMVAClassification : Using input file: " << input->GetName() << std::endl;

   TTree *signal     = (TTree*)input->Get("TreeS");
   TTree *background = (TTree*)input->Get("TreeB");

   // No output file is passed to the constructor, and !ModelPersistence
   // disables writing the trained methods to XML files
   TMVA::Factory *factory = new TMVA::Factory("TMVAClassification",
      "!V:!ModelPersistence:!Silent:Color:DrawProgressBar:AnalysisType=Classification");

   // Input variables and spectators
   TMVA::DataLoader *dataloader = new TMVA::DataLoader("dataset");
   dataloader->AddVariable("myvar1 := var1+var2", 'F');
   dataloader->AddVariable("myvar2 := var1-var2", "Expression 2", "", 'F');
   dataloader->AddVariable("var3", "Variable 3", "units", 'F');
   dataloader->AddVariable("var4", "Variable 4", "units", 'F');
   dataloader->AddSpectator("spec1 := var1*2", "Spectator 1", "units", 'F');
   dataloader->AddSpectator("spec2 := var1*3", "Spectator 2", "units", 'F');

   // Global event weights per tree
   Double_t signalWeight     = 1.0;
   Double_t backgroundWeight = 1.0;
   dataloader->AddSignalTree    (signal,     signalWeight);
   dataloader->AddBackgroundTree(background, backgroundWeight);
   dataloader->SetBackgroundWeightExpression("weight");

   // Optional selection cuts on signal and background
   TCut mycuts = ""; // for example: TCut mycuts = "abs(var1)<0.5 && abs(var2-0.5)<1";
   TCut mycutb = ""; // for example: TCut mycutb = "abs(var1)<0.5";
   dataloader->PrepareTrainingAndTestTree(mycuts, mycutb,
      "nTrain_Signal=1000:nTrain_Background=1000:SplitMode=Random:NormMode=NumEvents:!V");

   factory->BookMethod(dataloader, TMVA::Types::kBDT, "BDT",
      "!H:!V:NTrees=850:MinNodeSize=2.5%:MaxDepth=3:BoostType=AdaBoost:AdaBoostBeta=0.5:UseBaggedBoost:BaggedSampleFraction=0.5:SeparationType=GiniIndex:nCuts=20");

   // Train MVAs using the set of training events
   factory->TrainAllMethods();
   // Evaluate all MVAs using the set of test events
   factory->TestAllMethods();
   // Evaluate and compare performance of all configured MVAs
   factory->EvaluateAllMethods();

   delete factory;
   delete dataloader;
   return 0;
}