Dataset used in Guest, Collado, Baldi, Hsu, Urban, Whiteson “Jet Flavor Classification in High-Energy Physics with Deep Neural Networks” Files: dataset.txt Number of classes: 3 Number of features: variable Number of examples: 11491971 Description: Each line is a data sample. Each sample contains variables related to jet kinematics, tracks, vertex and high level features. Each track contains 20 variables and each sample has a variable number of tracks. Each vertex contains 8 variables and each sample has a variable number of vertices. For each sample there are 14 high level variables. The variables can be subdivided into 3 categories; low level, medium level and high level. Each one of these is constructed strictly using information from the previous level so low level < medium level < high level. The low level variables are the tracks + jet kinematics variables The medium level variables are tracks + vertices + jet kinematics variables The high level are high level features + jet kinematics variables. Label variable: 0: Light Jet (Background) 4: Charm Jet (Background) 5: Bottom Jet (Signal) Feature description Each sample looks like this: jet_pt, jet_eta, flavor, {high level track variables}, {high level vertex variables}, [{{track_variables}, {track_covariance}, {track_weight}, {vertex_variables}}, …, }]. High level variables The high level variables are sub-divided as {{high_level_track_variables}, {high_level_vertex_variables}}. Where the {high_level_tracking_variables} are: track_2_d0_significance, track_3_d0_significance, track_2_z0_significance, track_3_z0_significance, n_tracks_over_d0_threshold, jet_prob, jet_width_eta, jet_width_phi and the {high_level_vertex_variables} are: vertex_significance, n_secondary_vertices, n_secondary_vertex_tracks, delta_r_vertex, vertex_mass, vertex_energy_fraction The track and vertex variables are organized as {{track_variables}, {track_covariance}, {track_weight}, {vertex_variables}} The track variables are given by D0, Z0, PHI, THETA, QOVERP Track covariance is given by: D0D0, Z0D0, Z0Z0, PHID0, PHIZ0, PHIPHI, THETAD0, THETAZ0, THETAPHI, THETATHETA, QOVERPD0, QOVERPZ0, QOVERPPHI, QOVERPTHETA, QOVERPQOVERP The track_weight is related to how strongly the track is associated the corresponding vertex, a higher value means the track is a better fit to the vertex. Vertex variables are taken from the vertex that the track is associated to, and can therefore be identical for several tracks. The variables are as follows: mass, displacement, delta_eta_jet, delta_phi_jet, displacement_significance, n_tracks, energy_fraction More information about the variables and the problem can be found at: https://github.com/dguest/delphes-rave/wiki/Output-Format