Search for the rare decay B0 → J/ψϕ

A search for the rare decay B0 → J/ψϕ is performed using pp collision data collected with the LHCb detector at centre-of-mass energies of 7, 8 and 13 TeV, corresponding to an integrated luminosity of 9 fb−1. No significant signal of the decay is observed, and an upper limit on its branching fraction is set at 90% confidence level.

The decay B0 → J/ψK+K− was first observed by the LHCb experiment [1]. (The inclusion of charge-conjugate processes is implied throughout this paper.) It proceeds primarily through the b … from the ϕ meson as an intermediate state. The decay B0 → J/ψϕ is suppressed by the Okubo-Zweig-Iizuka (OZI) rule that forbids disconnected quark diagrams [2][3][4]. The size of this contribution and the exact mechanism to produce the ϕ meson in this process are of particular theoretical interest [5][6][7]. A prediction for the branching fraction of the B0 → J/ψϕ decay is given in Ref. [5], under the assumption that the dominant contribution is via a small dd̄ component in the ϕ wave-function, arising from ω-ϕ mixing (Fig. 1(a)). Contributions to B0 → J/ψϕ decays from the OZI-suppressed tri-gluon fusion (Fig. 1(b)), photoproduction and final-state rescattering are estimated to be at least one order of magnitude lower [7]. Experimental studies of the decay B0 → J/ψϕ could provide important information about the dynamics of OZI-suppressed decays.
No significant signal of the B0 → J/ψϕ decay has been observed in previous searches by several experiments. Upper limits on the branching fraction of the decay have been set by BaBar [8], Belle [9] and LHCb [1]. The LHCb limit was obtained using a data sample corresponding to an integrated luminosity of 1 fb−1 of pp collision data, collected at a centre-of-mass energy of 7 TeV. This paper presents an update on the search for B0 → J/ψϕ decays using a data sample corresponding to an integrated luminosity of 9 fb−1, including 3 fb−1 collected at 7 and 8 TeV, denoted as Run 1, and 6 fb−1 collected at 13 TeV, denoted as Run 2.
The LHCb measurement in Ref. [1] is obtained from an amplitude analysis of B0 → J/ψK+K− decays over a wide m(K+K−) range, from the K+K− mass threshold to 2200 MeV/c2. This paper focuses on the ϕ region, with the K+K− mass in the range 1000-1050 MeV/c2, and on studies of the J/ψK+K− and K+K− mass distributions, to distinguish the B0 → J/ψϕ signal from the nonresonant decay B0 → J/ψK+K− and from background contaminations. The abundant decay B0s → J/ψϕ is used as the normalisation channel.

The choice of mass fits over a full amplitude analysis is motivated by several considerations. The sharp ϕ mass peak provides a clear signal characteristic, and the lineshape can be very well determined using the copious B0s → J/ψϕ decays. In addition, interference of the S-wave (either a0(980)/f0(980) or nonresonant) and P-wave amplitudes vanishes in the m(K+K−) spectrum, up to negligible angular acceptance effects, after integrating over the angular variables. Furthermore, significant correlations observed between m(J/ψK+K−), m(K+K−) and the angular variables make it challenging to describe the mass-dependent angular distributions of both signal and background, which are required for an amplitude analysis. Finally, the power of the amplitude analysis in discriminating the signal from the non-ϕ contribution and background is reduced by the large number of parameters that need to be determined in the fit. A good understanding of the contamination from B0s → J/ψϕ decays in the B0 mass region is also essential in the search for B0 → J/ψϕ.

II. DETECTOR AND SIMULATION
The LHCb detector [10, 11] is a single-arm forward spectrometer covering the pseudorapidity range 2 < η < 5, designed for the study of particles containing b or c quarks. The detector includes a high-precision tracking system consisting of a silicon-strip vertex detector surrounding the pp interaction region, a large-area silicon-strip detector located upstream of a dipole magnet with a bending power of about 4 Tm, and three stations of silicon-strip detectors and straw drift tubes placed downstream of the magnet. The tracking system provides a measurement of the momentum, p, of charged particles with a relative uncertainty that varies from 0.5% at low momentum to 1.0% at 200 GeV/c. The minimum distance of a track to a primary vertex (PV), the impact parameter (IP), is measured with a resolution of (15 + 29/pT) μm, where pT is the component of the momentum transverse to the beam, in GeV/c. Different types of charged hadrons are distinguished using information from two ring-imaging Cherenkov detectors. Photons, electrons and hadrons are identified by a calorimeter system consisting of scintillating-pad and preshower detectors, an electromagnetic and a hadronic calorimeter. Muons are identified by a system composed of alternating layers of iron and multiwire proportional chambers.

Samples of simulated decays are used to optimise the signal candidate selection and to derive the selection efficiency. In the simulation, pp collisions are generated using PYTHIA [12,13] with a specific LHCb configuration [14]. Decays of unstable particles are described by EVTGEN [15], in which final-state radiation is generated using PHOTOS [16]. The interaction of the generated particles with the detector, and its response, are implemented using the GEANT4 toolkit [17,18] as described in Ref. [19].

III. CANDIDATE SELECTION
The online event selection is performed by a trigger, which consists of a hardware stage, based on information from the calorimeter and muon systems, followed by a software stage, which applies a full event reconstruction. An inclusive approach for the hardware trigger is used to maximise the available data sample, as described in Ref. [20]. Since the centre-of-mass energies and trigger thresholds are different for the Run 1 and Run 2 data-taking, the offline selection is performed separately for the two periods, following the procedure described below. The resulting data samples for the two periods are treated separately in the subsequent analysis procedure.
The offline selection comprises two stages. First, a loose selection is used to reconstruct both B0 → J/ψK+K− and B0s → J/ψK+K− candidates in the same way, given their similar kinematics. Two oppositely charged muon candidates are combined to form a J/ψ candidate. The muon pair is required to have a common vertex and an invariant mass, m(μ+μ−), in the range 3020-3170 MeV/c2. A pair of oppositely charged kaon candidates identified by the Cherenkov detectors is combined to form a K+K− candidate. The K+K− pair is required to have an invariant mass, m(K+K−), in the range 1000-1050 MeV/c2. The J/ψ and K+K− candidates are combined to form a B candidate, which is required to have good vertex quality and an invariant mass, m(J/ψK+K−), in the range 5200-5550 MeV/c2. The resulting B candidate is assigned to the PV with which it has the smallest χ2IP, where χ2IP is defined as the difference in the vertex-fit χ2 of a given PV reconstructed with and without the particle being considered. The invariant mass of the B candidate is calculated from a kinematic fit in which the momentum vector of the B candidate is aligned with the vector connecting the PV to the B decay vertex and m(μ+μ−) is constrained to the known J/ψ meson mass [21]. In order to suppress the background due to the random combination of a prompt J/ψ meson and a pair of charged kaons, the decay time of the B candidate is required to be greater than 0.3 ps.
In a second selection stage, a boosted decision tree (BDT) classifier [22,23] is used to further suppress combinatorial background. The BDT classifier is trained using simulated B0s → J/ψϕ decays representing the signal, and candidates with m(J/ψK+K−) in the range 5480-5550 MeV/c2 as background. Candidates in both samples are required to have passed the trigger and the loose selection described above. Using a multivariate technique [24], the simulation sample is corrected to match the observed distributions in background-subtracted data, including those of the pT and pseudorapidity of the B0s, the χ2IP of the B0s decay vertex, the χ2 of the decay chain of the B0s candidate [25], the particle identification variables, the track-fit χ2 of the muon and kaon candidates, and the numbers of tracks measured simultaneously in both the vertex detector and tracking stations.
The input variables of the BDT classifier are the minimum track-fit χ2 of the muons and the kaons, the pT of the B candidate and of the K+K− combination, the χ2 of the B decay vertex, particle identification probabilities for muons and kaons, the minimum χ2IP of the muons and kaons, the χ2 of the J/ψ decay vertex, the χ2IP of the B candidate, and the χ2 of the B decay chain fit. The optimal requirement on the BDT response for the B0 candidates is obtained by maximising a figure of merit that depends on the signal efficiency determined in simulation and on N, the number of candidates found in the region around the known B0 mass [21].
In addition to combinatorial background, the data also contain fake candidates from Λ0b → J/ψpK− (B0 → J/ψK+π−) decays, where the proton (pion) is misidentified as a kaon. To suppress these background sources, a candidate is rejected if its invariant mass, computed with one kaon interpreted as a proton (pion), lies within a narrow window around the known Λ0b (B0) mass [21] and if the kaon candidate also satisfies proton (pion) identification requirements.
A previous study of B0s → J/ψK+K− decays found that the yield of the background from B0 → J/ψK+π− decays is only 0.1% of the B0s signal yield [20]. Furthermore, only 1.2% of these decays, corresponding to about one candidate (three candidates) in the Run 1 (Run 2) data sample, fall in the B0 mass region 5265-5295 MeV/c2, according to simulation. This background is therefore neglected. The fraction of events containing more than one candidate is 0.11% in Run 1 data and 0.70% in Run 2 data, and these events are removed from the total data sample.

The acceptance, trigger, reconstruction and selection efficiencies of the signal and normalisation channels are determined using simulation, which is corrected for the efficiency differences with respect to the data. The ratio of the total efficiencies of B0 → J/ψϕ and B0s → J/ψϕ is estimated to be 0.99 ± 0.03 ± 0.03 for Run 1 and 0.99 ± 0.01 ± 0.02 for Run 2, where the first uncertainties are statistical and the second ones are associated with corrections to the simulation. The polarisation amplitudes are assumed to be the same in B0 → J/ψϕ and B0s → J/ψϕ decays. The systematic uncertainty associated with this assumption is found to be small and is neglected.

IV. MASS FITS
There is a significant correlation between m(J/ψK+K−) and m(K+K−) in B0s → J/ψϕ decays, as illustrated in Fig. 2. Hence, the search for B0 → J/ψϕ decays is carried out by performing sequential fits to the distributions of m(J/ψK+K−) and m(K+K−). A fit to the m(J/ψK+K−) distribution is used to estimate the yields of the background components in the regions around the B0s and B0 nominal masses. A subsequent simultaneous fit to the m(K+K−) distributions of candidates falling in the two m(J/ψK+K−) windows, with the background yields fixed to their values from the first step, is performed to estimate the yield of B0 → J/ψϕ decays.
The probability density function (PDF) for the m(J/ψK+K−) distribution of both the B0 and B0s decays is modelled by the sum of a Hypatia [26] and a Gaussian function sharing the same mean. The fraction, the width ratio between the Hypatia and Gaussian functions and the Hypatia tail parameters are determined from simulation. The shape of the Λ0b → J/ψpK− background is described by a template obtained from simulation, while the combinatorial background is described by an exponential function with the slope left to vary. The PDFs of B0 and B0s decays share the same shape parameters, and the difference between the B0s and B0 masses is constrained to the known mass difference [21]. An unbinned maximum-likelihood fit is performed in the m(J/ψK+K−) range 5220-5480 MeV/c2 for the Run 1 and Run 2 data samples separately. The yield of Λ0b → J/ψpK− is estimated from a fit to the mass distribution with one kaon interpreted as a proton. In the m(J/ψK+K−) fit, this yield is then constrained to the resulting estimate, obtained separately for Run 1 and Run 2. The m(J/ψK+K−) distributions, with the fit results superimposed, are shown in Fig. 3. Table 1 lists the obtained yields of the B0 and B0s decays, the Λ0b → J/ψpK− background and the combinatorial background in the full range as well as in the regions around the known B0 and B0s masses.
Assuming the efficiency is independent of m(K+K−), the ϕ meson lineshape from B0s → J/ψϕ (B0 → J/ψϕ) decays in the B0s (B0) region is given by Eq. (1), in which A_ϕ is a relativistic Breit-Wigner amplitude function [27]. The lineshape is written in terms of the reconstructed and true K+K− invariant masses, the mass and decay width of the ϕ meson, the J/ψ momentum in the B0s (B0) rest frame, the momentum of the kaons in the K+K− (ϕ) rest frame, the orbital angular momentum between the K+ and K−, the Blatt-Weisskopf function, and the size of the decaying particle, d, which is set to 1.5 GeV−1 ≈ 0.3 fm [28]. The amplitude squared is folded with a Gaussian resolution function G. For a P-wave decay such as ϕ → K+K−, the Blatt-Weisskopf function depends on the momentum of the decay products [27].
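As an illustration of the lineshape ingredients named above, the following is a minimal sketch of a P-wave relativistic Breit-Wigner with an L = 1 Blatt-Weisskopf factor. It uses a common textbook parameterisation and is not claimed to reproduce Eq. (1) term by term; the function names, the rounded ϕ and kaon parameters, and the omission of the resolution convolution and phase-space prefactors are all assumptions made for this example.

```python
import numpy as np

M_KAON = 493.677  # charged kaon mass in MeV/c^2 (rounded, assumed input)


def kaon_momentum(m):
    """Momentum of each kaon in the K+K- rest frame for pair mass m (MeV/c^2)."""
    return 0.5 * np.sqrt(np.maximum(m * m - 4.0 * M_KAON**2, 0.0))


def barrier_l1(p, p0, d):
    """Blatt-Weisskopf factor for L = 1, normalised to 1 at p = p0."""
    return np.sqrt((1.0 + (p0 * d) ** 2) / (1.0 + (p * d) ** 2))


def breit_wigner_sq(m, m0=1019.461, gamma0=4.249, d=1.5e-3):
    """|A(m)|^2 for a P-wave relativistic Breit-Wigner.

    m0 and gamma0 are in MeV; d = 1.5 GeV^-1 is expressed in MeV^-1.
    """
    p, p0 = kaon_momentum(m), kaon_momentum(m0)
    f = barrier_l1(p, p0, d)
    # Mass-dependent width for orbital angular momentum L = 1
    gamma = gamma0 * (p / p0) ** 3 * (m0 / m) * f**2
    return 1.0 / ((m0**2 - m**2) ** 2 + (m0 * gamma) ** 2)


masses = np.linspace(1000.0, 1050.0, 5001)
shape = breit_wigner_sq(masses)
peak_position = masses[np.argmax(shape)]
```

In a fit, this squared amplitude would still have to be multiplied by the phase-space factors and folded with the Gaussian resolution function described in the text.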
As shown in Fig. 2, due to the correlation between the reconstructed masses of K+K− and J/ψK+K−, the shape of the m(K+K−) distribution strongly depends on the chosen m(J/ψK+K−) range. The top two plots in Fig. 3 show the m(J/ψK+K−) distributions for Run 1 and Run 2 separately, where a small B0 signal can be seen on the tail of a large B0s signal. It is therefore necessary to estimate the lineshape of the K+K− mass spectrum from B0s → J/ψϕ decays in the B0 region. The m(K+K−) distribution of the B0s tail leaking into the B0 mass window can be effectively described by Eq. (1) with modified values of the ϕ mass and width parameters, which are extracted from an unbinned maximum-likelihood fit to the B0s → J/ψϕ simulation sample.
The non-ϕ K+K− contributions to B0 (B0s) decays include that from a0(980) [1] (f0(980) [29]) and nonresonant K+K− in an S-wave configuration. The PDF for this contribution (Eq. (4)) is built from the following ingredients: the K+K− invariant mass m, a known mass taken from Ref. [21], a Blatt-Weisskopf barrier factor, the resonant (a0(980) or f0(980)) and nonresonant amplitudes, and a relative phase between them. The nonresonant amplitude is modelled as a constant function. The lineshape of the a0(980) (f0(980)) resonance can be described by a Flatté function [30] considering the coupled channels ηπ0 (ππ) and KK. Both Flatté functions depend on the pole mass of the resonance and on the coupling strengths of the a0(980) (f0(980)) to the ηπ0 (ππ) and KK final states, respectively, each multiplied by a factor given by the Lorentz-invariant phase space. The parameters for the a0(980) lineshape are those determined by the Crystal Barrel experiment [31]; the parameters for the f0(980) lineshape are taken from the previous analysis of B0s → J/ψπ+π− decays [32].
For the Λ0b → J/ψpK− background, no dependency of the m(K+K−) shape on m(J/ψK+K−) is observed in simulation. Therefore, a common PDF is used to describe the m(K+K−) distributions in both the B0s and B0 regions. The PDF is modelled by a third-order Chebyshev polynomial function, obtained from the unbinned maximum-likelihood fit to the simulation shown in Fig. 4.
In order to study the m(K+K−) shape of the combinatorial background in the B0 region, a BDT requirement that strongly favours background is applied to form a background-dominated sample. Simulated B0s → J/ψK+K− and Λ0b → J/ψpK− events are then injected into this sample with negative weights to subtract these contributions. The resulting m(K+K−) distribution is shown in Fig. 5. It comprises a ϕ resonance contribution and random K+K− combinations, where the shape of the former is described by Eq. (1) and the latter by a second-order Chebyshev polynomial function. To validate the underlying assumptions of this procedure, the m(K+K−) shape has been checked to be compatible in different m(J/ψK+K−) regions and with different BDT requirements.
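The negative-weight subtraction described above can be sketched as follows. This is only an illustration of the bookkeeping, assuming a single simulated component and a hypothetical expected yield; the function name, its arguments and the toy numbers are not taken from the analysis.

```python
import numpy as np


def subtract_with_negative_weights(data, simulated, expected_yield, bins, mass_range):
    """Histogram data (weight +1) together with simulation (negative weights).

    Each simulated event carries weight -expected_yield / n_simulated, so the
    simulated component removes `expected_yield` events in total.
    """
    data = np.asarray(data, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    values = np.concatenate([data, simulated])
    weights = np.concatenate([
        np.ones(len(data)),
        np.full(len(simulated), -expected_yield / len(simulated)),
    ])
    counts, edges = np.histogram(values, bins=bins, range=mass_range, weights=weights)
    return counts, edges


# Toy m(K+K-) values in MeV/c^2; two of the data events sit in the peak bin.
toy_data = [1010.0, 1020.0, 1020.0, 1040.0]
toy_simulation = [1020.0, 1020.0, 1020.0, 1020.0]
toy_counts, toy_edges = subtract_with_negative_weights(
    toy_data, toy_simulation, expected_yield=2.0, bins=5, mass_range=(1000.0, 1050.0)
)
```

With several components to subtract, the same weighting would simply be repeated for each simulated sample with its own expected yield.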
A simultaneous unbinned maximum-likelihood fit to the four m(K+K−) distributions in both the B0s and B0 regions of the Run 1 and Run 2 data samples is performed. The ϕ resonance in B0 → J/ψϕ and B0s → J/ψϕ decays is modelled by Eq. (1). The non-ϕ contribution to B0 → J/ψK+K− and B0s → J/ψK+K− decays is described by Eq. (4). The tail of B0s → J/ψϕ decays in the B0 region is described by the shape extracted from simulation. The Λ0b → J/ψpK− background and the combinatorial background are described by the shapes shown in Figs. 4 and 5, respectively. All shapes are common to the B0 and B0s regions, except that of the B0s tail, which is only needed for the B0 region. The mass and decay width of the ϕ meson are constrained to their PDG values [21], while the width of the m(K+K−) resolution function is allowed to vary in the fit. The pole masses of the a0(980) and f0(980) and the coupling factors are fixed to their central values in the reference fit. The amplitude is allowed to vary freely, while the relative phase between the resonant (a0(980) or f0(980)) and nonresonant amplitudes is constrained …

Fig. 4. Distribution of m(K+K−) in a Λ0b → J/ψpK− simulation sample superimposed with a fit to a polynomial function.

… [29], the efficiency ratio is given in Sec. III, and fs/fd is the ratio of the production fractions of B0s and B0 mesons in pp collisions, which has been measured at 7 TeV in the LHCb detector acceptance [33]. The effect of increasing collision energy on fs/fd is found to be negligible for 8 TeV, and a scaling factor is needed for 13 TeV [34]. These parameters are fixed to their central values in the baseline fit, and their uncertainties are propagated to B(B0 → J/ψϕ) in the evaluation of systematic uncertainties.
The m(K+K−) distributions in the B0s and B0 regions are shown in Fig. 6 for both the Run 1 and Run 2 data samples. The branching fraction B(B0 → J/ψϕ) is determined from this fit. The significance of the decay B0 → J/ψϕ, over the background-only hypothesis, is estimated to be 2.3 standard deviations using Wilks' theorem [35].
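A significance estimate of this kind is commonly derived from the change in the negative log-likelihood between the background-only and signal-plus-background fits. The sketch below assumes one additional free parameter (the signal branching fraction) and the asymptotic regime; the function name and the toy likelihood values are illustrative only and are not taken from the fit.

```python
import math


def wilks_significance(nll_background_only, nll_signal_plus_background):
    """Significance in standard deviations for one extra free parameter.

    Uses q = 2 * (NLL_b - NLL_sb), which follows a chi-square distribution
    with one degree of freedom under the background-only hypothesis
    (Wilks' theorem); the significance is then sqrt(q).
    """
    q = 2.0 * (nll_background_only - nll_signal_plus_background)
    return math.sqrt(max(q, 0.0))


# Toy values: a difference of 2.645 units in NLL corresponds to 2.3 sigma.
toy_significance = wilks_significance(102.645, 100.0)
```

The inverse relation is also useful for orientation: a significance of 2.3 standard deviations corresponds to q = 2.3² ≈ 5.3.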
To validate the sequential fit procedure, a large number of pseudosamples were generated according to the fit models for the m(J/ψK+K−) and m(K+K−) distributions. The model parameters were taken from the result of the baseline fit to the data. The fit procedure described above was applied to each pseudosample. The distributions of the obtained estimate of B(B0 → J/ψϕ) and of the corresponding pulls are found to be consistent with the reference result, which indicates that the procedure has negligible bias and that its uncertainty estimate is reliable. A similar check has been performed using pseudosamples generated with an alternative model for the B0s → J/ψK+K− decays, which is based on the amplitude model developed for the B0s → J/ψK+K− analysis [20] and includes contributions from P-wave decays, S-wave decays and their interference. In this case, the robustness of the fit method has also been confirmed.

V. SYSTEMATIC UNCERTAINTIES
Two categories of systematic uncertainties are considered: multiplicative uncertainties, which are associated with the normalisation factors; and additive uncertainties, which affect the determination of the yields of the B0 → J/ψϕ and B0s → J/ψϕ modes.
The multiplicative uncertainties include those propagated from the estimates of B(B0s → J/ψϕ), fs/fd and the efficiency ratio. B(B0s → J/ψϕ) was measured using the fs/fd measurement at 7 TeV [29,33]. The third uncertainty on that measurement is completely anti-correlated with the uncertainty on fs/fd, since the estimate of B(B0s → J/ψϕ) is inversely proportional to the value used for fs/fd. This correlation is taken into account when evaluating the combined uncertainty for 7 TeV. The luminosity-weighted average of the scaling factor for fs/fd for 13 TeV has a relative uncertainty of 3.4%. For the efficiency ratio, the luminosity-weighted average has a relative uncertainty of 1.8%. Summing these three contributions in quadrature gives a total relative uncertainty of 7.3% on B(B0 → J/ψϕ).
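The quadrature sum quoted above can be checked with a few lines of arithmetic. In this sketch the 3.4%, 1.8% and 7.3% figures are taken from the text, while the size of the remaining contribution is not quoted here and is only inferred from those rounded numbers; the helper names are hypothetical.

```python
import math


def quadrature_sum(*relative_uncertainties):
    """Combine independent relative uncertainties in quadrature."""
    return math.sqrt(sum(u * u for u in relative_uncertainties))


def missing_term(total, *known):
    """Term that, added in quadrature to `known`, reproduces `total`."""
    return math.sqrt(total**2 - sum(u * u for u in known))


# Inferred size (in percent) of the contribution not quoted in the text,
# based on rounded inputs and therefore only approximate.
implied_first_term = missing_term(7.3, 3.4, 1.8)

# Closing the loop: the three terms reproduce the quoted total.
total_check = quadrature_sum(implied_first_term, 3.4, 1.8)
```

Under these assumptions the unquoted contribution comes out at roughly 6.2%, which makes it the dominant multiplicative term.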
The additive uncertainties are due to imperfect modelling of the m(J/ψK+K−) and m(K+K−) shapes of the signal and background components. To evaluate the systematic effect associated with the m(J/ψK+K−) model of the combinatorial background, the fit procedure is repeated with the exponential function for the combinatorial background replaced by a second-order polynomial function. A large number of simulated pseudosamples were generated according to the obtained alternative model. Each pseudosample was fitted twice, using the baseline and the alternative combinatorial shape, respectively. The average difference in B(B0 → J/ψϕ) between the two fits is taken as a systematic uncertainty.
In the m(K+K−) fit, the yields of the Λ0b → J/ψpK− decay, of the combinatorial backgrounds under the B0 and B0s peaks, and of the B0s tail leaking into the B0 region are fixed to the values in Table 1. These yields are varied separately, and the resulting changes of B(B0 → J/ψϕ) for the Λ0b → J/ψpK− background, for the combinatorial background and for the B0s tail in the B0 region are assigned as systematic uncertainties on B(B0 → J/ψϕ). … The maximum change of B(B0 → J/ψϕ) is taken as a systematic uncertainty.
The m(K+K−) shape of the B0s tail under the B0 peak is extracted using a B0s → J/ψϕ simulation sample. The statistical uncertainty due to the limited size of this sample is estimated using the bootstrapping technique [36]. A large number of new data sets of the same size as the original simulation sample were formed by randomly cloning events from the original sample, allowing one event to be cloned more than once. The spread in the results for B(B0 → J/ψϕ) obtained by using these pseudosamples in the analysis procedure is then adopted as a systematic uncertainty.
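The resampling step described above can be sketched as follows. The estimator passed in stands for the full analysis chain, which is not reproduced here; the function name, the number of resamples, the seed and the toy Gaussian sample are all assumptions made for illustration.

```python
import numpy as np


def bootstrap_spread(sample, estimator, n_resamples=500, seed=1):
    """Spread of `estimator` over bootstrap resamples of `sample`.

    Each resample has the same size as the original and is drawn with
    replacement, so one event can be cloned more than once.
    """
    rng = np.random.default_rng(seed)
    sample = np.asarray(sample)
    results = np.empty(n_resamples)
    for i in range(n_resamples):
        indices = rng.integers(0, len(sample), size=len(sample))
        results[i] = estimator(sample[indices])
    return results.std(ddof=1)


# Toy usage: the spread of the mean of 2000 unit-Gaussian values should be
# close to 1 / sqrt(2000), i.e. about 0.022.
toy_sample = np.random.default_rng(7).normal(0.0, 1.0, size=2000)
toy_spread = bootstrap_spread(toy_sample, np.mean)
```

In the analysis the estimator would be the shape extraction followed by the m(K+K−) fit, so each resample is considerably more expensive than in this toy.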
In the reference model, the m(K+K−) shape of the Λ0b → J/ψpK− background is determined from simulation, under the assumption that this shape is insensitive to the m(J/ψK+K−) region. A sideband sample enriched with Λ0b → J/ψpK− contributions is selected by requiring one kaon to have a large probability to be a proton. An alternative m(K+K−) shape is extracted from this sample after subtracting the random combinations, and used in the m(K+K−) fit. The resulting change of B(B0 → J/ψϕ) is assigned as a systematic uncertainty.

The m(K+K−) shape of the combinatorial background is represented by that of the J/ψK+K− combinations with a BDT selection that strongly favours the background over the signal, under the assumption that this shape is insensitive to the BDT requirement. Repeating the m(K+K−) fit using the combinatorial background shape obtained with two non-overlapping sub-intervals of BDT response, the result for B(B0 → J/ψϕ) is found to be stable, and the maximum variation is regarded as a systematic uncertainty.

The pole masses and coupling factors of the a0(980) and f0(980) lineshapes are fixed to their mean values from Refs. [31,32]. The fit is repeated by varying each factor by its experimental uncertainty, and the maximum variation of the branching fraction is considered for each parameter. The sum of the variations in quadrature is assigned as a systematic uncertainty.
The systematic uncertainties are summarised in Table 2. The total systematic uncertainty is the sum in quadrature of all these contributions.

A profile likelihood method is used to compute the upper limit on B(B0 → J/ψϕ) [37,38]. Writing $\mathcal{B}$ for B(B0 → J/ψϕ), the profile likelihood ratio as a function of $\mathcal{B}$ is defined as

$$\lambda(\mathcal{B}) = \frac{L(\mathcal{B}, \hat{\hat{\nu}})}{L(\hat{\mathcal{B}}, \hat{\nu})},$$

where $\nu$ represents the set of fit parameters other than $\mathcal{B}$, $\hat{\mathcal{B}}$ and $\hat{\nu}$ are the maximum likelihood estimators, and $\hat{\hat{\nu}}$ is the profiled value of the parameter $\nu$ that maximises L for the specified $\mathcal{B}$. Systematic uncertainties are incorporated by smearing the profile likelihood ratio function with a Gaussian function G which has a zero mean and a width equal to the total systematic uncertainty, $\sigma_{\rm sys}$:

$$\lambda_{\rm s}(\mathcal{B}) = \lambda(\mathcal{B}) \otimes G(\mathcal{B}; 0, \sigma_{\rm sys}).$$

The smeared profile likelihood ratio curve is shown in Fig. 7. The 90% confidence interval starting at $\mathcal{B} = 0$ is shown as the red area, which covers 90% of the integral of the function in the physical region. The upper limit on B(B0 → J/ψϕ) at 90% CL is the upper edge of this interval.

VI. CONCLUSION
A search for the rare decay B0 → J/ψϕ has been performed using the full Run 1 and Run 2 data samples of pp collisions collected with the LHCb experiment, corresponding to an integrated luminosity of 9 fb−1. The measured branching fraction indicates no statistically significant excess of the decay B0 → J/ψϕ above the background-only hypothesis. An upper limit on its branching fraction is set at 90% CL. This limit is compatible with theoretical expectations and improves on the previous limit obtained by the LHCb experiment using Run 1 data, with a corresponding integrated luminosity of 1 fb−1.
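The limit-setting step used in this analysis (Gaussian smearing of the profile likelihood ratio, followed by integration over the physical region) can be sketched numerically as below. This is a minimal illustration under stated assumptions: the likelihood ratio is supplied on a uniform grid that extends to negative values, the systematic width is constant and positive, and the function name and toy inputs are hypothetical rather than taken from the analysis.

```python
import numpy as np


def smeared_upper_limit(grid, likelihood_ratio, sigma_sys, cl=0.90):
    """Upper limit from a Gaussian-smeared likelihood ratio curve.

    grid: uniformly spaced parameter values (may include negative values).
    likelihood_ratio: curve evaluated on `grid`.
    sigma_sys: width of the Gaussian smearing, must be positive.
    """
    step = grid[1] - grid[0]
    # Gaussian kernel on the same spacing, truncated at +-5 sigma
    half = int(np.ceil(5.0 * sigma_sys / step))
    offsets = np.arange(-half, half + 1) * step
    kernel = np.exp(-0.5 * (offsets / sigma_sys) ** 2)
    kernel /= kernel.sum()
    smeared = np.convolve(likelihood_ratio, kernel, mode="same")
    # Keep only the physical region (parameter >= 0) and integrate from 0
    physical = grid >= 0.0
    x, y = grid[physical], smeared[physical]
    cdf = np.cumsum(y)
    cdf /= cdf[-1]
    return x[np.searchsorted(cdf, cl)]


# Toy check: a unit-width Gaussian curve centred at zero, smeared with
# sigma_sys = 0.75, has total width 1.25; the 90% point of the resulting
# half-Gaussian is 1.645 * 1.25, about 2.06.
toy_grid = np.linspace(-10.0, 10.0, 4001)
toy_curve = np.exp(-0.5 * toy_grid**2)
toy_limit = smeared_upper_limit(toy_grid, toy_curve, sigma_sys=0.75)
```

The simple cumulative sum is accurate to about one grid step, so the grid spacing should be chosen well below the precision wanted for the limit.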