Search for the rare decay $B^0 \rightarrow J/\psi \phi$

A search for the rare decay $B^0 \rightarrow J/\psi \phi$ is performed using $pp$ collision data collected with the LHCb detector at centre-of-mass energies of 7, 8 and 13 {Te\kern -0.1em V}, corresponding to an integrated luminosity of 9 ${\rm fb}^{-1}$. No significant signal of the decay is observed and an upper limit of $1.1 \times 10^{-7}$ at 90\% confidence level is set on the branching fraction.


Introduction
The B 0 → J/ψK + K − decay was first observed by the LHCb experiment with a branching fraction of (2.51 ± 0.35) × 10 −7 [1]. It proceeds primarily through the Cabibbo-suppressed b→ccd transition. The K + K − pair can come either directly from the B 0 decay via an ss pair created in the vacuum, or from the decay of intermediate states that contain both dd and ss components, such as the a 0 (980) resonance. 1 There is a potential contribution from the φ meson as an intermediate state. The decay B 0 → J/ψφ is suppressed by the Okubo-Zweig-Iizuka (OZI) rule that forbids disconnected quark diagrams [2][3][4]. The size of this contribution and the exact mechanism to produce the φ meson in this process are of particular theoretical interest [5][6][7]. Under the assumption that the dominant contribution is via a small dd component in the φ wave-function, arising from ω − φ mixing ( Fig. 1(a)), the branching fraction of the B 0 → J/ψφ decay is predicted to be of the order of 10 −7 [5]. Contributions to B 0 → J/ψφ decays from the OZI-suppressed tri-gluon fusion ( Fig. 1(b)), photoproduction and final-state rescattering are estimated to be at least one order of magnitude lower [7]. Experimental studies of the decay B 0 → J/ψφ could provide important information about the dynamics of OZI-suppressed decays.
No significant signal of B 0 → J/ψφ decay has been observed in previous searches by several experiments. Upper limits on the branching fraction of the decay have been set by BaBar [8], Belle [9] and LHCb [1]. The LHCb limit was obtained using a data sample corresponding to an integrated luminosity of 1 fb −1 of pp collision data, collected at centre-of-mass energy of 7 TeV. This paper presents an update on the search for B 0 → J/ψφ decays using a data sample corresponding to an integrated luminosity of 9 fb −1 , including 3 fb −1 collected at 7 and 8 TeV, denoted as Run 1, and 6 fb −1 collected at 13 TeV, denoted as Run 2.
The LHCb measurement in Ref. [1] is obtained from an amplitude analysis of B 0 → J/ψK + K − decays in a wide m(K + K − ) range from the K + K − mass threshold to 2200 MeV/c 2 . This paper focuses on the φ(1020) region, with the K + K + mass in the range 1000-1050 MeV/c 2 , and on studies of the J/ψK + K − and K + K − mass distributions to distinguish the B 0 → J/ψφ signal from the non-resonant decay B 0 → J/ψK + K − and background contaminations. The abundant decay B 0 s → J/ψφ is used as the normalisation channel. The choice of mass fits over a full amplitude analysis is motivated by several considerations. The sharp φ mass peak provides a clear signal characteristic and the lineshape can be very well determined using the copious B 0 s → J/ψφ decays. On the other hand, interference of the S-wave (either a 0 (980)/f 0 (980) or non-resonant) and P-wave amplitudes vanishes in the m(K + K − ) spectrum, up to negligible angular acceptance effects, after integrating over the angular variables. Furthermore, significant correlations observed between m(J/ψK + K − ), m(K + K − ) and angular variables make it challenging to describe the mass-dependent angular distributions of both signal and background, which are required for an amplitude analysis. Finally, the power of the amplitude analysis in discriminating the signal from the non-φ contribution and background is reduced by the large number of parameters that need to be determined in the fit. In addition, a good understanding of the contamination from B 0 s → J/ψK + K − decays in the B 0 mass-region is essential in the search for B 0 → J/ψφ.

Candidate selection
The online event selection is performed by a trigger, which consists of a hardware stage, based on information from the calorimeter and muon systems, followed by a software stage, which applies a full event reconstruction. An inclusive approach for the hardware trigger is used to maximise the available data sample, as described in Ref. [18]. Since the centre-of-mass energies and trigger thresholds are different for the Run 1 and Run 2 data-taking, the offline selection is performed separately for the two periods following the procedure described below. The resulting data samples for the two periods are treated separately in the subsequent analysis procedure.
The offline selection comprises two stages. First, a loose selection is used to reconstruct both B 0 → J/ψφ and B 0 s → J/ψφ candidates in the same way, given their similar kinematics. Two oppositely charged muon candidates with p T > 500 MeV/c are combined to form a J/ψ candidate. The muon pair is required to have a common vertex and an invariant mass, m(µ + µ − ), in the range 3020-3170 MeV/c 2 . A pair of oppositely charged kaon candidates identified by the Cherenkov detectors is combined to form a φ candidate. The K + K − pair is required to have an invariant mass, m(K + K − ), in the range 1000-1050 MeV/c 2 . The J/ψ and φ candidates are combined to form a B 0 (s) candidate, which is required to have good vertex quality and invariant mass, m(J/ψK + K − ), in the range 5200-5550 MeV/c 2 . The resulting B 0 (s) candidate is assigned to the PV with which it has the smallest χ 2 IP , where χ 2 IP is defined as the difference in the vertex-fit χ 2 of a given PV reconstructed with and without the particle being considered. The invariant mass of the B 0 (s) candidate is calculated from a kinematic fit for which the momentum vector of the B 0 (s) candidates is aligned with the vector connecting the PV to the B 0 (s) decay vertex and m(µ + µ − ) is constrained to the known J/ψ meson mass [19]. In order to suppress the background due to the random combination of a prompt J/ψ meson and a pair of charged kaons, the decay time of the B 0 (s) candidate is required to be greater than 0.3 ps. In a second selection stage, a boosted decision tree (BDT) classifier [20] is used to further suppress combinatorial background. The BDT classifier is trained using simulated B 0 s → J/ψφ decays representing the signal, and candidates with m(J/ψK + K − ) in the range 5480-5550 MeV/c 2 as background. Candidates in both samples are required to have passed the trigger and the loose selection described above. Using a multivariate technique [21], the B 0 s → J/ψφ simulation sample is corrected to match the observed distributions in background-subtracted data, including that of the p T and pseudorapidity of the B 0 s , the χ 2 IP of the B 0 s decay vertex, the χ 2 of the decay chain of the B 0 s candidate [22], the particle identification variables, the track-fit χ 2 of the muon and kaon candidates and the numbers of tracks measured simultaneously in both the vertex detector and tracking stations.
The input variables of the BDT classifier are the minimum track-fit χ 2 of the muons and the kaons, the p T of the B 0 (s) candidate and the K + K − combination, the χ 2 of the B 0 (s) decay vertex, particle identification probabilities for muons and kaons, the minimum χ 2 IP of the muons and kaons, the χ 2 of the J/ψ decay vertex, the χ 2 IP of the B 0 (s) candidate and the χ 2 of the B 0 (s) decay chain fit. The optimal requirement on the BDT response for the B 0 (s) candidates is obtained by maximising the quantity ε/ √ N , where ε is the signal efficiency determined in simulation and N is the number of candidates found in the ±15 MeV/c 2 region around the known B 0 mass [19].
In addition to combinatorial background, the data also contain fake candidates from where the proton (pion) is misidentified as a kaon. To suppress these background sources, a B 0 (s) candidate is rejected if its invariant mass, computed with one kaon interpreted as a proton (pion), lies within ±15 MeV/c 2 of the known Λ 0 b (B 0 ) mass [19] and if the kaon candidate also satisfies proton (pion) identification requirements.
A previous study of B 0 s → J/ψφ decays found that the yield of the background from B 0 → J/ψK + π − decays is only 0.1% of the B 0 s → J/ψφ signal yield [18]. Furthermore, only 1.2% of these decays, corresponding to about one candidate (three candidates) in the Run 1 (Run 2) data sample, fall in the B 0 mass region 5265-5295 MeV/c 2 , according to simulation. Thus this background is neglected. The fraction of events containing more than one candidate is 0.11% in Run 1 data and 0.70% in Run 2 data and these events are removed from the total data sample. The acceptance, trigger, reconstruction and selection efficiencies of the signal and normalization channels are determined using simulation, which is corrected for the efficiency differences with respect to the data. The ratio of the total efficiencies of B 0 → J/ψφ and B 0 s → J/ψφ is estimated to be 0.99 ± 0.03 ± 0.03 for Run 1 and 0.99 ± 0.01 ± 0.02 for Run 2, where the first uncertainties are statistical and the second ones are associated with corrections to the simulation. The polarisation amplitudes are assumed to be the same in B 0 → J/ψφ and B 0 s → J/ψφ decays. The systematic uncertainty associated with this assumption is found to be small and is neglected.

Mass fits
There is a significant correlation between m(J/ψK + K − ) and m(K Fig. 2, hence the search for B 0 → J/ψφ decays is carried out by performing sequential fits to the distributions of m(J/ψK + K − ) and m(K + K − ). A fit to the m(J/ψK + K − ) distribution is used to estimate the yields of the background components in the ±15 MeV/c 2 regions around B 0 s and B 0 nominal masses. A subsequent simultaneous fit to the m(K + K − ) distributions of candidates falling in the two J/ψK + K − mass windows, with the background yields fixed to their values from the first step, is performed to estimate the yield of B 0 → J/ψφ decays. The probability density function (PDF) for the m(J/ψK + K − ) distribution of both the B 0 → J/ψK + K − and B 0 s → J/ψK + K − decays is modelled by the sum of a Hypatia [23] and a Gaussian function sharing the same mean. The fraction, the width ratio between the Hypatia and Gaussian functions and the Hypatia tail parameters are determined from simulation. The m(J/ψK + K − ) shape of the Λ 0 b → J/ψpK − background is described by a template obtained from simulation, while the combinatorial background is described by    an exponential function with the slope left to vary. The PDFs of B 0 → J/ψK + K − and B 0 s → J/ψK + K − decays share the same shape parameters, and the difference between the B 0 s and B 0 masses is constrained to the known mass difference of 87.23 ± 0.16 MeV/c 2 [19]. An unbinned maximum-likelihood fit is performed in the m(J/ψK + K − ) range 5220-5480 MeV/c 2 for Run 1 and Run 2 data samples separately. The yield of Λ 0 b → J/ψpK − is estimated from a fit to the J/ψpK − mass distribution with one kaon interpreted as a proton. This yield is then constrained to the resulting estimate of 399 ± 26 (1914 ± 47) in the J/ψK + K − mass fit for the Run 1 (Run 2). The m(J/ψK + K − ) distributions, superimposed by the fit results, are shown in Fig. 3. Table 1 lists the obtained yields of the B 0 → J/ψK + K − and B 0 s → J/ψK + K − decays, the Λ 0 b background and the combinatorial background in the full range as well as in the ±15 MeV/c 2 regions around the known B 0 s and B 0 masses.
Assuming the efficiency is independent of m(K + K − ), the φ meson lineshape from  where A φ is a relativistic Breit-Wigner amplitude function [24] defined as The parameter m (m ) denotes the reconstructed (true) K + K − invariant mass, m 0 and Γ 0 are the mass and decay width of the φ(1020) meson, P B is the J/ψ momentum in the B 0 s (B 0 ) rest frame, P R (P 0 ) is the momentum of the kaons in the K + K − (φ(1020)) rest frame, L R is the orbital angular momentum between the K + and K − , F R is the Blatt-Weisskopf function and d is the size of decaying particle, which is set to be 1.5 (GeV/c) −1 ∼ 0.3 fm [25]. The amplitude squared is folded with a Gaussian resolution function G. For L R = 1, F R has the form and depends on the momentum of the decay products P R [24]. As is shown in Fig. 2, due to the correlation between the reconstructed masses of K + K − and J/ψK + K − , the shape of the m(K + K − ) distribution strongly depends on the chosen m(J/ψK + K − ) range. The top two plots in Fig. 3 show the m(J/ψK + K − ) distributions for Run 1 and Run 2 separately, where a small B 0 signal can be seen on the tail of a large B 0 s signal. Therefore, it is necessary to estimate the lineshape of the K + K − mass spectrum from B 0 s → J/ψφ decays in the B 0 region. The m(K + K − ) distribution of the B 0 s → J/ψφ tail leaking into the B 0 mass window can be effectively described by Eq. 1 with modified values of m 0 and Γ 0 , which are extracted from an unbinned maximum-likelihood fit to the B 0 s → J/ψφ simulation sample. The non-φ K + K − contributions to B 0 → J/ψK + K − (B 0 s → J/ψK + K − ) decays include that from a 0 (980) [1] (f 0 (980) [26]) and nonresonant K + K − in an S-wave configuration. The PDF for this contribution is given by where m is the K + K − invariant mass, m B is the known B 0 (s) mass [19], F B is Blatt-Weisskopf barrier factor of the B 0 (s) meson, A R and A N R represent the resonant (a 0 (980) or f 0 (980)) and nonresonant amplitudes and δ is a relative phase between them. The nonresonant amplitude A N R is modelled as a constant function. The lineshape of the a 0 (980) (f 0 (980)) resonance can be described by a Flatté function [27] considering the coupled channels ηπ 0 (ππ) and KK. The Flatté functions are given by for the a 0 (980) resonance and for the f 0 (980) resonance. The parameter m R denotes the pole mass of the resonance for both cases. The constants g ηπ (g ππ ) and g KK are the coupling strengths of a 0 (980) (f 0 (980)) to the ηπ 0 (ππ) and KK final states, respectively. The ρ factors are given by the Lorentz-invariant phase space The parameters for the a 0 (980) lineshape are m R = 0.999 ± 0.002 GeV/c 2 , g ηπ = 0.324 ± 0.015 GeV/c 2 , and g 2 KK /g 2 ηπ = 1.03 ± 0.14, determined by the Crystal Barrel experiment [28]; the parameters for the f 0 (980) lineshape are m R = 0.9399±0.0063 GeV/c 2 , g ππ = 0.199 ± 0.030 GeV/c 2 , and g KK /g ππ = 3.0 ± 0.3, according to the previous analysis of B 0 s → J/ψπ + π − decays [29]. For the Λ 0 b → J/ψpK − background, no dependency of the m(K + K − ) shape on m(J/ψK + K − ) is observed in simulation. Therefore, a common PDF is used to describe the m(K + K − ) distributions in both the B 0 s and B 0 regions. The PDF is modelled by a third-order Chebyshev polynomial function, obtained from the unbinned maximumlikelihood fit to the simulation shown in Fig. 4.
In order to study the m(K + K − ) shape of the combinatorial background in the B 0 region, a BDT requirement that strongly favours background is applied to form a background-dominated sample. Simulated Λ 0 b → J/ψpK − and B 0 s → J/ψφ events are then injected into this sample with negative weights to subtract these contributions. The resulting m(K + K − ) distribution is shown in Fig. 5, which comprises a φ resonance contribution and random K + K − combinations, where the shape of the former is described by Eq. 1 and the latter by a second-order Chebyshev polynomial function. To validate the underlying assumptions of this procedure, the m(K + K − ) shape has been checked to be compatible in different J/ψK + K − mass regions and with different BDT requirements.  A simultaneous unbinned maximum-likelihood fit to the four m(K + K − ) distributions in both B 0 s and B 0 regions of Run 1 and Run 2 data samples is performed. The φ resonance in B 0 (s) → J/ψφ decays is modelled by Eq. 1. The non-φ K + K − contribution to B 0 (s) → J/ψK + K − decays is described by Eq. 4. The tail of B 0 s → J/ψφ decays in the B 0 region is described by the extracted shape from simulation. The Λ 0 b background and the combinatorial background are described by the shapes shown in Figs. 4 and 5, respectively. All m(K + K − ) shapes are common to the B 0 and B 0 s regions, except that of the B 0 s tail, which is only needed for the B 0 region. The mass and decay width of φ(1020) meson are constrained to their PDG values [19] while the width of the m(K + K − ) resolution function is allowed to vary in the fit. The pole mass of f 0 (980) (a 0 (980)) and the coupling factors, including g ππ , g KK /g ππ , g 2 ηπ and g 2 KK /g 2 ηπ , are fixed to their central values in the reference fit. The amplitude A N R is allowed to vary freely, while the relative phase δ between the f 0 (980) (a 0 (980)) and nonresonance amplitudes is constrained to −255 ± 35 (−60 ± 26) degrees, which was determined in the amplitude analysis of B 0 s → J/ψK + K − (B 0 → J/ψK + K − ) decays [1,26]. The yields of the Λ 0 b background, the B 0 s → J/ψφ tail leaking into the B 0 region and combinatorial background are fixed to the corresponding values in Table 1, while the yields of non-φ K + K − for B 0 s and B 0 decays as well as the yield of B 0 s → J/ψφ decays take different values for Run 1 and Run 2 data samples and are left to vary in the fit. The branching fraction B(B 0 → J/ψφ), the parameter of interest to be determined by the fit, is common for the Run 1 and Run 2. The yield of B 0 → J/ψφ decays is internally expressed according to where the branching fraction B(B 0 s → J/ψφ) has been measured by the LHCb collaboration [26], ε B 0 /ε B 0 s is the efficiency ratio given in Sec. 3, f s /f d is the ratio of the production fractions of B 0 s and B 0 mesons in pp collisions, which has been measured at 7 TeV to be 0.256±0.020 in the LHCb detector acceptance [30]. The effect of increasing collision energy on f s /f d is found to be negligible for 8 TeV and a scaling factor of 1.068 ± 0.046 is needed for 13 TeV [31].    The m(K + K − ) distributions in the B 0 s and B 0 regions are shown in Fig. 6 for both Run 1 and Run 2 data samples. The branching fraction B(B 0 → J/ψφ) is found to be (6.8±3.0(stat.))×10 −8 . The significance of the decay B 0 → J/ψφ, over the background-only hypothesis, is estimated to be 2.3 standard deviations using Wilks' theorem [32].
To validate the sequential fit procedure, a large number of pseudosamples are generated according to the fit models for the m(J/ψK + K − ) and m(K + K − ) distributions. The model parameters are taken from the result of the baseline fit to the data. The fit procedure described above is applied to each pseudosample. The distributions of the obtained estimate of B(B 0 → J/ψφ) and the corresponding pulls are found to be consistent with the reference result, which indicates that the procedure has negligible bias and its uncertainty estimate is reliable. A similar check has been performed using pseudosamples generated with an alternative model for the B 0 → J/ψK + K − decays, which is based on the amplitude model developed for the B 0 s → J/ψK + K − analysis [18] and includes contributions from P-wave B 0 → J/ψφ decays, S-wave B 0 → J/ψK + K − decays and their interference. In this case, the robustness of the fit method has also been confirmed. TeV has a relative uncertainty of 3.4%. For the efficiency ratio ε B 0 s /ε B 0 , its luminosity-weighted average has a relative uncertainty of 1.8%. Summing these three contributions in quadrature gives a total relative uncertainty of 7.3% on B(B 0 → J/ψφ).

Systematic uncertainties
The additive uncertainties are due to imperfect modeling of the m(J/ψK + K − ) and m(K + K − ) shapes of the signal and background components. To evaluate the systematic effect associated with the m(J/ψK + K − ) model of the combinatorial background, the fit procedure is repeated by replacing the exponential function for the combinatorial background with a second-order polynomial function. A large number of simulated pseudosamples are generated according to the obtained alternative model. Each pseudosample is fitted twice, using the baseline and alternative combinatorial shape, respectively. The average difference of B(B 0 → J/ψφ) is 0.03 × 10 −8 , which is taken as a systematic uncertainty.
In the m(K + K − ) fit, the yields of Λ 0 b → J/ψpK − decay, combinatorial backgrounds under the B 0 and B 0 s peaks and that of the B 0 s tail leaking into the B 0 region are fixed to the values in Table 1 The m(K + K − ) shape of the B 0 s tail under the B 0 peak is extracted using a B 0 s → J/ψφ simulation sample. The statistical uncertainty due to the limited size of this sample is estimated using the bootstrapping technique [33]. A large number of new data sets of the same size as the original simulation sample are formed by randomly cloning events from the original sample, allowing one event to be cloned more than once. The spread on the results of B(B 0 → J/ψφ) obtained by using these pseudosamples in the analysis procedure is then adopted as a systematic uncertainty, which is evaluated to be 0.29 × 10 −8 . In the reference model, the m(K + K − ) shape of the Λ 0 b → J/ψpK − background is determined from simulation, under the assumption that this shape is insensitive to the m(J/ψK + K − ) region. A sideband sample enriched with Λ 0 b → J/ψpK − contributions is selected by requiring one kaon to have a large probability to be a proton. An alternative m(K + K − ) shape is extracted from this sample after subtracting the random combinations, and used in the m(K + K − ) fit. The resulting change of B(B 0 → J/ψφ) is 0.28 × 10 −8 , which is assigned as a systematic uncertainty.
The m(K + K − ) shape of the combinatorial background is represented by that of the J/ψK + K − combinations with a BDT selection that strongly favours the background over the signal, under the assumption that this shape is insensitive to the BDT requirement. Repeating the m(K + K − ) fit by using the combinatorial background shape obtained with two non-overlapping sub-intervals of BDT response, the result of B(B 0 → J/ψφ) is found to be stable, with a maximum variation of 0.16 × 10 −8 , which is regarded as a systematic uncertainty.
In Equations 7-9, the coupling factors g ηπ , g 2 KK /g 2 ηπ , g ππ and g KK /g ππ , are fixed to their mean values from Ref. [28,29]. The fit is repeated by varying each factor by its experimental uncertainty and the maximum variation of the branching fraction is considered for each parameter. The sum of the variations in quadrature is 0.06 × 10 −8 , which is assigned as a systematic uncertainty.
The systematic uncertainties are summarised in Table 2. The total systematic uncertainty is the sum in quadrature of all these contributions.
A profile likelihood method is used to compute the upper limit of B(B 0 → J/ψφ) [34,35]. The profile likelihood ratio as a function of B ≡ B(B 0 → J/ψφ) is defined as where ν represents the set of fit parameters other than B, B and ν are the maximum likelihood estimators, and ν is the profiled value of the parameter ν that maximises L for the specified B. Systematic uncertainties are incorporated by smearing the profile likelihood ratio function with a Gaussian function which has a zero mean and a width equal to the total systematic uncertainty: The smeared profile likelihood ratio curve is shown in Fig. 7. The 90% confidence interval starting at B = 0 is shown as the red area, which covers 90% of the integral of the λ(B) function in the physical region. The obtained upper limit on B(B 0 → J/ψφ) at 90% CL is 1.1 × 10 −7 .

Conclusion
A search for the rare decay B 0 → J/ψφ is performed using the full Run 1 and Run 2 data samples of pp collisions collected with the LHCb experiment, corresponding to an integrated luminosity of 9 fb −1 . A branching fraction of B(B 0 → J/ψφ) = (6.8 ± 3.0 ± 0.9) × 10 −8 is measured, which indicates no statistically significant excess of the decay B 0 → J/ψφ above the background-only hypothesis. The upper limit on its branching fraction at 90% CL is determined to be 1.1 × 10 −7 , which is compatible with theoretical expectations and improved compared with the previous limit of 1.9 × 10 −7 obtained by the LHCb experiment using Run 1 data, with a corresponding integrated luminosity of 1 fb −1 .