Pathloss-based non-Line-of-Sight Identification in an Indoor Environment: An Experimental Study

This paper reports the findings of an experimental study on the problem of line-of-sight (LOS)/non-line-of-sight (NLOS) classification in an indoor environment. Specifically, we deploy a pair of NI 2901 USRP software-defined radios (SDR) in a large hall. The transmit SDR emits an unmodulated tone of frequency 10 kHz, on a center frequency of 2.4 GHz, using three different signal-to-noise ratios (SNR). The receive SDR constructs a dataset of pathloss measurements from the received signal as it moves across 15 equi-spaced positions on a 1D grid (for both LOS and NLOS scenarios). We use our custom dataset to estimate the pathloss parameters (i.e., the pathloss exponent) with the least-squares method, and then use the parameterized pathloss model to construct a binary hypothesis test for NLOS identification. Further, noting that the pathloss measurements slightly deviate from a Gaussian distribution, we feed our custom dataset to four machine learning (ML) algorithms, i.e., linear support vector machine (SVM) and radial basis function SVM (RBF-SVM), linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), and logistic regression (LR). It turns out that the performance of the ML algorithms is only slightly superior to that of the Neyman-Pearson-based binary hypothesis test (BHT). That is, the RBF-SVM classifier (the best-performing ML classifier) and the BHT achieve a maximum accuracy of 88.24% and 87.46% for low SNR, 83.91% and 81.21% for medium SNR, and 87.38% and 86.65% for high SNR, respectively.


I. INTRODUCTION
The upcoming 6G cellular standard aims to provide an immersive and personalized user experience by enabling a wide range of novel location-based applications, including augmented reality (AR), virtual reality (VR), and mixed reality (MR). Precise indoor localization is a prerequisite for such applications: it allows seamless integration of virtual and physical environments, enables precise positioning of virtual objects, and supports the delivery of context-aware services to users [1].
Indoor localization is a challenging task because global positioning system (GPS) signals are unavailable indoors, and because of obstacles/blockages, multipath, and random signal variations caused by rich scattering in indoor environments. To date, numerous indoor propagation models and localization methods have been reported in the literature to examine and undo the impact of such non-idealities (e.g., multipath, blockages) [2]. Popular methods for indoor localization include fingerprinting (scene analysis), time of arrival (ToA), angle of arrival (AoA), phase of arrival (PoA), time of flight (ToF), time difference of arrival (TDoA), received signal strength (RSS), and Ricean K-factor based approaches [2].
This work focuses on the challenge that blockages pose to indoor localization systems. Specifically, a blockage turns a link into a non-line-of-sight (NLOS) link, which in turn biases the distance/AoA estimates obtained by indoor localization algorithms. NLOS conditions, when they exist, thus degrade the accuracy of indoor positioning systems through ranging errors, and accurate NLOS prediction/classification is therefore needed. NLOS prediction helps indoor positioning systems identify and mitigate the effects of NLOS conditions, and could thus improve the accuracy of indoor position estimates [3]. Beyond indoor localization, NLOS identification could also help solve other important problems. For example, it could help discover blocked THz links indoors, which might prompt a THz access point to serve the associated users via a reconfigurable intelligent surface (RIS) panel, thereby improving the coverage of the indoor THz link [4]. NLOS identification thus provides valuable insights for the design of blockage-aware user association and handover management algorithms.
The problem of NLOS identification has recently attracted the attention of the research community, and a number of works have been reported in the literature. A discussion of selected related works is therefore in order. [5] uses a WiFi system to collect channel frequency response (CFR) and channel impulse response (CIR) samples, extracts a number of statistical features (e.g., mean, variance, skewness, kurtosis) from the fine-grained channel state information (CSI), and feeds them to a support vector machine that performs NLOS identification. The authors of [6] consider an ultra-wideband system and use a semi-supervised learning approach, i.e., they use the expectation-maximization algorithm to learn the parameters of their Gaussian mixture model for NLOS identification. The work in [7] extracts a number of features (e.g., AoA, ToA, RSS) from the received signal and applies various methods from statistical decision theory (e.g., the Neyman-Pearson method) to identify NLOS conditions. The authors in [8] collect RSS samples using an indoor WiFi system and extract multiple statistical features from the RSS time series, which they feed to a least-squares support vector machine and to a hypothesis test for NLOS identification. They further perform NLOS mitigation by designing various distance estimation algorithms under both line-of-sight (LOS) and NLOS conditions.
Recently, a few researchers have proposed machine learning (ML) and deep learning methods for NLOS identification. For example, the authors in [9] implement a recurrent neural network (RNN) model that uses CSI measurements collected in an indoor office environment to identify the NLOS condition. [10] studies the problem of ultra-wideband based wireless ranging, and uses a support vector machine, a random forest classifier, and a multi-layer perceptron to solve a three-class classification problem with the classes LOS, NLOS, and multipath. The authors in [11] propose a feature-based Gaussian distribution method and a generalized Gaussian distribution method for NLOS detection under the constraint of an imbalanced dataset (with very few examples from the NLOS class). The authors in [12] propose a novel algorithm for LOS/NLOS classification based on a multi-layer perceptron that uses both manually extracted features and features obtained from a convolutional neural network (CNN) operating on raw CIR inputs. Last but not least, the authors in [13] study the problem of localization in a millimeter-wave wireless communication system, and train and test a two-stage unsupervised ML model on CSI data in order to classify LOS/NLOS.
On the prototyping front, a handful of works report experimental results on indoor localization [14], [15], [16]. For example, the authors in [14] use a Bluetooth low energy (BLE) module to perform indoor localization via different approaches, i.e., trilateration, dead reckoning, and a fusion method. Further, an experimental study that investigates the relation between accuracy and energy consumption in a WiFi fingerprinting-based indoor localization system is presented in [15]. Finally, ML-assisted indoor localization is discussed in [16], where support vector regression (SVR) is applied to CFR measurements obtained via a BLE module in order to accomplish indoor localization in a multipath environment.
Contributions. This is an experimental study in which we carry out an extensive data collection campaign with a pair of NI 2901 USRP software-defined radios in order to collect pathloss measurements in the 5G FR1 band in an indoor setting. We first apply a least-squares method to the pathloss measurements in order to parameterize the pathloss model, which is later used to construct a Neyman-Pearson-based binary hypothesis test. Further, noting that the pathloss measurements slightly deviate from the Gaussian distribution, we apply the following four machine learning algorithms to the collected experimental data: linear support vector machine (SVM) and radial basis function SVM (RBF-SVM), linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), and logistic regression (LR). It turns out that the performance of the best-performing ML algorithm (i.e., RBF-SVM) is only slightly superior to that of its counterpart from statistical decision theory, i.e., the binary hypothesis test.
Outline. The rest of this paper is organized as follows. Section II describes the experimental setup and the data collection process. Section III presents the two proposed methods for NLOS identification in detail. Section IV provides some selected results. Section V concludes the paper.

As can be seen in Fig. 1, the receiver was placed at P = 15 different positions on a linear grid with an inter-position spacing of 60 cm. The minimum transmit-receive spacing is 125 cm, as per the 10-wavelength requirement for the receiver to be in the far field of the transmitter, while the maximum transmit-receive spacing is 900 cm. Directional (horn) antennas with a maximum gain of 20 dB each were used at both ends (this helped reduce the impact of multipath for the LOS measurements). The center frequency $f_c$ was set to 2.4 GHz (i.e., the ISM band), while the sampling rate of both the transmit and the receive SDR was set to 200k samples/s. For both LOS and NLOS scenarios, measurements were taken for three different signal-to-noise ratio (SNR) conditions by setting the normalized amplitude $A_t$ of the transmitted signal to 0.4, 0.5, and 0.6. The transmit node sent an unmodulated tone of frequency 10 kHz. The channel was considered to be time-slotted with a slot length of 10 ms (large enough so that all the multipath components could be lumped together in one slot). The received signal directly provided the instantaneous RSS measurements. Subsequently, the instantaneous RSS samples within a time slot were averaged to get a more stable and reliable RSS estimate. Averaging also helped remove the small-scale fading, which occurs on a relatively fast time scale. The averaged RSS measurements were then translated into pathloss measurements using the Friis equation, assuming that the antenna gains on both ends as well as the transmit power are known. That is, pathloss $= P_t / P_r$, where $P_t$ is the known transmit signal power, while $P_r = (\mathrm{RSS})^2$ is the received signal power. The pathloss measurements were then used to construct a least-squares (LS) problem where the pathloss exponent $\alpha$ was computed for both LOS and NLOS scenarios, for each of the three link conditions. A total of N = 5000 measurements were obtained for each of the 15 receiver positions for both LOS and NLOS scenarios (in order to construct a balanced dataset), for three different SNR values.
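The slot averaging and the RSS-to-pathloss conversion described above can be sketched as follows. This is a minimal illustration, not the authors' processing chain: the function names, the unit transmit power, and the constant RSS trace are hypothetical, and the pathloss is expressed in dB as $10\log_{10}(P_t/P_r)$.

```python
import math

SAMPLE_RATE_HZ = 200_000   # sampling rate stated in the setup
SLOT_MS = 10               # slot length stated in the setup
SAMPLES_PER_SLOT = SAMPLE_RATE_HZ * SLOT_MS // 1000   # 2000 samples per slot


def slot_average(rss_samples, samples_per_slot):
    """Average instantaneous RSS samples slot by slot.

    An incomplete trailing slot is dropped (an assumption of this sketch).
    """
    n_slots = len(rss_samples) // samples_per_slot
    return [
        sum(rss_samples[k * samples_per_slot:(k + 1) * samples_per_slot])
        / samples_per_slot
        for k in range(n_slots)
    ]


def pathloss_db(avg_rss, tx_power):
    """Pathloss P_t / P_r in dB, with P_r = RSS**2 as in the text."""
    rx_power = avg_rss ** 2
    return 10.0 * math.log10(tx_power / rx_power)


# Hypothetical trace: a constant RSS of 0.01 over two and a quarter slots.
rss_trace = [0.01] * 4500
slot_rss = slot_average(rss_trace, SAMPLES_PER_SLOT)
slot_pathloss = [pathloss_db(r, tx_power=1.0) for r in slot_rss]
```

With these illustrative numbers each slot yields a pathloss of about 40 dB; real traces would of course vary from slot to slot.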
Feasibility of pathloss as core feature for NLOS identification. Fig. 2 plots the pathloss measurements that we obtained by moving the SDR receiver on a 1D grid during our data collection campaign, for both LOS and NLOS scenarios. Fig. 2 confirms that the pathloss (exponent $\alpha$) is higher for the NLOS scenario than for the LOS scenario (as is well known in the literature).
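One way to check the ordering of the two exponents numerically is to fit the log-distance model $PL_{\mathrm{dB}} = A + \alpha \, 10\log_{10}(4\pi d/\lambda)$ (the form used in Section III) to each scenario with ordinary least squares. The sketch below uses NumPy on synthetic data; the receiver grid, the noise level, and the "true" parameter values are illustrative assumptions, not measured values from this study.

```python
import numpy as np

SPEED_OF_LIGHT = 3.0e8                           # m/s
CENTER_FREQ_HZ = 2.4e9                           # center frequency from the setup
WAVELENGTH_M = SPEED_OF_LIGHT / CENTER_FREQ_HZ   # 0.125 m


def log_distance_regressor(distances_m):
    """Return 10*log10(B(d)) with B(d) = 4*pi*d / lambda."""
    d = np.asarray(distances_m, dtype=float)
    return 10.0 * np.log10(4.0 * np.pi * d / WAVELENGTH_M)


def fit_pathloss_params(distances_m, pathloss_db):
    """Least-squares estimate of (A, alpha) in PL_dB = A + alpha * 10*log10(B(d))."""
    x = log_distance_regressor(distances_m)
    X = np.column_stack([np.ones_like(x), x])    # one row [1, x_i] per measurement
    y = np.asarray(pathloss_db, dtype=float)
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(theta[0]), float(theta[1])


# Synthetic data: illustrative grid and parameters (NOT the paper's values).
rng = np.random.default_rng(0)
positions = np.linspace(1.25, 9.0, 15)           # 15 receiver positions, in meters
d = np.repeat(positions, 200)                    # 200 noisy samples per position


def synthesize(A, alpha, sigma_db=1.0):
    """Noisy pathloss samples drawn from the log-distance model."""
    return A + alpha * log_distance_regressor(d) + rng.normal(0.0, sigma_db, d.size)


A_los, alpha_los = fit_pathloss_params(d, synthesize(A=-40.0, alpha=2.0))
A_nlos, alpha_nlos = fit_pathloss_params(d, synthesize(A=-40.0, alpha=2.6))
```

Running the same fit on measured LOS and NLOS data would be expected to give a larger exponent for NLOS, in line with the trend reported for Fig. 2.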

III. THE PROPOSED METHODS
We first describe our binary hypothesis test for NLOS identification in detail. We then discuss the essentials of the four machine learning classifiers that we have implemented for NLOS identification.

A. NLOS Identification via Binary Hypothesis Testing
The binary hypothesis testing method for NLOS identification requires pathloss measurements conditioned on the two hypotheses, i.e., LOS and NLOS. Therefore, we first present a least-squares method for the estimation of the pathloss parameters. We then design a binary hypothesis test for NLOS identification and compute the two error probabilities (i.e., the false alarm rate and the missed-detection rate).
1) Least-Squares Estimation of the Pathloss Parameters: The Friis equation is:

$$P_r = P_t G_t G_r \left(\frac{\lambda}{4\pi d}\right)^{\alpha},$$

where $P_r$ is the received power, $P_t$ is the transmit power, $G_t$ is the transmit antenna gain, $G_r$ is the receive antenna gain, $\lambda = c/f_c$ is the wavelength, $c$ is the speed of light, $f_c$ is the center frequency, $d$ is the separation between the transmitting node and the receiving node, and $\alpha$ is the pathloss exponent. Re-arranging the Friis equation, we obtain the following distance-dependent pathloss model:

$$PL = \frac{P_t}{P_r} = \frac{1}{G_t G_r}\left(\frac{4\pi d}{\lambda}\right)^{\alpha}.$$

Equivalently, in dB scale, we have:

$$PL_{\mathrm{dB}} = A + \alpha \, 10\log_{10}\big(B(d)\big),$$

where $A = -10\log_{10}(G_t G_r)$ and $B(d) = \frac{4\pi d}{\lambda}$. As mentioned earlier, in this work, we collect noisy measurements of the instantaneous RSS, square them to obtain instantaneous $P_r$ measurements, and translate these into pathloss measurements by multiplying $1/P_r$ with the (known) $P_t$. Then, the least-squares (LS) estimate of $A$ and $\alpha$ is: $\Theta = (X^T X)^{-1} X^T y$, where $y \in \mathbb{R}$ … Table I summarizes the vector of unknowns $\Theta$ estimated via the LS method for both LOS and NLOS scenarios, for three different SNRs.

2) Binary Hypothesis Test: With the pathloss parameters in hand, we have the following binary hypothesis test (BHT) for NLOS identification (assuming Gaussian measurement noise):

$$H_0 \ (\text{LOS}): \ z = A_{LOS} + \alpha_{LOS} \, 10\log_{10} B + n,$$
$$H_1 \ (\text{NLOS}): \ z = A_{NLOS} + \alpha_{NLOS} \, 10\log_{10} B + n,$$

where $z$ is the pathloss measurement and $n \sim \mathcal{N}(0, \sigma^2)$ is the measurement error. Let $m_0 = A_{LOS} + \alpha_{LOS} \, 10\log_{10} B$ and $m_1 = A_{NLOS} + \alpha_{NLOS} \, 10\log_{10} B$. Then $z|H_0 \sim \mathcal{N}(m_0, \sigma^2)$ and $z|H_1 \sim \mathcal{N}(m_1, \sigma^2)$. This translates to the following log-likelihood ratio test (LLRT):

$$\ln \frac{p(z|H_1)}{p(z|H_0)} \underset{H_0}{\overset{H_1}{\gtrless}} \eta,$$

where $\eta = \ln(\pi(0)/\pi(1))$.
Then, the probability of false alarm is given as: $P_{FA}$ …
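For two Gaussian hypotheses with equal variance and $m_1 > m_0$, the LLRT reduces to comparing $z$ with the threshold $\tau = \frac{m_0 + m_1}{2} + \frac{\sigma^2 \eta}{m_1 - m_0}$, which gives $P_{FA} = Q\big((\tau - m_0)/\sigma\big)$ and $P_{MD} = Q\big((m_1 - \tau)/\sigma\big)$. The sketch below evaluates this standard result. It assumes that $\pi(0)$ and $\pi(1)$ denote the prior probabilities of LOS and NLOS, and the numerical values of $m_0$, $m_1$ and $\sigma$ are hypothetical rather than taken from Table I. Since $m_0$ and $m_1$ depend on the distance through $B$, the threshold would be recomputed for each receiver position.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x) = P(N(0, 1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def llrt_threshold(m0, m1, sigma, prior_los=0.5, prior_nlos=0.5):
    """Threshold on z equivalent to the LLRT when m1 > m0 and variances are equal."""
    if m1 <= m0:
        raise ValueError("this sketch assumes m1 > m0 (NLOS mean above LOS mean)")
    eta = math.log(prior_los / prior_nlos)
    return 0.5 * (m0 + m1) + sigma ** 2 * eta / (m1 - m0)


def decide_nlos(z, threshold):
    """Decide H1 (NLOS) when the pathloss measurement exceeds the threshold."""
    return z > threshold


def error_probabilities(m0, m1, sigma, threshold):
    """Return (P_FA, P_MD) for the threshold test under the Gaussian model."""
    p_fa = q_function((threshold - m0) / sigma)   # decide NLOS although LOS
    p_md = q_function((m1 - threshold) / sigma)   # decide LOS although NLOS
    return p_fa, p_md


# Hypothetical means and noise level, in dB.
m0, m1, sigma = 60.0, 64.0, 2.0
tau = llrt_threshold(m0, m1, sigma)               # midpoint for equal priors
p_fa, p_md = error_probabilities(m0, m1, sigma, tau)
```

With equal priors the threshold sits midway between the two means, so both error probabilities equal $Q(1) \approx 0.159$ for these numbers.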

B. NLOS Identification via Machine Learning Classifiers
We note that the pathloss measurements collected in a real-time setup via an SDR pair slightly deviate from the Gaussian distribution (see Fig. 3). Therefore, the binary hypothesis test defined above, which assumes a Gaussian distribution for the measurement error, may not work very well. However, it is well known that machine learning algorithms can cope with this situation (model mismatch) by learning the distribution from the training data. Therefore, we implement the following machine learning algorithms in Python: linear support vector machine (SVM) and radial basis function SVM, linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), and logistic regression (LR). We train and test the four ML classifiers on our custom dataset with a train-validation-test split of 70-15-15 (%).
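A training loop of this kind could be sketched with scikit-learn as follows. The text does not say which Python library, features, or hyperparameters were used, so everything here is an assumption: a single pathloss feature in dB, synthetic Gaussian data in place of the measured dataset, default hyperparameters, and feature standardization added for numerical convenience.

```python
import numpy as np
from sklearn.discriminant_analysis import (
    LinearDiscriminantAnalysis,
    QuadraticDiscriminantAnalysis,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the dataset: pathloss in dB, label 0 = LOS, 1 = NLOS.
rng = np.random.default_rng(0)
n_per_class = 2000
X = np.concatenate([
    rng.normal(60.0, 2.0, n_per_class),   # hypothetical LOS pathloss
    rng.normal(65.0, 2.0, n_per_class),   # hypothetical NLOS pathloss
]).reshape(-1, 1)
y = np.concatenate([np.zeros(n_per_class, dtype=int), np.ones(n_per_class, dtype=int)])

# 70-15-15 split: hold out 30%, then halve it into validation and test sets.
X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.50, stratify=y_tmp, random_state=0)

classifiers = {
    "Linear SVM": SVC(kernel="linear"),
    "RBF-SVM": SVC(kernel="rbf"),
    "LDA": LinearDiscriminantAnalysis(),
    "QDA": QuadraticDiscriminantAnalysis(),
    "LR": LogisticRegression(),
}

val_accuracy, test_accuracy = {}, {}
for name, clf in classifiers.items():
    model = make_pipeline(StandardScaler(), clf)
    model.fit(X_train, y_train)
    val_accuracy[name] = model.score(X_val, y_val)     # used for model selection
    test_accuracy[name] = model.score(X_test, y_test)  # reported once at the end

best_on_validation = max(val_accuracy, key=val_accuracy.get)
```

The validation split is used only to compare models, and the test split is touched once per classifier, which keeps the reported accuracy free of selection bias.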

IV. RESULTS
Receiver operating characteristic (ROC) curves are a popular tool for evaluating the performance of ML and statistical classifiers. An ROC curve plots the correct decision rate against the error rate, i.e., the true positive rate (deciding NLOS correctly) vs. the false alarm/positive rate (deciding NLOS while the link was LOS). Fig. 4 shows the ROC curves for all four ML classifiers as well as the BHT (a statistical classifier), for three different link conditions. We make the following observations.
1) At low false alarm rates, the BHT performs the best among all the classifiers. Beyond a certain false positive rate, however, there is a crossover, and the ML classifiers outperform the BHT.
2) To our surprise, an increase in SNR does not lead to a monotonic increase in the accuracy of all the proposed NLOS identification methods. This is probably due to residual effects of multipath, small-scale fading, and additive noise, and calls for more measurements for each receiver position as well as longer time-slot intervals, so that increased averaging yields more stable pathloss measurements.
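An empirical ROC curve for a threshold test on pathloss can be traced by sweeping the decision threshold and recording the two rates at each setting. The following NumPy sketch illustrates the construction on synthetic scores; the Gaussian score distributions and the threshold grid are assumptions for illustration and do not reproduce Fig. 4.

```python
import numpy as np


def roc_points(scores, labels, thresholds):
    """Return (false-alarm rate, true-positive rate) for each threshold.

    labels: 1 = NLOS (positive class), 0 = LOS.
    NLOS is decided whenever score > threshold.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    is_nlos = labels == 1
    is_los = labels == 0
    fpr, tpr = [], []
    for t in thresholds:
        decided_nlos = scores > t
        tpr.append(np.mean(decided_nlos[is_nlos]))   # NLOS decided correctly
        fpr.append(np.mean(decided_nlos[is_los]))    # NLOS decided on a LOS link
    return np.array(fpr), np.array(tpr)


# Synthetic pathloss scores in dB (hypothetical distributions).
rng = np.random.default_rng(1)
scores = np.concatenate([rng.normal(60.0, 2.0, 5000), rng.normal(65.0, 2.0, 5000)])
labels = np.concatenate([np.zeros(5000, dtype=int), np.ones(5000, dtype=int)])

# Sweep from high to low so the curve runs from (0, 0) to (1, 1).
thresholds = np.linspace(scores.max() + 1.0, scores.min() - 1.0, 200)
fpr, tpr = roc_points(scores, labels, thresholds)

# Area under the curve via the trapezoidal rule.
auc = float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))
```

Plotting `tpr` against `fpr` for each classifier on the same axes makes crossovers such as the one noted in observation 1) directly visible.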
Table II evaluates the NLOS identification performance of the BHT and the four ML classifiers based on the following three performance metrics: (a) probability of false alarm (PFA), (b) probability of missed detection (PMD), and (c) accuracy, where accuracy = (1 − (PMD + PFA)) × 100. We make the following observations.
1) The performance of the best-performing ML algorithm (i.e., the RBF-SVM classifier) is only slightly superior to that of the Neyman-Pearson-based BHT. That is, the RBF-SVM classifier and the BHT achieve a maximum accuracy of 88.24% and 87.46% for low SNR, 83.91% and 81.21% for medium SNR, and 87.38% and 86.65% for high SNR, respectively.
2) Some ML classifiers (e.g., QDA) perform worst for one error type (i.e., false alarm rate) but best for the other error type (i.e., missed detection rate), and vice versa (e.g., LR). However, the BHT and RBF-SVM are efficient in the sense that they keep both error types low simultaneously.
3) Again, to our surprise, an increase in SNR does not lead to a monotonic increase in the accuracy of all the proposed NLOS identification methods (due to insufficient averaging while obtaining the pathloss measurements).
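The three metrics can be computed from true and predicted labels as sketched below. One assumption is made explicit here: PFA and PMD are counted as fractions of all test samples, which is the normalization under which accuracy = (1 − (PMD + PFA)) × 100 holds exactly; if the two rates were instead normalized per class on a balanced dataset, their sum would have to be halved. The function name and the toy labels are hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Return (P_FA, P_MD, accuracy in %) with 1 = NLOS and 0 = LOS.

    P_FA and P_MD are fractions of ALL samples (an assumption of this
    sketch), so that accuracy = (1 - (P_MD + P_FA)) * 100.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(y_true)
    false_alarms = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    misses = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    p_fa = false_alarms / n
    p_md = misses / n
    accuracy = (1.0 - (p_md + p_fa)) * 100.0
    return p_fa, p_md, accuracy


# Toy example: 5 LOS and 5 NLOS samples, one false alarm and two misses.
y_true = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 1, 1, 1, 1, 0, 0]
p_fa, p_md, accuracy = classification_metrics(y_true, y_pred)
```

In the toy example seven of the ten decisions are correct, so the formula and a direct count of correct decisions both give an accuracy of about 70%.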

V. CONCLUSION
This paper conducted an experimental study on the problem of LOS/NLOS classification in an indoor environment. We used a pair of NI 2901 USRP SDRs in a large hall (with the receive SDR moving on a 1D grid) in order to construct a dataset of pathloss measurements (for both LOS and NLOS scenarios). We used our custom dataset to estimate the pathloss parameters (i.e., the pathloss exponent) with the least-squares method, and then used the parameterized pathloss model to construct a binary hypothesis test for NLOS identification. Further, noting that the pathloss measurements slightly deviate from the Gaussian distribution, we passed our custom dataset to four ML algorithms, i.e., linear and radial basis function SVM, LDA, QDA, and LR. We observed that the best-performing ML algorithm (i.e., RBF-SVM) marginally outperformed the Neyman-Pearson-based binary hypothesis test.
As for future work, we note that ML-based techniques are environment-specific, i.e., if the environment changes, the ML algorithms need to be trained again. One promising future direction is therefore to design reinforcement/online learning methods for NLOS identification.
II. EXPERIMENTAL SETUP & DATA COLLECTION
We performed our data collection experiments in the 5G FR1 band by deploying a pair of NI 2901 USRP software-defined radios (SDR) in one of the research labs at the Information Technology University (ITU), Lahore, Pakistan. The detailed layout of the room where we conducted our experiments is shown in Fig. 1.

Fig. 1. The experimental setup (not to scale). The receiver (blue circle) is placed at 15 different positions on a 1D grid. The transmitter is either in LOS of the receiver (green triangle) or in NLOS condition (red triangle).

Fig. 2. Pathloss measurements obtained via our experimental setup consisting of two NI 2901 USRP SDRs when the receive SDR moves across 15 equispaced positions on a 1D grid.

Fig. 3. The pathloss histogram does not fit a normal distribution well (for both LOS and NLOS scenarios).

TABLE I
PATHLOSS PARAMETERS ESTIMATED VIA LEAST-SQUARES METHOD

TABLE II
PERFORMANCE OF THE BHT AND THE ML CLASSIFIERS