Signal Processing
- [1] arXiv:2405.19336 [pdf, ps, other]
Title: Image-based retrieval of all-day cloud physical parameters for FY4A/AGRI and its application over the Tibetan Plateau
Authors: Zhijun Zhao (1, 2), Feng Zhang (1, 2), Wenwen Li (1), Jingwei Li (1, 2) ((1) CMA-FDU Joint Laboratory of Marine Meteorology, Department of Atmospheric and Oceanic Sciences, Institutes of Atmospheric Sciences, Fudan University, China, (2) Key Laboratory for Information Science of Electromagnetic Waves, Ministry of Education, School of Information Science and Technology, Fudan University, China)
Subjects: Signal Processing (eess.SP)
Satellite remote sensing serves as a crucial means to acquire cloud physical parameters. However, existing official cloud products derived from the Advanced Geostationary Radiation Imager (AGRI) onboard the Fengyun-4A geostationary satellite suffer from limitations in computational precision and efficiency. In this study, an image-based transfer learning model (ITLM) was developed to realize all-day and high-precision retrieval of cloud physical parameters using AGRI thermal infrared measurements and auxiliary data. Combining the observation advantages of geostationary and polar-orbiting satellites, ITLM was pre-trained and transfer-trained with official cloud products from the Advanced Himawari Imager (AHI) and the Moderate Resolution Imaging Spectroradiometer (MODIS), respectively. Taking official MODIS products as the benchmarks, ITLM achieved an overall accuracy of 79.93% for identifying cloud phase and root mean squared errors of 1.85 km, 6.72 μm, and 12.79 for estimating cloud top height, cloud effective radius, and cloud optical thickness, respectively, outperforming the precision of official AGRI and AHI products. Compared to the pixel-based random forest model, ITLM utilized the spatial information of clouds to significantly improve the retrieval performance and achieve more than a 6-fold increase in speed for a single full-disk retrieval. Moreover, the AGRI ITLM products with spatiotemporal continuity and high precision were used to accurately describe the spatial distribution characteristics of cloud fractions and cloud properties over the Tibetan Plateau (TP) during both daytime and nighttime and, for the first time, to provide insights into the diurnal variation of cloud cover and cloud properties for total clouds and deep convective clouds across different seasons.
- [2] arXiv:2405.19338 [pdf, ps, other]
Title: Accurate Patient Alignment without Unnecessary Imaging Dose via Synthesizing Patient-specific 3D CT Images from 2D kV Images
Authors: Yuzhen Ding, Jason M. Holmes, Hongying Feng, Baoxin Li, Lisa A. McGee, Jean-Claude M. Rwigema, Sujay A. Vora, Daniel J. Ma, Robert L. Foote, Samir H. Patel, Wei Liu
Comments: 17 pages, 8 figures and tables
Subjects: Signal Processing (eess.SP); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
In radiotherapy, 2D orthogonally projected kV images are used for patient alignment when 3D on-board imaging (OBI) is unavailable. However, tumor visibility is constrained by the projection of the patient's anatomy onto a 2D plane, potentially leading to substantial setup errors. In treatment rooms with 3D-OBI such as cone beam CT (CBCT), the field of view (FOV) of CBCT is limited and the imaging dose is unnecessarily high, which is unfavorable for pediatric patients. A solution to this dilemma is to reconstruct 3D CT from kV images obtained at the treatment position. Here, we propose a dual-model framework built with hierarchical ViT blocks. Unlike a proof-of-concept approach, our framework takes kV images as the sole input and can synthesize accurate, full-size 3D CT in real time (within milliseconds). We demonstrate the feasibility of the proposed approach on 10 patients with head and neck (H&N) cancer using image quality (MAE: <45 HU), dosimetric accuracy (gamma passing rate (2%/2 mm/10%) >97%), and patient position uncertainty (shift error: <0.4 mm). The proposed framework can generate accurate 3D CT that faithfully mirrors the real-time patient position, thus significantly improving patient setup accuracy, keeping imaging dose to a minimum, and maintaining treatment veracity.
- [3] arXiv:2405.19340 [pdf, ps, other]
Title: Obtaining physical layer data of latest generation networks for investigating adversary attacks
Subjects: Signal Processing (eess.SP); Machine Learning (cs.LG); Networking and Internet Architecture (cs.NI)
The field of machine learning is developing rapidly and is being used in various fields of science and technology. In this way, machine learning can be used to optimize the functions of latest generation data networks such as 5G and 6G. This also applies to functions at a lower level. A feature of the use of machine learning in the radio path for targeted radiation generation in modern ultra-massive MIMO, reconfigurable intelligent interfaces and other technologies is the complex acquisition and processing of data from the physical layer. Additionally, adversarial measures that manipulate the behaviour of intelligent machine learning models are becoming a major concern, as many machine learning models are sensitive to incorrect input data. To obtain data on attacks directly from processing service information, a simulation model is proposed that works in conjunction with machine learning applications.
- [4] arXiv:2405.19341 [pdf, ps, html, other]
Title: Spatial Impulse Response Analysis and Ensemble Learning for Efficient Precision Level Sensing
Authors: Berkay Cetkin, Lejla Begic Fazlic, Kristof Ueding, Rüdiger Machhamer, Achim Guldner, Lars Creutz, Stefan Naumann, Guido Dartmann
Subjects: Signal Processing (eess.SP)
In this paper, we propose an innovative method for determining the fill level of containers, such as trash cans, addressing a critical aspect of waste management. The method combines spatial impulse response analysis with machine learning techniques, offering a unique and effective approach for sound-based classification that can be extended to various domains beyond waste management. By employing a buzzer-generated sine sweep signal, we create a distinctive signature specific to the fill level of the waste container. This signature is then interpreted by a specially developed ensemble learning algorithm. Our approach achieves a classification accuracy of over 90% when implemented locally on a development board, eliminating the need to delegate complex classification tasks to external entities. Using low-cost and energy-efficient hardware components, our method offers a cost-effective approach that contributes to sustainable and efficient waste management practices, providing a reliable and locally deployable solution.
- [5] arXiv:2405.19345 [pdf, ps, html, other]
Title: Review of Deep Representation Learning Techniques for Brain-Computer Interfaces and Recommendations
Comments: Submitted to: Journal of Neural Engineering (JNE)
Subjects: Signal Processing (eess.SP); Machine Learning (cs.LG)
In the field of brain-computer interfaces (BCIs), the potential for leveraging deep learning techniques for representing electroencephalogram (EEG) signals has gained substantial interest. This review synthesizes empirical findings from a collection of articles using deep representation learning techniques for BCI decoding, to provide a comprehensive analysis of the current state-of-the-art. Each article was scrutinized based on three criteria: (1) the deep representation learning technique employed, (2) the underlying motivation for its utilization, and (3) the approaches adopted for characterizing the learned representations. Among the 81 articles finally reviewed in depth, our analysis reveals a predominance of 31 articles using autoencoders. We identified 13 studies employing self-supervised learning (SSL) techniques, among which ten were published in 2022 or later, attesting to the relative youth of the field. However, at the time being, none of these have led to standard foundation models that are picked up by the BCI community. Likewise, only a few studies have introspected their learned representations. We observed that the motivation in most studies for using representation learning techniques is for solving transfer learning tasks, but we also found more specific motivations such as to learn robustness or invariances, as an algorithmic bridge, or finally to uncover the structure of the data. Given the potential of foundation models to effectively tackle these challenges, we advocate for a continued dedication to the advancement of foundation models specifically designed for EEG signal decoding by using SSL techniques. We also underline the imperative of establishing specialized benchmarks and datasets to facilitate the development and continuous improvement of such foundation models.
- [6] arXiv:2405.19346 [pdf, ps, html, other]
Title: Subject-Adaptive Transfer Learning Using Resting State EEG Signals for Cross-Subject EEG Motor Imagery Classification
Comments: Early Accepted at MICCAI 2024
Subjects: Signal Processing (eess.SP); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Electroencephalography (EEG) motor imagery (MI) classification is a fundamental, yet challenging task due to the variation of signals between individuals i.e., inter-subject variability. Previous approaches try to mitigate this using task-specific (TS) EEG signals from the target subject in training. However, recording TS EEG signals requires time and limits its applicability in various fields. In contrast, resting state (RS) EEG signals are a viable alternative due to ease of acquisition with rich subject information. In this paper, we propose a novel subject-adaptive transfer learning strategy that utilizes RS EEG signals to adapt models on unseen subject data. Specifically, we disentangle extracted features into task- and subject-dependent features and use them to calibrate RS EEG signals for obtaining task information while preserving subject characteristics. The calibrated signals are then used to adapt the model to the target subject, enabling the model to simulate processing TS EEG signals of the target subject. The proposed method achieves state-of-the-art accuracy on three public benchmarks, demonstrating the effectiveness of our method in cross-subject EEG MI classification. Our findings highlight the potential of leveraging RS EEG signals to advance practical brain-computer interface systems.
- [7] arXiv:2405.19347 [pdf, ps, html, other]
Title: Near-Field Spot Beamfocusing: A Correlation-Aware Transfer Learning Approach
Subjects: Signal Processing (eess.SP); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
3D spot beamfocusing (SBF), in contrast to conventional angular-domain beamforming, concentrates radiating power within a very small volume in both radial and angular domains in the near-field zone. Recently, channel-state-information (CSI)-independent machine learning (ML)-based approaches have been developed for effective SBF using extremely large-scale programmable metasurfaces (ELPMs). These methods involve dividing the ELPMs into subarrays and independently training them with Deep Reinforcement Learning to jointly focus the beam at the Desired Focal Point (DFP). This paper explores near-field SBF using ELPMs, addressing challenges associated with lengthy training times resulting from independent training of subarrays. To achieve a faster CSI-independent solution, inspired by the correlation between the beamfocusing matrices of the subarrays, we leverage transfer learning techniques. First, we introduce a novel similarity criterion based on the Phase Distribution Image of subarray apertures. Then we devise a subarray policy propagation scheme that transfers the knowledge from trained to untrained subarrays. We further enhance learning by introducing Quasi-Liquid-Layers as a revised version of the adaptive policy reuse technique. We show through simulations that the proposed scheme improves the training speed by about 5 times. Furthermore, for dynamic DFP management, we devise a DFP policy blending process, which improves the convergence rate by up to 8-fold.
- [8] arXiv:2405.19348 [pdf, ps, other]
Title: NERULA: A Dual-Pathway Self-Supervised Learning Framework for Electrocardiogram Signal Analysis
Comments: Paper in review
Subjects: Signal Processing (eess.SP); Machine Learning (cs.LG)
Electrocardiogram (ECG) signals are critical for diagnosing heart conditions and capturing detailed cardiac patterns. As wearable single-lead ECG devices become more common, efficient analysis methods are essential. We present NERULA (Non-contrastive ECG and Reconstruction Unsupervised Learning Algorithm), a self-supervised framework designed for single-lead ECG signals. NERULA's dual-pathway architecture combines ECG reconstruction and non-contrastive learning to extract detailed cardiac features. Our 50% masking strategy, using both masked and inverse-masked signals, enhances model robustness against real-world incomplete or corrupted data. The non-contrastive pathway aligns representations of masked and inverse-masked signals, while the reconstruction pathway comprehends and reconstructs missing features. We show that combining generative and discriminative paths into the training spectrum leads to better results by outperforming state-of-the-art self-supervised learning benchmarks in various tasks, demonstrating superior performance in ECG analysis, including arrhythmia classification, gender classification, age regression, and human activity recognition. NERULA's dual-pathway design offers a robust, efficient solution for comprehensive ECG signal interpretation.
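The masked and inverse-masked inputs mentioned in this abstract can be illustrated with a minimal sketch. The abstract states only that 50% of the signal is masked and that both the masked and the inverse-masked signals are used. The patch-wise masking, the zero fill value, and the names `complementary_masks` and `patch_len` below are assumptions made for illustration and may differ from the authors' method.

```python
import numpy as np

def complementary_masks(signal, patch_len=50, mask_ratio=0.5, seed=0):
    """Split a 1-D signal into a masked view and its complementary view.

    Assumptions (not from the abstract): masking is done per fixed-length
    patch, and masked samples are replaced with zeros.
    """
    rng = np.random.default_rng(seed)
    signal = np.asarray(signal, dtype=float)

    # Drop the tail that does not fill a whole patch.
    n_patches = len(signal) // patch_len
    usable = signal[: n_patches * patch_len]

    # Choose which patches to hide in the "masked" view.
    n_masked = int(round(mask_ratio * n_patches))
    hidden = rng.choice(n_patches, size=n_masked, replace=False)
    patch_mask = np.zeros(n_patches, dtype=bool)
    patch_mask[hidden] = True

    # Expand the patch-level mask to sample level.
    mask = np.repeat(patch_mask, patch_len)

    masked = np.where(mask, 0.0, usable)          # hidden patches zeroed
    inverse_masked = np.where(mask, usable, 0.0)  # only hidden patches kept
    return masked, inverse_masked, mask
```

Because the two views are complementary, their sum recovers the original samples, so each view carries exactly the information the other one lacks.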
- [9] arXiv:2405.19349 [pdf, ps, other]
Title: Beyond Isolated Frames: Enhancing Sensor-Based Human Activity Recognition through Intra- and Inter-Frame Attention
Subjects: Signal Processing (eess.SP); Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG)
Human Activity Recognition (HAR) has become increasingly popular with ubiquitous computing, driven by the popularity of wearable sensors in fields like healthcare and sports. While Convolutional Neural Networks (ConvNets) have significantly contributed to HAR, they often adopt a frame-by-frame analysis, concentrating on individual frames and potentially overlooking the broader temporal dynamics inherent in human activities. To address this, we propose the intra- and inter-frame attention model. This model captures both the nuances within individual frames and the broader contextual relationships across multiple frames, offering a comprehensive perspective on sequential data. We further enrich the temporal understanding by proposing a novel time-sequential batch learning strategy. This learning strategy preserves the chronological sequence of time-series data within each batch, ensuring the continuity and integrity of temporal patterns in sensor-based HAR.
- [10] arXiv:2405.19351 [pdf, ps, other]
Title: Resonate-and-Fire Spiking Neurons for Target Detection and Hand Gesture Recognition: A Hybrid Approach
Subjects: Signal Processing (eess.SP); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
Hand gesture recognition using radar often relies on computationally expensive fast Fourier transforms. This paper proposes an alternative approach that bypasses fast Fourier transforms using resonate-and-fire neurons. These neurons directly detect the hand in the time-domain signal, eliminating the need for fast Fourier transforms to retrieve range information. Following detection, a simple Goertzel algorithm is employed to extract five key features, eliminating the need for a second fast Fourier transform. These features are then fed into a recurrent neural network, achieving an accuracy of 98.21% for classifying five gestures. The proposed approach demonstrates competitive performance with reduced complexity compared to traditional methods.
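For readers unfamiliar with the Goertzel algorithm named in this abstract, the following is a minimal sketch of the textbook form, which evaluates the power of a single DFT bin without computing a full FFT. It is a generic illustration only: the abstract does not specify which five features are extracted or how the algorithm is configured, and the function name `goertzel_power` is hypothetical.

```python
import math

def goertzel_power(samples, k):
    """Return |X[k]|^2, the squared magnitude of DFT bin k of `samples`.

    Uses the standard second-order Goertzel recurrence, which needs one
    multiplication per sample instead of a full FFT.
    """
    n = len(samples)
    w = 2.0 * math.pi * k / n
    coeff = 2.0 * math.cos(w)

    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s

    # Squared magnitude from the last two states of the recurrence.
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2
```

The cost is linear in the number of samples per bin, so the approach pays off when only a handful of bins are needed, which matches the small feature count described above.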
- [11] arXiv:2405.19356 [pdf, ps, html, other]
Title: An LSTM Feature Imitation Network for Hand Movement Recognition from sEMG Signals
Comments: This work has been submitted to RA-L, and under review
Subjects: Signal Processing (eess.SP); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Robotics (cs.RO)
Surface Electromyography (sEMG) is a non-invasive signal that is used in the recognition of hand movement patterns, the diagnosis of diseases, and the robust control of prostheses. Despite the remarkable success of recent end-to-end Deep Learning approaches, they are still limited by the need for large amounts of labeled data. To alleviate the requirement for big data, researchers utilize Feature Engineering, which involves decomposing the sEMG signal into several spatial, temporal, and frequency features. In this paper, we propose utilizing a feature-imitating network (FIN) for closed-form temporal feature learning over a 300ms signal window on Ninapro DB2, and applying it to the task of 17 hand movement recognition. We implement a lightweight LSTM-FIN network to imitate four standard temporal features (entropy, root mean square, variance, simple square integral). We then explore transfer learning capabilities by applying the pre-trained LSTM-FIN for tuning to a downstream hand movement recognition task. We observed that the LSTM network can achieve up to 99% R² accuracy in feature reconstruction and 80% accuracy in hand movement recognition. Our results also showed that the model can be robustly applied for both within- and cross-subject movement recognition, as well as simulated low-latency environments. Overall, our work demonstrates the potential of the FIN modeling paradigm in data-scarce scenarios for sEMG signal processing.
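The four closed-form temporal features named in this abstract can be sketched for a single signal window as follows. Root mean square, variance, and simple square integral have standard definitions. The entropy definition is not given in the abstract, so the histogram-based Shannon entropy and the `n_bins` parameter below are assumptions, as is the function name `temporal_features`.

```python
import numpy as np

def temporal_features(window, n_bins=16):
    """Compute four temporal features of a 1-D sEMG window.

    The entropy here is a histogram-based Shannon entropy in bits; this
    choice is an assumption, since the abstract does not define it.
    """
    x = np.asarray(window, dtype=float)

    rms = np.sqrt(np.mean(x ** 2))   # root mean square
    var = np.var(x)                  # population variance
    ssi = np.sum(x ** 2)             # simple square integral

    # Estimate the amplitude distribution, then take Shannon entropy.
    counts, _ = np.histogram(x, bins=n_bins)
    p = counts[counts > 0] / counts.sum()
    entropy = -np.sum(p * np.log2(p))

    return {
        "entropy": float(entropy),
        "rms": float(rms),
        "variance": float(var),
        "ssi": float(ssi),
    }
```

In a feature-imitating setup, values like these would serve as regression targets for the network, so that the network learns to approximate each closed-form feature from the raw window.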
- [12] arXiv:2405.19359 [pdf, ps, html, other]
Title: Modally Reduced Representation Learning of Multi-Lead ECG Signals through Simultaneous Alignment and Reconstruction
Comments: Accepted as a Workshop Paper at TS4H@ICLR2024
Journal-ref: ICLR 2024 Workshop on Learning from Time Series For Health
Subjects: Signal Processing (eess.SP); Machine Learning (cs.LG)
Electrocardiogram (ECG) signals, profiling the electrical activities of the heart, are used for a plethora of diagnostic applications. However, ECG systems require multiple leads or channels of signals to capture the complete view of the cardiac system, which limits their application in smartwatches and wearables. In this work, we propose a modally reduced representation learning method for ECG signals that is capable of generating channel-agnostic, unified representations for ECG signals. Through joint optimization of reconstruction and alignment, we ensure that the embeddings of the different channels contain an amalgamation of the overall information across channels while also retaining their specific information. On an independent test dataset, we generated highly correlated channel embeddings from different ECG channels, leading to a moderate approximation of the 12-lead signals from a single-channel embedding. Our generated embeddings can work as competent features for ECG signals for downstream tasks.
- [13] arXiv:2405.19363 [pdf, ps, html, other]
Title: Medformer: A Multi-Granularity Patching Transformer for Medical Time-Series Classification
Comments: 20 pages (14 pages main paper + 6 pages supplementary materials)
Subjects: Signal Processing (eess.SP); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Medical time series data, such as Electroencephalography (EEG) and Electrocardiography (ECG), play a crucial role in healthcare, such as diagnosing brain and heart diseases. Existing methods for medical time series classification primarily rely on handcrafted biomarkers extraction and CNN-based models, with limited exploration of transformers tailored for medical time series. In this paper, we introduce Medformer, a multi-granularity patching transformer tailored specifically for medical time series classification. Our method incorporates three novel mechanisms to leverage the unique characteristics of medical time series: cross-channel patching to leverage inter-channel correlations, multi-granularity embedding for capturing features at different scales, and two-stage (intra- and inter-granularity) multi-granularity self-attention for learning features and correlations within and among granularities. We conduct extensive experiments on five public datasets under both subject-dependent and challenging subject-independent setups. Results demonstrate Medformer's superiority over 10 baselines, achieving top averaged ranking across five datasets on all six evaluation metrics. These findings underscore the significant impact of our method on healthcare applications, such as diagnosing Myocardial Infarction, Alzheimer's, and Parkinson's disease. We release the source code at this https URL.
- [14] arXiv:2405.19366 [pdf, ps, html, other]
Title: ECG Semantic Integrator (ESI): A Foundation ECG Model Pretrained with LLM-Enhanced Cardiological Text
Subjects: Signal Processing (eess.SP); Artificial Intelligence (cs.AI)
The use of deep learning in electrocardiogram (ECG) analysis has advanced the accuracy and efficiency of cardiac healthcare diagnostics. By leveraging the capabilities of deep learning in semantic understanding, especially in feature extraction and representation learning, this study introduces a new multimodal contrastive pretraining framework that aims to improve the quality and robustness of learned representations of 12-lead ECG signals. Our framework comprises two key components: Cardio Query Assistant (CQA) and ECG Semantics Integrator (ESI). CQA integrates a retrieval-augmented generation (RAG) pipeline to leverage large language models (LLMs) and external medical knowledge to generate detailed textual descriptions of ECGs. The generated text is enriched with information about demographics and waveform patterns. ESI integrates both contrastive and captioning loss to pretrain ECG encoders for enhanced representations. We validate our approach through various downstream tasks, including arrhythmia detection and ECG-based subject identification. Our experimental results demonstrate substantial improvements over strong baselines in these tasks. These baselines encompass supervised and self-supervised learning methods, as well as prior multimodal pretraining approaches.
- [15] arXiv:2405.19373 [pdf, ps, html, other]
Title: Multi-modal Mood Reader: Pre-trained Model Empowers Cross-Subject Emotion Recognition
Comments: Accepted by International Conference on Neural Computing for Advanced Applications, 2024
Subjects: Signal Processing (eess.SP); Machine Learning (cs.LG)
Emotion recognition based on Electroencephalography (EEG) has gained significant attention and diversified development in fields such as neural signal processing and affective computing. However, the unique brain anatomy of individuals leads to non-negligible natural differences in EEG signals across subjects, posing challenges for cross-subject emotion recognition. While recent studies have attempted to address these issues, they still face limitations in practical effectiveness and model framework unity. Current methods often struggle to capture the complex spatial-temporal dynamics of EEG signals and fail to effectively integrate multimodal information, resulting in suboptimal performance and limited generalizability across subjects. To overcome these limitations, we develop a pre-trained model based Multimodal Mood Reader for cross-subject emotion recognition that utilizes masked brain signal modeling and an interlinked spatial-temporal attention mechanism. The model learns universal latent representations of EEG signals through pre-training on a large-scale dataset, and employs the interlinked spatial-temporal attention mechanism to process Differential Entropy (DE) features extracted from EEG data. Subsequently, a multi-level fusion layer is proposed to integrate the discriminative features, maximizing the advantages of features across different dimensions and modalities. Extensive experiments on public datasets demonstrate Mood Reader's superior performance in cross-subject emotion recognition tasks, outperforming state-of-the-art methods. Additionally, the model is dissected from an attention perspective, providing qualitative analysis of emotion-related brain areas and offering valuable insights for affective research in neural signal processing.
- [16] arXiv:2405.19481 [pdf, ps, html, other]
Title: Integrated Communication and Imaging: Design, Analysis, and Performances of COSMIC Waveforms
Subjects: Signal Processing (eess.SP)
This paper proposes a novel waveform design method named COSMIC (Connectivity-Oriented Sensing Method for Imaging and Communication). These waveforms are engineered to convey communication symbols while adhering to an extended orthogonality condition, enabling their use in generating radio images of the environment. A Multiple-Input Multiple-Output (MIMO) Radar-Communication (RadCom) device transmits COSMIC waveforms from each antenna simultaneously within the same time window and frequency band, indicating that orthogonality is not achieved by space, time, or frequency multiplexing. Indeed, orthogonality among the waveforms is achieved by leveraging the degrees of freedom provided by the assumption that the field of view is limited or significantly smaller than the transmitted signals' length. The RadCom device receives and processes the echoes from an infinite number of infinitesimal scatterers within its field of view, constructing an electromagnetic image of the environment. Concurrently, these waveforms can also carry information to other connected network entities. This work provides the algebraic concepts used to generate COSMIC waveforms. Moreover, an opportunistic optimization of the imaging and communication efficiency is discussed. Simulation results demonstrate that COSMIC waveforms enable accurate environmental imaging while maintaining acceptable communication performances.
- [17] arXiv:2405.19489 [pdf, ps, other]
Title: Optimising RF linear Amplifier for maximum efficiency and linearity
Subjects: Signal Processing (eess.SP)
A method is presented for increasing the efficiency of a radio frequency (RF) amplifier employing laterally diffused metal oxide semiconductor (LDMOS) transistors coupled to an RF exciter, depending on the emission mode of the modulated RF input signals generated by the exciter. If the exciter output signal is of a type whose modulated RF signals do not have a continuously varying envelope, the LDMOS transistors in the RF amplifier are biased with a fixed quiescent drain current and a fixed drain voltage supply so that they operate in compression. If the exciter output signal is of a type whose modulated RF signals do have a continuously varying envelope, the LDMOS transistors in the RF amplifier are biased for linear operation.
- [18] arXiv:2405.19516 [pdf, ps, html, other]
Title: Enabling Visual Recognition at Radio Frequency
Subjects: Signal Processing (eess.SP); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Robotics (cs.RO)
This paper introduces PanoRadar, a novel RF imaging system that brings RF resolution close to that of LiDAR, while providing resilience against conditions challenging for optical signals. Our LiDAR-comparable 3D imaging results enable, for the first time, a variety of visual recognition tasks at radio frequency, including surface normal estimation, semantic segmentation, and object detection. PanoRadar utilizes a rotating single-chip mmWave radar, along with a combination of novel signal processing and machine learning algorithms, to create high-resolution 3D images of the surroundings. Our system accurately estimates robot motion, allowing for coherent imaging through a dense grid of synthetic antennas. It also exploits the high azimuth resolution to enhance elevation resolution using learning-based methods. Furthermore, PanoRadar tackles 3D learning via 2D convolutions and addresses challenges due to the unique characteristics of RF signals. Our results demonstrate PanoRadar's robust performance across 12 buildings.
- [19] arXiv:2405.19542 [pdf, ps, html, other]
Title: Anatomical Region Recognition and Real-time Bone Tracking Methods by Dynamically Decoding A-Mode Ultrasound Signals
Subjects: Signal Processing (eess.SP); Machine Learning (cs.LG); Robotics (cs.RO)
Accurate bone tracking is crucial for kinematic analysis in orthopedic surgery and prosthetic robotics. Traditional methods (e.g., skin markers) are subject to soft tissue artifacts, and the bone pins used in surgery introduce the risk of additional trauma and infection. Electromyography (EMG) cannot directly measure joint angles and therefore requires complex algorithms for kinematic estimation. To address these issues, A-mode ultrasound-based tracking has been proposed as a non-invasive and safe alternative. However, this approach suffers from limited accuracy in peak detection when processing received ultrasound signals. To build a precise and real-time bone tracking approach, this paper introduces a deep learning-based method for anatomical region recognition and bone tracking using A-mode ultrasound signals, specifically focused on the knee joint. The algorithm is capable of simultaneously performing bone tracking and identifying the anatomical region where the A-mode ultrasound transducer is placed. It uses full connections between all encoding and decoding layers of the cascaded U-Nets to focus only on the signal region that is most likely to contain the bone peak, thus pinpointing the exact location of the peak and classifying the anatomical region of the signal. The experiment showed a 97% accuracy in the classification of the anatomical regions and a precision of around 0.5 ± 1 mm under dynamic tracking conditions for various anatomical areas surrounding the knee joint. In general, this approach shows great potential beyond the traditional method, in terms of the accuracy achieved and the recognition of the anatomical region where the ultrasound transducer has been attached as an additional functionality.
- [20] arXiv:2405.19639 [pdf, ps, other]
Title: Generalized BER Performance Analysis for SIC-based Uplink NOMA Systems
Subjects: Signal Processing (eess.SP)
Non-orthogonal multiple access (NOMA) is widely recognized for its spectral and energy efficiency, which allows more users to share the network resources more effectively. This paper provides a generalized bit error rate (BER) performance analysis of successive interference cancellation (SIC)-based uplink NOMA systems under Rayleigh fading channels, taking into account error propagation resulting from SIC imperfections. Exact closed-form BER expressions are initially derived for scenarios with 2 and 3 users using quadrature phase shift keying (QPSK) modulation. These expressions are then generalized to encompass any arbitrary rectangular/square M-ary quadrature amplitude modulation (M-QAM) order, number of NOMA users, and number of BS antennas. Additionally, by utilizing the derived closed-form BER expressions, a simple and practically feasible power allocation (PA) technique is devised to minimize the sum bit error rate of the users and optimize the SIC-based NOMA detection at the base-station (BS). The derived closed-form expressions are corroborated through Monte Carlo simulations. It is demonstrated that these expressions can be effective for optimal uplink PA to ensure optimized SIC detection that mitigates error floors. It is also shown that significant performance improvements are achieved regardless of the users' decoding order, making uplink SIC-based NOMA a viable approach.
- [21] arXiv:2405.19858 [pdf, ps, html, other]
Title: Position Error Bound for Cooperative Sensing in MIMO-OFDM Networks
Comments: 6 pages
Subjects: Signal Processing (eess.SP)
This paper investigates the fundamental limits of target position estimation accuracy of joint sensing and communication (JSC) networks comprising several monostatic base stations (BSs) that cooperate to localize targets. Specifically, each BS adopts a multiple-input multiple-output (MIMO)-orthogonal frequency division multiplexing (OFDM) scheme with a multi-beam radiation pattern to partition power between communication and sensing tasks. Building on prior works, we derive a general framework to evaluate the positioning accuracy of a target in networks with an arbitrary number of cooperating BSs and arbitrary geometrical configurations using Fisher information. Numerical results demonstrate the benefits of cooperation between BSs in improving target localization accuracy and provide insights into the relationships between various system parameters, which may aid in designing JSC networks.
- [22] arXiv:2405.19889 [pdf, ps, html, other]
-
Title: Deep Joint Semantic Coding and Beamforming for Near-Space Airship-Borne Massive MIMO Network
Comments: Major Revision by IEEE JSAC
Subjects: Signal Processing (eess.SP); Information Theory (cs.IT); Machine Learning (cs.LG); Multimedia (cs.MM)
The near-space airship-borne communication network is recognized as an indispensable component of the future integrated ground-air-space network thanks to airships' advantage of long-term residency at stratospheric altitudes, but it urgently needs reliable and efficient Airship-to-X links. To improve the transmission efficiency and capacity, this paper proposes to integrate semantic communication with massive multiple-input multiple-output (MIMO) technology. Specifically, we propose a deep joint semantic coding and beamforming (JSCBF) scheme for an airship-based massive MIMO image transmission network in space, in which semantics from both source and channel are fused to jointly design the semantic coding and physical layer beamforming. First, we design two semantic extraction networks to extract semantics from the image source and the channel state information, respectively. Then, we propose a semantic fusion network that can fuse these semantics into complex-valued semantic features for subsequent physical-layer transmission. To efficiently transmit the fused semantic features at the physical layer, we then propose hybrid data- and model-driven semantic-aware beamforming networks. At the receiver, a semantic decoding network is designed to reconstruct the transmitted images. Finally, we perform end-to-end deep learning to jointly train all the modules, using the image reconstruction quality at the receivers as a metric. The proposed deep JSCBF scheme fully combines the efficient source compressibility and robust error correction capability of semantic communication with the high spectral efficiency of massive MIMO, achieving a significant performance improvement over existing approaches.
- [23] arXiv:2405.19925 [pdf, ps, html, other]
-
Title: Integrated Sensing and Communications Framework for 6G Networks
Authors: Hongliang Luo, Tengyu Zhang, Chuanbin Zhao, Yucong Wang, Bo Lin, Yuhua Jiang, Dongqi Luo, Feifei Gao
Subjects: Signal Processing (eess.SP)
In this paper, we propose a novel integrated sensing and communications (ISAC) framework for the sixth generation (6G) mobile networks, in which we decompose the real physical world into static environment, dynamic targets, and various object materials. The ubiquitous static environment occupies the vast majority of the physical world, for which we design a static environment reconstruction (SER) scheme to obtain the layout and point cloud information of static buildings. The dynamic targets floating in static environments create the spatiotemporal transition of the physical world, for which we design a comprehensive dynamic target sensing (DTS) scheme to detect, estimate, track, image and recognize the dynamic targets in real time. The object materials enrich the electromagnetic laws of the physical world, for which we develop an object material recognition (OMR) scheme to estimate the electromagnetic coefficients of the objects. In addition, to integrate these sensing functions into existing communications systems, we discuss the interference issues and corresponding solutions for ISAC cellular networks. Furthermore, we develop an ISAC hardware prototype platform that can reconstruct the environmental maps and sense the dynamic targets while maintaining communications services. With all these designs, the proposed ISAC framework can support a wide range of emerging applications, such as digital twins, low altitude economy, internet of vehicles, marine management, and deformation monitoring.
- [24] arXiv:2405.20052 [pdf, ps, html, other]
-
Title: A Hardware-Efficient EMG Decoder with an Attractor-based Neural Network for Next-Generation Hand Prostheses
Authors: Mohammad Kalbasi, MohammadAli Shaeri, Vincent Alexandre Mendez, Solaiman Shokur, Silvestro Micera, Mahsa Shoaran
Comments: © 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works
Subjects: Signal Processing (eess.SP); Machine Learning (cs.LG)
Advancements in neural engineering have enabled the development of Robotic Prosthetic Hands (RPHs) aimed at restoring hand functionality. Current commercial RPHs offer limited control through basic on/off commands. Recent progress in machine learning enables finger movement decoding with higher degrees of freedom, yet the high computational complexity of such models limits their application in portable devices. Future RPH designs must balance portability, low power consumption, and high decoding accuracy to be practical for individuals with disabilities. To this end, we introduce a novel attractor-based neural network to realize on-chip movement decoding for next-generation portable RPHs. The proposed architecture comprises an encoder, an attention layer, an attractor network, and a refinement regressor. We tested our model on four healthy subjects and achieved a decoding accuracy of 80.6 ± 3.3%. Our proposed model is over 120 and 50 times more compact than state-of-the-art LSTM and CNN models, respectively, with comparable (or superior) decoding accuracy. Therefore, it exhibits minimal hardware complexity and can be effectively integrated as a System-on-Chip.
- [25] arXiv:2405.20068 [pdf, ps, html, other]
-
Title: An Efficient Network with Novel Quantization Designed for Massive MIMO CSI Feedback
Subjects: Signal Processing (eess.SP)
The efficacy of massive multiple-input multiple-output (MIMO) techniques heavily relies on the accuracy of channel state information (CSI) in frequency division duplexing (FDD) systems. Many works focus on CSI compression and quantization methods to enhance CSI reconstruction accuracy with lower feedback overhead. In this letter, we propose CsiConformer, a novel CSI feedback network that combines convolutional operations and self-attention mechanisms to improve CSI feedback accuracy. Additionally, a new quantization module is developed to improve encoding efficiency. Experimental results show that CsiConformer outperforms previous state-of-the-art networks, achieving an average accuracy improvement of 17.67% with lower computational overhead.
- [26] arXiv:2405.20107 [pdf, ps, html, other]
-
Title: A Perspective on the Impact of Group Delay Dispersion in Future Terahertz Wireless Systems
Comments: 7 pages, 4 figures, 2 tables. This work has been submitted to the IEEE for possible publication
Subjects: Signal Processing (eess.SP)
This article discusses the challenges and opportunities of managing group delay dispersion (GDD) and its relation to the performance standards of future sixth-generation (6G) wireless communication systems utilizing terahertz frequency waves. The unique susceptibilities of 6G systems to GDD are described, along with a quantitative description of the sources of GDD, including multipath, rough surface scattering, intelligent reflecting surfaces, and propagation through the atmosphere. An experimental case-study is presented that confirms previous models quantifying the impact of atmospheric GDD. Several GDD manipulation strategies are presented illustrating their hindered effectiveness in the 6G context. Conversely, some benefits of leveraging GDD to enhance 6G systems, such as improved security and simplified hardware, are also discussed. Finally, a perspective on using photonic GDD control devices is provided, revealing quantitative benefits that may unburden existing equalization schemes. The article argues that GDD will uniquely and significantly impact some 6G systems, but that its careful consideration along with new mitigation strategies, including photonic devices, will help optimize system performance. The conclusion provides a perspective to guide future research in this area.
- [27] arXiv:2405.20122 [pdf, ps, html, other]
-
Title: Distributed MIMO Precoding with Routing Constraints in Segmented Fronthaul
Comments: This is the accepted version of a paper published in 2023 IEEE 34th Annual International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC). The final version is available at this https URL
Journal-ref: PIMRC, Toronto, ON, Canada, 2023, pp. 1-6
Subjects: Signal Processing (eess.SP)
Distributed Multiple-Input and Multiple-Output (D-MIMO) is envisioned to play a significant role in future wireless communication systems as an effective means to improve coverage and capacity. In this paper, we study the impact of a practical two-level data routing scheme on radio performance in a downlink D-MIMO scenario with segmented fronthaul. At the first level, a Distributed Unit (DU) is connected to the Aggregating Radio Units (ARUs) that behave as cluster heads for the selected serving RU groups. At the second level, the selected ARUs connect with the additional serving RUs. At each route discovery level, RUs and/or ARUs share information with each other. The aim of the proposed framework is to efficiently select serving RUs and ARUs so that the practical data routing impact for each User Equipment (UE) connection is minimal. The resulting post-routing Signal-to-Interference plus Noise Ratio (SINR) among all UEs is analyzed after the routing constraints have been applied. The results show that limited fronthaul segment capacity causes connection failures with the serving RUs of individual UEs, especially when long routing path lengths are required. Depending on whether the failures occur at the first or the second routing level, a UE may be dropped or its SINR may be reduced. To minimize the DU-ARU connection failures, the capacity of the segments closest to the DU is set to double that of the remaining segments. When the number of active co-scheduled UEs is kept low enough, practical segment capacities suffice to achieve a zero UE dropping rate. In addition, the choice of the maximum path length setting should take into account segment capacity and its utilization, since the two are related.
- [28] arXiv:2405.20157 [pdf, ps, other]
-
Title: A Multiband T-Shaped Antenna Array for 6G Mobile Communication
Authors: Sunday Achimugu, Abraham Usman Usman, Suleiman Zubair, Michael David, Abdulkadir Olayinka Abdulbaki, Hassan Musa Abdullahi
Subjects: Signal Processing (eess.SP)
The paradigm shift in the use cases of wireless communication necessitates a move toward higher data rates, larger bandwidths, and intelligent reconfiguration in 6G. This paper presents a novel double T-shaped antenna array that operates from 4 GHz to 16 GHz for 6G mobile communication. The antenna consists of a rectangular microstrip with a fractal T-shaped slot, cut at the rear of the microstrip to provide an air gap for an improved radiation pattern.
New submissions for Friday, 31 May 2024 (showing 28 of 28 entries)
- [29] arXiv:2405.19771 (cross-list from cs.NI) [pdf, ps, html, other]
-
Title: Data Service Maximization in Integrated Terrestrial-Non-Terrestrial 6G Networks: A Deep Reinforcement Learning Approach
Comments: 5 pages, 4 figures
Subjects: Networking and Internet Architecture (cs.NI); Signal Processing (eess.SP)
Integrating terrestrial and non-terrestrial networks has emerged as a promising paradigm to fulfill the constantly growing demand for connectivity, low transmission delay, and quality of service (QoS). This integration brings together the strengths of both, such as the reliability of terrestrial networks and the broad coverage and service continuity of non-terrestrial networks like low earth orbit (LEO) satellites. In this work, we study a data service maximization problem in an integrated terrestrial-non-terrestrial network (I-TNT) where the ground base stations (GBSs) and LEO satellites cooperatively serve the coexisting aerial users (AUs) and ground users (GUs). Then, by considering the spectrum scarcity, interference, and QoS requirements of the users, we jointly optimize the user association, AUE's trajectory, and power allocation. To tackle the formulated mixed-integer non-convex problem, we decompose it into two subproblems: 1) user association problem and 2) trajectory and power allocation problem. Since the user association problem is a binary integer programming problem, we use the standard convex optimization method to solve it. Meanwhile, the trajectory and power allocation problem is solved by the deep deterministic policy gradient (DDPG) method to cope with the problem's non-convexity and dynamic network environments. Then, the two subproblems are alternately solved by the proposed iterative algorithm. Extensive simulations are conducted to evaluate the performance of the proposed framework against baselines from the existing literature.
- [30] arXiv:2405.20073 (cross-list from cs.IT) [pdf, ps, html, other]
-
Title: Power Allocation for Cell-Free Massive MIMO ISAC Systems with OTFS Signal
Comments: This work is submitted to IEEE for possible publication
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)
Applying integrated sensing and communication (ISAC) to a cell-free massive multiple-input multiple-output (CF mMIMO) architecture has attracted increasing attention. This approach equips CF mMIMO networks with sensing capabilities and resolves the problem of unreliable service at cell edges in conventional cellular networks. However, existing studies on CF-ISAC systems have focused on the application of traditional integrated signals. To address this limitation, this study explores the employment of the orthogonal time frequency space (OTFS) signal as a representative of innovative signals in the CF-ISAC system, and the system's overall performance is optimized and evaluated. A universal downlink spectral efficiency (SE) expression is derived regarding multi-antenna access points (APs) and optional sensing beams. To streamline the analysis and optimization of the CF-ISAC system with the OTFS signal, we introduce a lower bound on the achievable SE that is applicable to OTFS-signal-based systems. Based on this, a power allocation algorithm is proposed to maximize the minimum communication signal-to-interference-plus-noise ratio (SINR) of users while guaranteeing a specified sensing SINR value and meeting the per-AP power constraints. The results demonstrate the tightness of the proposed lower bound and the efficiency of the proposed algorithm. Finally, the superiority of using the OTFS signals is verified by a 13-fold expansion of the SE performance gap over the application of orthogonal frequency division multiplexing signals. These findings could guide the future deployment of the CF-ISAC systems, particularly in the field of millimeter waves with a large bandwidth.
Cross submissions for Friday, 31 May 2024 (showing 2 of 2 entries)
- [31] arXiv:2305.13910 (replaced) [pdf, ps, html, other]
-
Title: Experimental Assessment of Misalignment Effects in Terahertz Communications
Comments: 6 pages, 6 figures, conference paper
Subjects: Signal Processing (eess.SP)
Terahertz (THz) frequencies are important for next-generation wireless systems because of the large available bandwidths. The limited range caused by high attenuation at these frequencies can be overcome via densely installed heterogeneous networks that also utilize UAVs in a three-dimensional hyperspace. Yet THz communications rely on precise beam alignment; if alignment is not handled properly, the signal strength at the receiver drops, and this affects THz signals more than conventional ones. This work focuses on the importance of precise alignment in THz communication systems. The significant effect of proper alignment is validated through comprehensive measurements with a state-of-the-art measurement setup, which enables accurate data collection from 240 GHz to 300 GHz at varying angles and distances in an anechoic chamber that eliminates reflections. By analyzing the channel frequency and impulse responses of these measurements, this study provides the first quantifiable results on the effects of beam misalignment at THz frequencies.
- [32] arXiv:2309.07131 (replaced) [pdf, ps, html, other]
-
Title: Wideband High Gain Metasurface-Based 4T4R MIMO antenna with Highly Isolated Ports for Sub-6 GHz 5G Applications
Comments: 20 pages, 15 figures, and 3 Tables
Subjects: Signal Processing (eess.SP)
This study presents the design of four $178\times178$ $\mathrm{mm}^{2}$ wideband, high gain, highly efficient metasurface-based 4T4R MIMO (Multiple-Input Multiple-Output) antennas with highly isolated ports, covering the middle and a portion of the upper bands of the sub-6 GHz 5G frequency spectrum for 5G-based systems, such as IoT (Internet of Things) applications, vehicular communications (e.g., rooftop antennas of cars or trains), and smart industries (e.g., farms and factories). The radiating elements of these antennas use the aperture-coupled feeding technique with a dumbbell-shaped slot, a truncated square patch with two U-shaped slots, and a metasurface layer. The proposed MIMO structures place four identical radiating elements in a $2\times2$ arrangement with $90^\circ$ successive rotations to produce orthogonal electromagnetic waves, improving the isolation between ports. Six-millimeter spaces are added between these elements, and two vertical and horizontal strip slots are carved on the ground as the decoupling structure to decrease the mutual coupling. Simulation results show that Antenna$_{1}$, Antenna$_{2}$, and Antenna$_{3}$ achieve gain values of 6.2 to 9.4 dBi, 8.2 to 11.6 dBi, and 6.2 to 9.5 dBi, isolation below -35, -25, and -33 dB, and almost 10 dB diversity gain from 2.8 to 4.7 GHz, 2.8 to 4.5 GHz, and 2.7 to 4.9 GHz, respectively. As a prototype, Antenna$_{4}$ is manufactured, and measurements are performed. It achieves 6.28 to 10.45 dBi gain values, below -23 dB isolation, and 0.001 envelope correlation coefficient over 2.7 to 4.3 GHz. The results confirm that the proposed MIMO antennas are compatible with the essential 5G requirements.
- [33] arXiv:2402.04395 (replaced) [pdf, ps, other]
-
Title: Auto-Encoder Optimized PAM IM/DD Transceivers for Amplified Fiber Links
Authors: Amir Omidi, Mai Banawan, Erwan Weckenmann, Benoit Paquin, Alireza Geravand, Zibo Zheng, Wei Shi, Ming Zeng, Leslie A. Rusch
Comments: 9 pages and 13 figures
Subjects: Signal Processing (eess.SP)
We examine pulse amplitude modulation (PAM) for intensity modulation and direct detection systems. Using a straightforward mixed noise model, we optimize the constellations with an autoencoder-based neural network (NN) and improve the required signal-to-noise ratio by 4 dB for amplified spontaneous emission (ASE)-limited PAM4 and PAM8, without increasing system complexity. Performance can also be improved in an O-band wavelength division multiplexing system with semiconductor optical amplifier amplification and chromatic dispersion. We show via simulation that for such a system operating at 53 Gbaud, we can extend the reach of PAM4 by 10-25 km with an optimized constellation and an NN decoder. We present an experimental validation of a 4 dB improvement for ASE-limited PAM4 at 60 Gbaud using an optimized constellation and an NN decoder.
- [34] arXiv:2403.02565 (replaced) [pdf, ps, html, other]
-
Title: Deep Cooperation in ISAC System: Resource, Node and Infrastructure Perspectives
Comments: 8 pages and 6 figures, Accepted by IEEE Internet of Things Magazine
Subjects: Signal Processing (eess.SP)
With the emerging Integrated Sensing and Communication (ISAC) technique, exploiting the mobile communication system with multi-domain resources, multiple network elements, and large-scale infrastructures to realize cooperative sensing is a crucial approach to satisfying the requirements of high-accuracy and large-scale sensing in IoE. In this article, deep cooperation in ISAC systems is investigated from three perspectives. In the microscopic perspective, namely, within a single node, the sensing information carried by time-frequency-space-code domain resources is processed with operations such as phase compensation and coherent accumulation, thereby improving the sensing accuracy. In the mesoscopic perspective, the sensing accuracy could be improved through the cooperation of multiple nodes. We explore various multi-node cooperative sensing scenarios and present the corresponding challenges and future research trends. In the macroscopic perspective, the massive number of infrastructures from the same operator or different operators could perform cooperative sensing to extend the sensing coverage and improve the sensing continuity. We investigate network architecture, target tracking methods, and the large-scale sensing assisted digital twin construction. Simulation results demonstrate the superiority of multi-node and multi-resource cooperative sensing over single-resource or single-node sensing. This article may provide a deep and comprehensive view of cooperative sensing in ISAC systems to enhance the performance of sensing, supporting the applications of IoE.
- [35] arXiv:2405.14472 (replaced) [pdf, ps, html, other]
-
Title: SolNet: Open-source deep learning models for photovoltaic power forecasting across the globe
Comments: 24 pages, 5 figures
Subjects: Signal Processing (eess.SP); Machine Learning (cs.LG)
Deep learning models have gained increasing prominence in recent years in the field of solar photovoltaic (PV) forecasting. One drawback of these models is that they require a lot of high-quality data to perform well. This is often infeasible in practice, due to poor measurement infrastructure in legacy systems and the rapid build-up of new solar systems across the world. This paper proposes SolNet: a novel, general-purpose, multivariate solar power forecaster, which addresses these challenges by using a two-step forecasting pipeline which incorporates transfer learning from abundant synthetic data generated from PVGIS, before fine-tuning on observational data. Using actual production data from hundreds of sites in the Netherlands, Australia and Belgium, we show that SolNet improves forecasting performance over data-scarce settings as well as baseline models. We find transfer learning benefits to be the strongest when only limited observational data is available. At the same time we provide several guidelines and considerations for transfer learning practitioners, as our results show that weather data, seasonal patterns, amount of synthetic data and possible mis-specification in source location, can have a major impact on the results. The SolNet models created in this way are applicable for any land-based solar photovoltaic system across the planet where simulated and observed data can be combined to obtain improved forecasting capabilities.
- [36] arXiv:2405.18775 (replaced) [pdf, ps, html, other]
-
Title: Synchronization Scheme based on Pilot Sharing in Cell-Free Massive MIMO Systems
Authors: Qihao Peng, Hong Ren, Zhendong Peng, Cunhua Pan, Maged Elkashlan, Dongming Wang, Jiangzhou Wang, Xiaohu You
Comments: Submitted to IEEE Journal for pos
Subjects: Signal Processing (eess.SP)
This paper analyzes the impact of a pilot-sharing scheme on synchronization performance in a scenario where several slave access points (APs) with uncertain carrier frequency offsets (CFOs) and timing offsets (TOs) share a common pilot sequence. First, the Cramer-Rao bound (CRB) with pilot contamination is derived for pilot-pairing estimation. Furthermore, a maximum likelihood algorithm is presented to estimate the CFO and TO among the pairing APs. Then, to minimize the sum of CRBs, we devise a synchronization strategy based on a pilot-sharing scheme by jointly optimizing the cluster classification, synchronization overhead, and pilot-sharing scheme, while simultaneously considering the overhead and each AP's synchronization requirements. To solve this NP-hard problem, we simplify it into two sub-problems, namely the cluster classification problem and the pilot-sharing problem. To strike a balance between synchronization performance and overhead, we first classify the clusters by using the K-means algorithm, and propose a criterion to find a good set of master APs. Then, the pilot-sharing scheme is obtained by using swap-matching operations. Simulation results validate the accuracy of our derivations and demonstrate the effectiveness of the proposed scheme over the benchmark schemes.
- [37] arXiv:2206.01312 (replaced) [pdf, ps, other]
-
Title: Optimization of Energy-Constrained IRS-NOMA Using a Complex Circle Manifold Approach
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)
This work investigates the performance of intelligent reflective surface (IRS)-assisted uplink non-orthogonal multiple access (NOMA) in energy-constrained networks. Specifically, we formulate and solve two optimization problems; the first aims at minimizing the sum of users' transmit power, while the second targets maximizing the system-level energy efficiency (EE). The two problems are solved by jointly optimizing the users' transmit powers and the beamforming coefficients at the IRS subject to the users' individual uplink rate and transmit power constraints. A novel and low-complexity algorithm is developed to optimize the IRS beamforming coefficients by optimizing the objective function over a complex circle manifold (CCM). To efficiently optimize the IRS phase shifts over the manifold, the optimization problem is reformulated into a feasibility expansion problem, which is reduced to a max-min signal-to-interference-plus-noise ratio (SINR) problem. Then, with the aid of a smoothing technique, the exact penalty method is applied to transform the problem from constrained to unconstrained. The proposed solution is compared against three semi-definite programming (SDP)-based benchmarks: semi-definite relaxation (SDR), SDP-difference of convex (SDP-DC), and sequential rank-one constraint relaxation (SROCR). The results show that the manifold algorithm provides better performance than the SDP-based benchmarks, and at a much lower computational complexity for both the transmit power minimization and EE maximization problems. The results also reveal that IRS-NOMA is only superior to orthogonal multiple access (OMA) when the users' target achievable rate requirements are relatively high.
- [38] arXiv:2305.15595 (replaced) [pdf, ps, html, other]
-
Title: Time-Varying Convex Optimization: A Contraction and Equilibrium Tracking Approach
Subjects: Optimization and Control (math.OC); Signal Processing (eess.SP); Systems and Control (eess.SY)
In this article, we provide a novel and broadly-applicable contraction-theoretic approach to continuous-time time-varying convex optimization. For any parameter-dependent contracting dynamics, we show that the tracking error is asymptotically proportional to the rate of change of the parameter with proportionality constant upper bounded by Lipschitz constant in which the parameter appears divided by the contraction rate of the dynamics squared. We additionally establish that any parameter-dependent contracting dynamics can be augmented with a feedforward prediction term to ensure that the tracking error converges to zero exponentially quickly. To apply these results to time-varying convex optimization problems, we establish the strong infinitesimal contractivity of dynamics solving three canonical problems, namely monotone inclusions, linear equality-constrained problems, and composite minimization problems. For each of these problems, we prove the sharpest-known rates of contraction and provide explicit tracking error bounds between solution trajectories and minimizing trajectories. We validate our theoretical results on three numerical examples including an application to control-barrier function based controller design.
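The tracking-error statement can be checked on a toy problem. The sketch below is an illustrative scalar example chosen here, not one of the paper's numerical experiments: gradient flow on f(x, t) = 0.5 * c * (x - theta(t))^2 with a ramp parameter theta(t) = v * t, integrated with forward Euler. For these dynamics the contraction rate is c and the Lipschitz constant with respect to the parameter is also c, so the bound quoted in the abstract evaluates to (c / c^2) * v = v / c.

```python
def track(c=2.0, v=1.0, dt=0.01, steps=1000, feedforward=False):
    """Euler-integrate gradient flow on f(x, t) = 0.5*c*(x - theta(t))**2, theta(t) = v*t."""
    x = 1.0  # initial state, deliberately away from theta(0) = 0
    for k in range(steps):
        theta = v * k * dt
        dx = -c * (x - theta)  # contracting gradient dynamics with rate c
        if feedforward:
            dx += v            # prediction term: time derivative of theta
        x += dt * dx
    return abs(x - v * steps * dt)  # final tracking error |x(T) - theta(T)|

err_plain = track()
err_ff = track(feedforward=True)
print(f"without feedforward: {err_plain:.6f}, with feedforward: {err_ff:.2e}")
```

With c = 2 and v = 1 the plain dynamics settle at a tracking error of v / c = 0.5, while the feedforward-augmented dynamics drive the error toward zero at an exponential rate.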
- [39] arXiv:2306.10232 (replaced) [pdf, ps, other]
-
Title: Multi-Task Offloading via Graph Neural Networks in Heterogeneous Multi-access Edge Computing
Comments: Insufficient completion, there are some errors in the current version
Subjects: Networking and Internet Architecture (cs.NI); Signal Processing (eess.SP)
In the rapidly evolving field of Heterogeneous Multi-access Edge Computing (HMEC), efficient task offloading plays a pivotal role in optimizing system throughput and resource utilization. However, existing task offloading methods often fall short of adequately modeling the dependency topology relationships between offloaded tasks, which limits their effectiveness in capturing the complex interdependencies of task features. To address this limitation, we propose a task offloading mechanism based on Graph Neural Networks (GNN). Our modeling approach takes into account factors such as task characteristics, network conditions, and available resources at the edge, and embeds these captured features into the graph structure. By utilizing GNNs, our mechanism can capture and analyze the intricate relationships between task features, enabling a more comprehensive understanding of the underlying dependency topology. Through extensive evaluations in heterogeneous networks, our proposed algorithm improves by 18.6%-53.8% over greedy and approximate algorithms in optimizing system throughput and resource utilization. Our experiments showcase the advantage of considering the intricate interplay of task features using GNN-based modeling.
- [40] arXiv:2404.04870 (replaced) [pdf, ps, html, other]
-
Title: Signal-noise separation using unsupervised reservoir computing
Subjects: Machine Learning (cs.LG); Signal Processing (eess.SP); Chaotic Dynamics (nlin.CD)
Removing noise from a signal without knowing the characteristics of the noise is a challenging task. This paper introduces a signal-noise separation method based on time series prediction. We use Reservoir Computing (RC) to extract the maximum portion of "predictable information" from a given signal. Reproducing the deterministic component of the signal using RC, we estimate the noise distribution from the difference between the original signal and the reconstructed one. The method is based on a machine learning approach and requires no prior knowledge of either the deterministic signal or the noise distribution. It provides a way to identify additivity/multiplicativity of noise and to estimate the signal-to-noise ratio (SNR) indirectly. The method works successfully for various combinations of signals and noise, including chaotic signals and highly oscillating sinusoidal signals corrupted by non-Gaussian additive/multiplicative noise. The separation performance is robust and notably outstanding for signals with strong noise, even for those with negative SNR.
- [41] arXiv:2404.09385 (replaced) [pdf, ps, html, other]
-
Title: A Large-Scale Evaluation of Speech Foundation Models
Authors: Shu-wen Yang, Heng-Jui Chang, Zili Huang, Andy T. Liu, Cheng-I Lai, Haibin Wu, Jiatong Shi, Xuankai Chang, Hsiang-Sheng Tsai, Wen-Chin Huang, Tzu-hsun Feng, Po-Han Chi, Yist Y. Lin, Yung-Sung Chuang, Tzu-Hsien Huang, Wei-Cheng Tseng, Kushal Lakhotia, Shang-Wen Li, Abdelrahman Mohamed, Shinji Watanabe, Hung-yi Lee
Comments: The extended journal version for SUPERB and SUPERB-SG. Published in IEEE/ACM TASLP. The Arxiv version is preferred
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Signal Processing (eess.SP)
The foundation model paradigm leverages a shared foundation model to achieve state-of-the-art (SOTA) performance for various tasks, requiring minimal downstream-specific modeling and data annotation. This approach has proven crucial in the field of Natural Language Processing (NLP). However, the speech processing community lacks a similar setup to explore the paradigm systematically. In this work, we establish the Speech processing Universal PERformance Benchmark (SUPERB) to study the effectiveness of the paradigm for speech. We propose a unified multi-tasking framework to address speech processing tasks in SUPERB using a frozen foundation model followed by task-specialized, lightweight prediction heads. Combining our results with community submissions, we verify that the foundation model paradigm is promising for speech, and our multi-tasking framework is simple yet effective, as the best-performing foundation model shows competitive generalizability across most SUPERB tasks. For reproducibility and extensibility, we have developed a long-term maintained platform that enables deterministic benchmarking, allows for result sharing via an online leaderboard, and promotes collaboration through a community-driven benchmark database to support new development cycles. Finally, we conduct a series of analyses to offer an in-depth understanding of SUPERB and speech foundation models, including information flows across tasks inside the models, the correctness of the weighted-sum benchmarking protocol and the statistical significance and robustness of the benchmark.
- [42] arXiv:2405.16090 (replaced) [pdf, ps, html, other]
-
Title: EEG-DBNet: A Dual-Branch Network for Temporal-Spectral Decoding in Motor-Imagery Brain-Computer Interfaces
Subjects: Human-Computer Interaction (cs.HC); Signal Processing (eess.SP)
Motor imagery electroencephalogram (EEG)-based brain-computer interfaces (BCIs) offer significant advantages for individuals with restricted limb mobility. However, challenges such as low signal-to-noise ratio and limited spatial resolution impede accurate feature extraction from EEG signals, thereby affecting the classification accuracy of different actions. To address these challenges, this study proposes an end-to-end dual-branch network (EEG-DBNet) that decodes the temporal and spectral sequences of EEG signals in parallel through two distinct network branches. Each branch comprises a local convolutional block and a global convolutional block. The local convolutional block transforms the source signal from the temporal-spatial domain to the temporal-spectral domain. By varying the number of filters and convolution kernel sizes, the local convolutional blocks in different branches adjust the length of their respective dimension sequences. Different types of pooling layers are then employed to emphasize the features of various dimension sequences, setting the stage for subsequent global feature extraction. The global convolution block splits and reconstructs the feature of the signal sequence processed by the local convolution block in the same branch and further extracts features through the dilated causal convolutional neural networks. Finally, the outputs from the two branches are concatenated, and signal classification is completed via a fully connected layer. Our proposed method achieves classification accuracies of 85.84% and 91.60% on the BCI Competition 4-2a and BCI Competition 4-2b datasets, respectively, surpassing existing state-of-the-art models. The source code is available at this https URL.
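The dilated causal convolution used in the global block can be sketched in a few lines. This is a generic single-channel version of the operation, not the EEG-DBNet implementation. The function name, the zero-padding of the past, and the toy signal are assumptions made for illustration.

```python
def dilated_causal_conv1d(signal, kernel, dilation=1):
    """Single-channel dilated causal convolution.

    Computes y[t] = sum_k kernel[k] * signal[t - k * dilation],
    treating samples before the start of the signal as zero, so
    y[t] never depends on samples later than t.
    """
    if dilation < 1:
        raise ValueError("dilation must be a positive integer")
    output = []
    for t in range(len(signal)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # skip taps that reach before the signal start
                acc += weight * signal[idx]
        output.append(acc)
    return output


# Toy example: a two-tap kernel with dilation 2 adds each sample
# to the sample two steps earlier.
print(dilated_causal_conv1d([1, 2, 3, 4, 5], [1, 1], dilation=2))
```

Stacking such layers with growing dilation widens the receptive field quickly without looking at future samples. This is the usual motivation for the technique in sequence models.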
- [43] arXiv:2405.19228 (replaced) [pdf, ps, other]
-
Title: Motor Imagery Task Alters Dynamics of Human Body Posture
Subjects: Neurons and Cognition (q-bio.NC); Signal Processing (eess.SP)
Motor Imagery (MI) is gaining traction in both rehabilitation and sports settings, but its immediate influence on human postural control is not yet clearly understood. This study examines the effects of MI on the dynamics of the Center of Pressure (COP), a crucial metric for evaluating postural stability. In the experiment, thirty healthy young adults participated in four different scenarios: normal standing with eyes open, normal standing with eyes closed, and kinesthetic motor imagery focused on mediolateral (ML) and anteroposterior (AP) sway movements. A mathematical model was developed to characterize the nonlinear dynamics of the COP and to assess the impact of MI on these dynamics. Our results show a statistically significant increase (p < 0.05) in variables such as COP path length and Long-Range Correlation (LRC) during MI compared to the closed-eye and normal standing conditions. These observations align well with psycho-neuromuscular theory, which suggests that imagining a specific movement activates neural pathways, consequently affecting postural control. This study presents compelling evidence that MI has a quantifiable impact on COP dynamics and that changes in the COP are directionally consistent with the imagined movements. This finding holds significant implications for the field of rehabilitation science, suggesting that motor imagery could be strategically utilized to induce targeted postural adjustments. Nonetheless, additional research is required to fully understand the complex mechanisms that underlie this relationship and to corroborate these results across a more diverse set of populations.
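One of the variables named above, COP path length, is commonly computed as the total distance travelled by the COP in the ML-AP plane. The sketch below uses that common definition. The abstract does not state the exact formula, units, or sampling used in the study, so the function name and the sample values are hypothetical.

```python
import math


def cop_path_length(ml, ap):
    """Total length of the COP trajectory.

    ml, ap: equal-length sequences of mediolateral and anteroposterior
            COP coordinates (same unit, e.g. millimetres), sampled at
            successive time steps.
    Returns the sum of Euclidean distances between consecutive samples.
    """
    if len(ml) != len(ap):
        raise ValueError("ml and ap must have the same number of samples")
    return sum(
        math.hypot(ml[i + 1] - ml[i], ap[i + 1] - ap[i])
        for i in range(len(ml) - 1)
    )


# Toy trajectory: one 3-4-5 step, then a 4-unit step along AP only.
print(cop_path_length([0.0, 3.0, 3.0], [0.0, 4.0, 8.0]))
```

A longer path over the same recording duration indicates more sway. An increase in this measure is therefore usually read as a change in postural control.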