Enhancing Sleep Stage Prediction with Breathing Sound Separation in Home Environments with Sleep Partners

Oct 20, 2023
World Sleep 2023


Current research on sleep stage prediction from smartphone audio often overlooks the presence of a sleep partner in home environments. Existing methods do not account for the overlap between the partner's and the subject's breathing sounds, which degrades the accuracy of sleep stage prediction. Our approach aims to improve sleep stage prediction for the subject by effectively separating the subject's breathing sound from the partner's.

Materials and Methods:
We developed a deep neural network model that separates a subject's breathing sound from a partner's. The separated subject audio is then fed into our previously developed sound-based sleep stage model to obtain predictions. We use SepFormer, a state-of-the-art model for speech separation, and train it on our own datasets. For training and testing, we generated synthetic data by mixing breathing sounds from two people. The separation model was trained with supervised learning on pairs of the two-person mixture and the subject's clean sound, which serves as the ground-truth separation target. We validated the model on synthetic data with corresponding PSG results, synthesizing two-person audio from recordings made in a hospital environment (N=10) with PSG and a home environment (N=20) with level-2 PSG. The synthesized dataset (N=1,000) covers a range of distances and amplitude ratios between subject and partner; 800 samples were used for training and 200 for testing.
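The mixing step described above can be sketched as follows. This is a minimal illustration, not the actual data pipeline: the function name, the peak-normalization scheme, and the sinusoidal stand-in signals are all assumptions for demonstration.

```python
import numpy as np

def make_mixture(subject: np.ndarray, partner: np.ndarray,
                 amplitude_ratio: float = 2.0) -> tuple[np.ndarray, np.ndarray]:
    """Mix two breathing-sound waveforms at a given subject/partner
    amplitude ratio; returns (mixture, subject_target)."""
    n = min(len(subject), len(partner))
    subject, partner = subject[:n], partner[:n]
    # Normalize each source to unit peak so the ratio is controlled explicitly.
    subject = subject / (np.max(np.abs(subject)) + 1e-8)
    partner = partner / (np.max(np.abs(partner)) + 1e-8)
    mixture = amplitude_ratio * subject + partner
    # Rescale to avoid clipping; apply the same gain to the target so the
    # (mixture, target) pair stays consistent for supervised training.
    gain = 1.0 / (np.max(np.abs(mixture)) + 1e-8)
    return mixture * gain, amplitude_ratio * subject * gain

# Example with slow sinusoids standing in for breathing audio
# (subject twice as loud as the partner, as in one test scenario).
t = np.linspace(0, 10, 16000)
mix, target = make_mixture(np.sin(2 * np.pi * 0.25 * t),
                           np.sin(2 * np.pi * 0.30 * t),
                           amplitude_ratio=2.0)
```

Each (mixture, target) pair produced this way is one supervised training example for the separation model.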

Results:
For the 4-class sleep stage classification task, the proposed method, which incorporates the separation model, achieved a macro F1 score of 0.592 when evaluated on synthesized data simulating two individuals sleeping together in bed, a 6.7% improvement over the result without separation. We further analyzed synthetic datasets generated across diverse scenarios. When the sound amplitudes of the two individuals were equal, the separation model yielded a macro F1 score of 0.468, compared to 0.43 without it. When the subject's amplitude was twice the partner's, the score increased from 0.51 to 0.559. This trend continued, with scores rising from 0.6 to 0.626 at four times the amplitude and from 0.65 to 0.669 at eight times.
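The macro F1 score used above averages the per-class F1 over the four sleep stages, so rare stages weigh as much as common ones. A minimal sketch of the metric (the stage labels below are hypothetical, not drawn from our data):

```python
def macro_f1(y_true, y_pred, classes=(0, 1, 2, 3)):
    """Macro F1: unweighted mean of per-class F1 over all sleep stages."""
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        scores.append(2 * precision * recall / (precision + recall)
                      if precision + recall else 0.0)
    return sum(scores) / len(scores)

# Hypothetical per-epoch stages: 0=Wake, 1=Light, 2=Deep, 3=REM.
y_true = [0, 1, 1, 2, 3, 3, 1, 0]
y_pred = [0, 1, 2, 2, 3, 1, 1, 0]
score = macro_f1(y_true, y_pred)  # 0.75 for this toy example
```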

Conclusion:
To enhance sleep stage (SS) prediction in home environments with a sleep partner, we introduced a separation model. Integrating sound separation into the SS prediction model enables convenient, multi-night sleep quality measurement at home and is cost-effective compared to expensive polysomnography. This will facilitate public understanding of sleep and improve the quality of care for people with sleep disorders. Future work will involve collecting real two-person breathing sound data with PSG and generating more realistic synthetic data using virtual acoustic spaces and chronotype information.


S. Kim
D. Kim
S. Kim
H. Park
J. Hong
D. Lee


1. I confirm that the abstract and all information in it are correct.: Yes
2. I confirm that the abstract constitutes consent to publication.: Yes
3. I confirm that I submit this abstract on behalf of all authors.: Yes
I understand that the presenting author MUST register for the congress.: Yes