Research on Recognition Method of Single-pulse Signals Based on XGBoost

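  • Abstract (Chinese version): Pulsar searching is an important research direction in radio astronomy. As large radio telescopes continue to be built and developed, data volumes are growing exponentially, and accurately identifying pulsar signals in time from rapidly acquired, massive data sets has become a major challenge. Taking observation data from the Low Frequency Array (LOFAR) Tied-Array All-Sky Survey as an example, this study designs 10 feature variables for single-pulse signal recognition, investigates the application of XGBoost combined with wrapper feature selection to single-pulse signal recognition, and compares it with GBDT (Gradient Boosting Decision Tree), AdaBoost, random forest, and BP (Back Propagation) neural network models. The experimental results show that XGBoost combined with wrapper feature selection has the best overall performance in single-pulse signal recognition: it has the lowest misclassification rate and the highest precision, recall, and F1 score, on average 1 to 2 percentage points higher than the other models. In terms of feature selection, 9 features were selected as the optimal features. The feature variables and recognition method designed in this study can provide a methodological and technical reference for pulsar searches in China based mainly on signals detected by the Five-hundred-meter Aperture Spherical radio Telescope (FAST).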
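The evaluation measures named in the abstract (misclassification rate, precision, recall, and F1 score) can be illustrated with a minimal sketch. The function name `binary_metrics` and the label vectors below are invented for illustration; they are not data or results from the study, and label 1 is assumed to mean "pulse".

```python
def binary_metrics(y_true, y_pred):
    """Compute standard binary classification metrics (positive class = 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    # Guard against division by zero when a class is never predicted/present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    misclassification_rate = (fp + fn) / len(y_true)

    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "misclassification_rate": misclassification_rate,
    }


# Hypothetical labels: 4 true pulses and 6 non-pulses (illustrative only).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In this toy case one pulse is missed and one non-pulse is flagged, so precision and recall coincide; with imbalanced real survey data they generally differ, which is why the abstract reports them separately.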

     

    Abstract: With the construction of large radio telescopes, detecting pulsars among massive volumes of pulse signals has become an important task in space exploration. Machine learning algorithms are favored in single-pulse data analysis because they are data-driven. However, the algorithms used in pulsar searching cannot guarantee that their results are globally optimal solutions. In this paper, the eXtreme Gradient Boosting (XGBoost) method is studied for single-pulse classification using data from the LOFAR Tied-Array All-Sky Survey (LOTAAS). LOTAAS is an ongoing survey of the Northern sky for pulsars and transients with LOFAR using a digital aperture array. By January 2019, the LOTAAS survey had discovered and confirmed 73 radio pulsars, which demonstrates its ability to find new pulsars. A fully labeled data set is necessary for training and validating a machine learning model. However, given the massive amounts of astronomical observation data, labeling data by manual inspection is time-consuming and laborious. In this study, we directly use the prepared data from the work of Michilli et al. (2018) to avoid repeating that data processing. To verify the performance of the XGBoost method, this paper compares it with four other machine learning models. The results show that XGBoost combined with a wrapper feature selection method has the overall advantage in single-pulse recognition, with the lowest misclassification rate and the highest precision, recall, and F1 score. This study has important implications for pulsar monitoring and can provide a reference for research on single-pulse searches based on Five-hundred-meter Aperture Spherical radio Telescope (FAST) signals in China.
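The idea of wrapper feature selection mentioned in the abstract can be sketched as follows: a classifier is retrained on candidate feature subsets, and a subset is kept according to its validation score. The sketch below makes several assumptions that are not taken from the paper: it uses greedy backward elimination as the search strategy, a simple nearest-centroid classifier as a stand-in for XGBoost (so that it runs with the Python standard library only), and four synthetic features rather than the study's ten. All function names are hypothetical.

```python
import random


def nearest_centroid_accuracy(X_train, y_train, X_val, y_val, features):
    """Validation accuracy of a nearest-centroid classifier using only `features`.

    Stand-in for the real model: the paper uses XGBoost, not this classifier.
    """
    centroids = {}
    for label in sorted(set(y_train)):
        rows = [x for x, y in zip(X_train, y_train) if y == label]
        centroids[label] = [sum(r[f] for r in rows) / len(rows) for f in features]
    correct = 0
    for x, y in zip(X_val, y_val):
        pred = min(
            centroids,
            key=lambda c: sum((x[f] - m) ** 2 for f, m in zip(features, centroids[c])),
        )
        correct += pred == y
    return correct / len(y_val)


def backward_elimination(X_train, y_train, X_val, y_val, n_features):
    """Greedy wrapper search: drop a feature whenever the score does not get worse."""
    selected = list(range(n_features))
    best = nearest_centroid_accuracy(X_train, y_train, X_val, y_val, selected)
    improved = True
    while improved and len(selected) > 1:
        improved = False
        for f in list(selected):
            candidate = [g for g in selected if g != f]
            score = nearest_centroid_accuracy(X_train, y_train, X_val, y_val, candidate)
            if score >= best:
                selected, best, improved = candidate, score, True
                break
    return selected, best


def make_data(rng, n):
    """Synthetic data: features 0 and 1 carry signal, features 2 and 3 are noise."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([
            rng.gauss(2.0 * label, 1.0),
            rng.gauss(-1.5 * label, 1.0),
            rng.gauss(0.0, 1.0),
            rng.gauss(0.0, 1.0),
        ])
        y.append(label)
    return X, y


rng = random.Random(0)
X_train, y_train = make_data(rng, 200)
X_val, y_val = make_data(rng, 100)

baseline = nearest_centroid_accuracy(X_train, y_train, X_val, y_val, [0, 1, 2, 3])
selected, score = backward_elimination(X_train, y_train, X_val, y_val, 4)
print("all features:", baseline, "selected:", selected, "score:", score)
```

Because the search only accepts a removal when the validation score does not drop, the selected subset scores at least as well as the full feature set on the validation split. In practice the stand-in classifier would be replaced by the actual model, and the score would be estimated with cross-validation rather than a single split.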

     
