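• A Guide to the Core Journals of China
  • Chinese Science and Technology Core Journals
  • Chinese Science Citation Database (CSCD)
  • Chinese Scientific and Technical Papers and Citations Database (CSTPCD)
  • Chinese Academic Journals Abstracts Database (CSAD)
  • China Academic Journals, Online Edition (CNKI)
  • Chinese Science and Technology Journal Database
  • Wanfang Data Knowledge Service Platform
  • Chaoxing Journal Domain Publishing Platform
  • National Science and Technology Academic Journal Open Platform
  • Scopus abstract and citation database (Netherlands)
  • Japan Science and Technology Agency database (JST)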

Application of the R Language in LAMOST Spectral Analysis: A Preliminary Study

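  • Abstract: Using the highly extensible open-source R programming language as a tool, and drawing on its strong data analysis capabilities in statistics and data mining, this paper focuses on the main functions and features of RFITSIO, the R package for reading and writing FITS files, and describes in detail the FITS files collected by LAMOST. Stellar spectra are read from the massive LAMOST survey DR2 data with RFITSIO, and the principal component analysis tools in R are used to extract the characteristic quantities, that is, the principal components, of each spectral type. The principal components that represent stellar spectral features are extracted from spectra containing a large amount of redundant information. Reconstructing the spectra from these components effectively reduces the influence of noise on the original spectral data and provides a basis for subsequent data mining work.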

     

    Abstract: Data mining research on large-scale surveys focuses on handling, processing, and extracting information from massive astronomical data. In this paper, we apply the extensible R programming language to LAMOST spectral analysis and make full use of its integrated data analysis and visualization methods. We mainly study the functions and characteristics of the RFITSIO package for reading and writing FITS files in R. We then group the LAMOST DR2 data according to the released classification results and apply the PCA tools in R to each group to extract spectral features from the large number of noisy spectra. The results show that the spectral features are well preserved through PCA reconstruction. PCA is applied to the flux values in each band of a spectrum to extract the characteristic quantities of the LAMOST spectra: rotating the coordinate system removes the correlation between spectral features, which reduces the dimensionality of the data and suppresses the effect of noise. This feature extraction method based on dimensionality reduction can be an efficient preprocessing step for follow-up data mining on the LAMOST dataset.
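    The PCA extraction and reconstruction step described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: it uses Python with NumPy rather than the R tools named in the paper, and the spectra are synthetic (a single Gaussian absorption line of varying depth plus Gaussian noise), not LAMOST DR2 data. The function name `pca_reconstruct`, the wavelength grid, and the noise level are all assumptions made for illustration.

```python
import numpy as np


def pca_reconstruct(flux, n_components):
    """Project spectra onto the leading principal components and rebuild them.

    flux: array of shape (n_spectra, n_wavelength_bins), one spectrum per row.
    Returns (reconstructed flux, component scores, explained-variance ratios).
    """
    flux = np.asarray(flux, dtype=float)
    mean_spectrum = flux.mean(axis=0)
    centered = flux - mean_spectrum
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    scores = centered @ components.T
    reconstructed = scores @ components + mean_spectrum
    variance = singular_values ** 2
    explained = variance[:n_components] / variance.sum()
    return reconstructed, scores, explained


# Synthetic stand-in for one group of spectra (assumed values, not survey data).
rng = np.random.default_rng(0)
wavelength = np.linspace(3700.0, 9000.0, 500)  # illustrative grid in angstroms
line_profile = np.exp(-0.5 * ((wavelength - 6563.0) / 20.0) ** 2)
depths = rng.uniform(0.2, 0.8, size=200)
clean = 1.0 - depths[:, None] * line_profile[None, :]
noisy = clean + rng.normal(0.0, 0.05, size=clean.shape)

reconstructed, scores, explained = pca_reconstruct(noisy, n_components=1)

# Compare the error against the noise-free spectra before and after PCA.
rms_noisy = np.sqrt(np.mean((noisy - clean) ** 2))
rms_reconstructed = np.sqrt(np.mean((reconstructed - clean) ** 2))
print(f"RMS error before: {rms_noisy:.4f}, after: {rms_reconstructed:.4f}")
```

    Because the synthetic spectra vary along a single direction, one component is enough here. Real spectra would need more components, and the number to keep is a choice this sketch does not address.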

     
