Dai Wei, Wang Sen, Li Qiuhong, Deng Hui, Mei Ying, Wang Feng. Implementation of SKA1-MID Self-calibrating Pipeline Based on Spark[J]. Astronomical Research and Technology, 2020, 17(3): 334-340.

Implementation of SKA1-MID Self-calibrating Pipeline Based on Spark

More Information
  • Received Date: November 27, 2019
  • Revised Date: December 11, 2019
  • Available Online: November 20, 2023
  • The volume of scientific data generated by the SKA exceeds the processing capabilities of all existing distributed processing systems, so how to implement a distributed execution framework is an important research question in scientific data processing. Using Spark, one of the most mature execution frameworks, this study systematically analyzes how to migrate the iCal pipeline in the Algorithm Reference Library (ARL) to Spark. We analyze and discuss the implementation procedure and present the corresponding task-flow implementation. The final experiments show that the results of iCal on Spark are correct. In summary, Spark can meet the requirements of distributed data processing for certain data, but its own limitations severely restrict its application in the SKA.