Hunting for exocomet transits in the TESS database using the Random Forest method

1Dobrycheva, DV, 1Vasylenko, MYu., 1Kulyk, IV, 1Pavlenko, Ya.V, 2Shubina, OS, 3Luk’yanyk, IV, 1Korsun, PP
1Main Astronomical Observatory of the National Academy of Sciences of Ukraine, Kyiv, Ukraine
2Main Astronomical Observatory of the National Academy of Sciences of Ukraine, Kyiv, Ukraine; Astronomical Institute of Slovak Academy of Sciences, Tatranska Lomnica, Slovak Republic
3Astronomical Observatory of the Taras Shevchenko National University of Kyiv, Kyiv, Ukraine
Space Sci. & Technol. 2023, 29(6): 068-079
https://doi.org/10.15407/knit2023.06.068
Publication Language: English
Abstract: 
This study introduces an approach to detecting exocomet transits in data from the Transiting Exoplanet Survey Satellite (TESS), specifically in its Sector 1. Because only a few exocomet transits have been detected in observed light curves, assembling a sufficient training sample for machine learning was challenging. We built a dedicated training sample by injecting simulated asymmetric transit profiles into observed light curves, thereby creating realistic data for model training. To analyze these light curves, we used the TSFresh software to extract key features, which were then used to train our Random Forest model.
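The injection step described above can be illustrated with a minimal sketch. The abstract does not specify the functional form of the simulated profile, so the shape used here (a half-Gaussian ingress followed by an exponential recovery tail) is an assumption, as are the function names, the parameter values, and the synthetic noisy series that stands in for an observed TESS light curve.

```python
import math
import random


def asymmetric_dip(t, t0, depth, ingress, tail):
    """Fractional flux drop at time t (assumed shape, not the paper's model).

    Before t0 the dip grows as a half-Gaussian of width `ingress`;
    after t0 it recovers exponentially with time scale `tail`.
    """
    if t < t0:
        return depth * math.exp(-((t - t0) ** 2) / (2.0 * ingress ** 2))
    return depth * math.exp(-(t - t0) / tail)


def inject_transit(times, flux, t0, depth, ingress, tail):
    """Multiply a light curve by (1 - dip) to embed a simulated transit."""
    return [
        f * (1.0 - asymmetric_dip(t, t0, depth, ingress, tail))
        for t, f in zip(times, flux)
    ]


# Synthetic stand-in for an observed, normalized light curve:
# 2-minute cadence (in days) with hypothetical Gaussian noise of 100 ppm.
rng = random.Random(0)
times = [i * 2.0 / 1440.0 for i in range(2000)]
flux = [rng.gauss(1.0, 1e-4) for _ in times]

# Hypothetical transit: 0.5% deep, fast ingress (0.05 d), slow egress (0.2 d).
injected = inject_transit(times, flux, t0=1.0, depth=0.005, ingress=0.05, tail=0.2)
```

Injected curves of this kind would form the positive class of a training set, with unmodified light curves as the negative class; features extracted from both would then be passed to the classifier.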
Considering that cometary transits are typically shallow, with depths of less than 1% of the star's brightness, we restricted our sample using the Combined Differential Photometric Precision (CDPP) parameter. Our study focused on two target samples: light curves with a CDPP of less than 40 ppm and light curves with a CDPP of up to 150 ppm. Each sample was accompanied by a corresponding training set. This methodology achieved an accuracy of approximately 96%, with both precision and recall exceeding 95% and a balanced F1-score of around 96%. This level of accuracy was effective in distinguishing between 'exocomet candidate' and 'non-candidate' classifications for light curves with a CDPP of less than 40 ppm, and our model identified 12 potential exocomet candidates. However, when applying the method to noisier light curves (CDPP up to 150 ppm), we found a significant increase in the number of curves that could not be confidently classified; even so, our model identified 20 potential exocomet candidates.
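For reference, the four quality measures quoted above follow directly from the confusion-matrix counts of a binary classifier. The sketch below shows the standard definitions; the label lists are hypothetical values chosen only for illustration and are not the paper's data.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for a binary classification."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (
        2.0 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical test labels: 1 = 'exocomet candidate', 0 = 'non-candidate'.
y_true = [1] * 50 + [0] * 50
y_pred = [1] * 48 + [0] * 2 + [1] * 2 + [0] * 48

metrics = classification_metrics(y_true, y_pred)
```

Because precision penalizes false candidates and recall penalizes missed ones, reporting both alongside F1 gives a fuller picture than accuracy alone when true exocomet transits are rare.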
These promising results within Sector 1 motivate us to extend our analysis to all TESS sectors in order to detect and study comet-like activity in extrasolar planetary systems.