Article info

Article history:
Received 10 February 2017
Accepted 25 February 2018
Available online 14 March 2018

Keywords:
Debris disk
WISE
Machine learning
Classification

Abstract

Debris disks around stars other than the Sun have received significant attention in studies of exoplanets, specifically exoplanetary system formation. Since debris disks are major sources of infrared emission, infrared survey data such as the Wide-field Infrared Survey Explorer (WISE) catalog potentially harbor numerous debris disk candidates. However, it is currently challenging to perform disk candidate searches for over 747 million sources in the WISE catalog due to the high probability of false positives caused by interstellar matter, galaxies, and other background artifacts. Crowdsourcing techniques have thus started to harness citizen scientists for debris disk identification since humans can be easily trained to distinguish between desired artifacts and irrelevant noise. With a limited number of citizen scientists, however, increasing data volumes from large surveys will inevitably lead to analysis bottlenecks. To overcome this scalability problem and push the current limits of automated debris disk candidate identification, we present a novel approach that uses citizen science results as a seed to train machine learning based classification. In this paper, we detail a case study with a computer-aided discovery pipeline demonstrating such feasibility based on WISE catalog data and NASA’s Disk Detective project. Our approach to debris disk candidate classification was shown to be robust under a wide range of image quality and features. Our hybrid approach of citizen science with algorithmic scalability can facilitate big data processing for future detections as envisioned in future missions such as the Transiting Exoplanet Survey Satellite (TESS) and the Wide-Field Infrared Survey Telescope (WFIRST).

© 2018 Elsevier B.V. All rights reserved.
1. Introduction

1.1. Background and motivation

Debris disks around stars other than the Sun, sometimes referred to as exozodi or exozodiacal dust due to the similarity to the solar system’s zodiacal cloud, are of interest to scientists as they are essential to understanding the foundation of planetary systems (Morales et al., 2009). These disks are believed to have been formed by collisions between planets and planetesimals, remnants of the planetary formation process (Backman and Paresce, 1993b). Multiple circumstellar debris disks have been directly imaged with high spatial resolution using the Hubble Space Telescope, such as the circumstellar disks around Fomalhaut shown in Fig. 1.

Debris disks have gained significant interest among astronomers due to their importance in exoplanet detection. Stars with circumstellar debris disks are often promising candidates for exoplanet discovery due to the common origin of debris disks and planets as well as their interactions (Janson et al., 2013; Kóspál et al., 2009). Debris disk structures often have gaps and cavities, revealing the possible existence of exoplanets along with constraints on their properties (Janson et al., 2013; Greaves et al., 2005). Multiple exoplanet discoveries have been made around stars with debris disks in recent years (Lisse et al., 2007; Marois et al., 2008; Liseau et al., 2010; Dodson-Robinson et al., 2011; Moór et al., 2013). A positive correlation has been suggested between the presence of exozodi and exoplanets (Kóspál et al., 2009; Raymond et al., 2011). These findings provide information for future surveys dedicated to high-resolution imaging of debris disks to further understand the interaction and correlation between exoplanets and debris disks (Janson et al., 2013). In addition, understanding the properties of extrasolar debris disks is essential for the target selection process of future exoplanet direct imaging and spectroscopy missions due to the dominant photon noise produced by exozodiacal light (Beichman et al., 2006; Weinberger et al., 2015; Kennedy et al., 2014).

* Corresponding author.
E-mail address: tamz@mit.edu (T. Nguyen).
https://doi.org/10.1016/j.ascom.2018.02.004
2213-1337/© 2018 Elsevier B.V. All rights reserved.
T. Nguyen et al. / Astronomy and Computing 23 (2018) 72–82 73
Fig. 2. Summary of the image query framework, which uses data from the NASA exoplanet archive and the WISE image server to download infrared images of the planet–host
stars.
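The query-and-check loop summarized in Fig. 2 could be sketched roughly as follows. This is only an illustration under stated assumptions: the actual image-server request is not reproduced, so `fetch` is a hypothetical caller-supplied function that returns the image bytes for one position and band, and the names `download_cutouts`, `max_retries`, and `wait_s` are invented for this sketch. It shows the two checks the text mentions, re-initiating a failed request and skipping duplicates.

```python
import time


def download_cutouts(targets, fetch, bands=(1, 2, 3, 4), max_retries=3, wait_s=0.0):
    """Fetch one image per (target, band), retrying on server errors.

    targets: iterable of (name, ra_deg, dec_deg) tuples.
    fetch:   hypothetical callable fetch(ra, dec, band) -> bytes; it stands in
             for the real image-server request, which is not shown here.
    Returns (images, failed): a dict keyed by (name, band) and a list of the
    keys that still failed after max_retries attempts.
    """
    images = {}
    failed = []
    seen = set()
    for name, ra, dec in targets:
        for band in bands:
            key = (name, band)
            if key in seen:
                continue  # duplicate request: keep only the first copy
            seen.add(key)
            for _ in range(max_retries):
                try:
                    images[key] = fetch(ra, dec, band)
                    break
                except IOError:
                    time.sleep(wait_s)  # wait, then re-initiate the query
            else:
                failed.append(key)  # all attempts failed
    return images, failed
```

Passing the request function in as an argument keeps the retry and de-duplication logic testable without network access.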
found through NASA’s Exoplanet Archive (NASA, 2017a). This list can be reduced to a list of planet–host stars and their corresponding right ascension (ra) and declination (dec). Lastly, a Python script reads each planet–host star’s ra and dec and sends image requests to the corresponding URL for that star’s images. For each target, 4 images corresponding to the 4 WISE bands are saved locally for subsequent image processing and analysis. Fig. 2 summarizes the data query process. To ensure that all images are downloaded correctly, a checking protocol was implemented to re-initiate the query process when an error occurs on the image server, decide which image cut-outs to keep, and remove duplicate image files.

2.2. Image processing and feature extraction

In this step, image processing and feature extraction techniques are applied to raw images from the WISE catalog to isolate the target and extract relevant features for classification. The image processing pipeline includes three main steps: noise reduction, image segmentation, and central object isolation. In the noise reduction process, a threshold level is automatically selected using a simple mean of gray-scale values in the input image (Glasbey, 1993). Since only the central target is of interest, we implemented an image segmentation technique to find the central object, if it exists, and remove all other resolved objects in the image. To identify each object, a watershed segmentation method was used with markers at local maxima. The image processing implementations used in this analysis are part of the scikit-image Python package (van der Walt et al., 2014). Fig. 3 shows the image processing pipeline as applied to a sample image with 2 objects, a main target at the center of the image and a neighboring off-center object. The image processing steps described above correctly transform the contaminated input image into a clean image of only the target of interest.

Next, feature extraction is applied to the processed images to enable binary decisions on the potential existence of circumstellar disks. Images of an ideal star candidate with a potential debris disk would consist of a bright spot across all 4 bands at the image center that does not extend beyond the diffraction-limit scale, as described in Kuchner et al. (2016). To quantify these features, the figure-of-merit parameters chosen for this analysis are:

• The spot centroid displacement ∆c
• The percentage of spot outside of the diffraction-limit scale p.

The first parameter is used to check whether the location of the infrared (IR) object, if it exists, coincides with that of the input target star to avoid misidentification. The second parameter ensures that the IR object represents a single point source and is not contaminated by other extended IR sources. Similar parameters are employed by Disk Detective as metrics for debris disk detection.

The centroid displacement can be found by comparing the location of the image first moment with the image center, which represents the target star location as specified in the image query. The WISE imaging system diffraction scale was specified as 12′′ at 22 µm (Wright et al., 2010). This value is consistent with diffraction-limited system estimates using the telescope aperture diameter and operating wavelength. In the case study, we added a 20% margin to the diffraction-limited scale to account for systematic variations in the size of the point-spread functions. For each target, a single centroid displacement parameter is computed as the average displacement of the target in the 4 WISE bands, with the exception of images that include invalid pixel values. The second parameter, out-of-diffraction percentage, is applied only to images from WISE band 4, to reduce the probability of a miss due to bright stellar sources at shorter wavelengths. Similar figures of merit are used in the crowd-sourcing project Disk Detective. Fig. 4 shows the two figure-of-merit parameters computed for a sample star target from the WISE catalog. Both original and processed images are presented along with figure-of-merit parameters for each WISE band and overall parameters for the target.

2.3. Robustness evaluation benchmarks

A series of benchmark experiments was conducted to evaluate the performance of the image processing pipeline under varying parameters, including signal-to-noise ratio, neighboring object brightness, and distance to the primary target object. The goals of these benchmark experiments are not only to verify the image processing approach for the case study application but also to provide an evaluation framework for the method’s performance when applied to potential future data sets of given data quality and features.

In this benchmark analysis, controlled images are generated, consisting of two circular objects: one primary target object and one secondary neighboring object, which acts as a contamination source to the primary target. Gaussian random noise is added to each image to simulate noise sources such as background noise and sensor noise. Next, the image processing technique described in Section 2.2, which was designed to reduce the noise level and remove the secondary object such that the primary target can be recovered, is applied to the image. The output image is then compared with the image of the original primary target object to determine whether the primary target has been identified correctly. The experiment is repeated for multiple values of signal-to-noise ratio (SNR), separation between the objects, and relative brightness of the secondary object to the primary object.

The controlled images in this analysis were generated in Python, where each object is modeled as a 2D Gaussian with standard deviation σ, representing the point-spread function size. The resolution of the image is defined by the total number of pixels along one
Fig. 3. Illustration of the image processing pipeline as applied to a sample image with a central object and a secondary off-center object.
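A simplified sketch of the three pipeline stages illustrated in Fig. 3 (mean thresholding, segmentation, central object isolation) is given below. It is not the authors' implementation: to stay dependency-light it swaps the watershed segmentation with local-maxima markers for plain 4-connected component labeling, so two touching objects would be merged here where a watershed could separate them. The function names are invented for this sketch.

```python
from collections import deque

import numpy as np


def label_components(mask):
    """Label 4-connected regions of a boolean mask (stand-in for watershed)."""
    labels = np.zeros(mask.shape, dtype=int)
    rows, cols = mask.shape
    current = 0
    for r0, c0 in zip(*np.nonzero(mask)):
        if labels[r0, c0]:
            continue  # pixel already belongs to a labeled region
        current += 1
        labels[r0, c0] = current
        queue = deque([(r0, c0)])
        while queue:
            r, c = queue.popleft()
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                inside = 0 <= rr < rows and 0 <= cc < cols
                if inside and mask[rr, cc] and not labels[rr, cc]:
                    labels[rr, cc] = current
                    queue.append((rr, cc))
    return labels, current


def isolate_central_object(image):
    """Threshold at the mean, segment, and keep the object nearest the center."""
    image = np.asarray(image, dtype=float)
    mask = image > image.mean()  # noise reduction: simple mean threshold
    labels, count = label_components(mask)
    if count == 0:
        return np.zeros_like(image)  # no object found
    center = (np.array(image.shape) - 1) / 2.0
    best_label, best_dist = 0, np.inf
    for lab in range(1, count + 1):
        rr, cc = np.nonzero(labels == lab)
        dist = np.hypot(rr.mean() - center[0], cc.mean() - center[1])
        if dist < best_dist:
            best_label, best_dist = lab, dist
    # Zero out everything except the central object.
    return np.where(labels == best_label, image, 0.0)
```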
Fig. 4. Original WISE images and processed images of a sample target star in 4 WISE bands with corresponding figure-of-merit parameters for centroid displacement (∆c)
and out-of-diffraction percentage (p).
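The two figure-of-merit parameters shown in Fig. 4 could be computed along the following lines. This is a minimal sketch under assumptions: the diffraction-limit radius is passed in pixels as `radius_px` because the pixel scale is not given in this excerpt (the text only states a 12′′ scale with a 20% margin, i.e. 1.2 × 12′′ = 14.4′′), the band images are assumed to be ordered from band 1 to band 4, and "invalid pixel values" is interpreted as non-finite values.

```python
import numpy as np


def centroid_displacement(image):
    """Distance in pixels between the intensity first moment and the image center."""
    image = np.asarray(image, dtype=float)
    total = image.sum()
    if not np.isfinite(total) or total <= 0:
        return None  # empty or invalid image
    rows, cols = np.indices(image.shape)
    centroid = np.array([(rows * image).sum(), (cols * image).sum()]) / total
    center = (np.array(image.shape) - 1) / 2.0
    return float(np.linalg.norm(centroid - center))


def percent_outside(image, radius_px):
    """Percentage of total intensity lying farther than radius_px from the center."""
    image = np.asarray(image, dtype=float)
    total = image.sum()
    if not np.isfinite(total) or total <= 0:
        return None
    rows, cols = np.indices(image.shape)
    center = (np.array(image.shape) - 1) / 2.0
    dist = np.hypot(rows - center[0], cols - center[1])
    return float(100.0 * image[dist > radius_px].sum() / total)


def target_figures_of_merit(band_images, radius_px):
    """Return (delta_c, p) for one target.

    delta_c: mean centroid displacement over bands without invalid pixels.
    p:       out-of-diffraction percentage from the last image (band 4) only.
    """
    disps = [centroid_displacement(b) for b in band_images
             if np.all(np.isfinite(b))]
    disps = [d for d in disps if d is not None]
    delta_c = float(np.mean(disps)) if disps else None
    p = percent_outside(band_images[-1], radius_px)
    return delta_c, p
```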
Fig. 5. Illustration of benchmark image properties: (a) object separation ∆r, spot size σ, number of pixels N, and (b) signal-to-noise ratio (SNR).

The benchmark results are shown in Fig. 7. Each subfigure shows the contour plot of detection probability for a specific SNR value with varying separation normalized to the image size (∆r/N) and relative brightness (M1/M2). The results show that detection can be achieved reliably for ∆r/N > 0.3 (∆r > 3σ). When the two objects are too close together, their combined image resembles a single extended object and is treated correspondingly by the image processing pipeline. The detection probability is improved when M1/M2 ≈ 1 and degrades when there is an imbalance between the magnitudes of the 2 objects due to the nature of the local
Fig. 6. Example of a successful detection of a central object (SNR= 20 dB, N = 36 pixels, ∆r /N = 0.2, M2 /M1 = 0.5).
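A benchmark image with the parameters quoted in Fig. 6 might be generated as sketched below. Several choices are assumptions rather than details from the paper: the SNR in dB is taken as the ratio of the primary's mean signal power to the noise variance, the secondary is offset along the column axis, and the default σ = N/10 is inferred from the text's equivalence between ∆r/N > 0.3 and ∆r > 3σ. The function names are invented for this sketch.

```python
import numpy as np


def gaussian_spot(n, center, sigma, amplitude):
    """2D Gaussian of the given peak amplitude on an n x n grid."""
    rows, cols = np.indices((n, n))
    r2 = (rows - center[0]) ** 2 + (cols - center[1]) ** 2
    return amplitude * np.exp(-r2 / (2.0 * sigma ** 2))


def benchmark_image(n=36, sigma=3.6, sep_frac=0.2, m2_over_m1=0.5,
                    snr_db=20.0, seed=0):
    """Return (noisy, primary): the contaminated image and the clean primary.

    sep_frac is the separation normalized to image size (delta_r / N);
    m2_over_m1 is the secondary-to-primary peak brightness ratio.
    """
    rng = np.random.default_rng(seed)
    c = (n - 1) / 2.0
    primary = gaussian_spot(n, (c, c), sigma, 1.0)
    secondary = gaussian_spot(n, (c, c + sep_frac * n), sigma, m2_over_m1)
    # Assumed SNR definition: 10 * log10(signal power / noise variance).
    signal_power = np.mean(primary ** 2)
    noise_sigma = np.sqrt(signal_power / 10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, noise_sigma, size=(n, n))
    return primary + secondary + noise, primary
```

Returning the clean primary alongside the noisy image makes it straightforward to compare a pipeline's output against the ground truth, as the benchmark procedure describes.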
3. Classification approach
3.2. Classifier
Fig. 8. Decision boundaries generated with specified training sets from various classifiers: (1) Support Vector Machine (SVM) with radial basis function (RBF) kernel, (2) SVM with linear kernel, (3) Gaussian Process, (4) Gaussian Naive Bayes (NB), (5) Logistic Regression, and (6) Quadratic Discriminant Analysis (QDA).

Fig. 9. Representative images of planet–host star locations, queried from the WISE image archive, illustrating a variety of features: (a) single central object in all bands, (b) single central object, saturated in band 1–2, (c) one neighboring object resolved in band 1–3, contamination in band 4, (d) multiple neighboring objects resolved in band 1–2, noisy in band 3–4, (e) single object in band 1–2, neighboring object resolved in band 3, contamination/misidentification in band 4, (f) single central object in band 1–3, extended background object in band 4.
Fig. 11. Classification with support vector machines (SVM) with radial basis function (RBF) kernel (ν = 0.5). The classification boundary obtained by machine learning separates the “good” and “bad” regions in the parameter space and shows promising results for our known data. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
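For reference, the standard decision rule evaluated by an RBF-kernel SVM can be written as below. The feature vector x = (∆c, p) is an assumption based on the figure-of-merit parameters of Section 2.2, and the labels y_i = +1 for “good” and −1 for “bad” are a convention chosen here; the coefficients α_i, the offset b, and the kernel width γ are determined during training and are not reported in this excerpt.

```latex
K(\mathbf{x}, \mathbf{x}') = \exp\!\left(-\gamma \,\lVert \mathbf{x} - \mathbf{x}' \rVert^{2}\right),
\qquad \gamma > 0

f(\mathbf{x}) = \operatorname{sign}\!\left( \sum_{i \in \mathrm{SV}} \alpha_i \, y_i \, K(\mathbf{x}_i, \mathbf{x}) + b \right),
\qquad y_i \in \{-1, +1\}
```

In the ν-SVM formulation, ν ∈ (0, 1] is an upper bound on the fraction of margin errors and a lower bound on the fraction of support vectors, so ν = 0.5 means at least half of the training points act as support vectors.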
Fig. 13. A Graphical User Interface (GUI) showing candidate ranks and a visual inspection of processed images in the data discovery pipeline.
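A ranked spreadsheet like the candidate list in Fig. 13 could be assembled as in the sketch below. The column names follow the ranking metrics named later in the text, but the sort order (smallest displacement first, then smallest out-of-diffraction percentage) is a hypothetical choice made for illustration, since how the metrics are combined into a rank is not specified in this excerpt. The function names and the "Rank" and "Catalog Index" columns are likewise invented here.

```python
import csv
import io

METRICS = [
    "Number of Central Targets",
    "Main Target Coordinates",
    "Main Target Displacement",
    "Percent Outside Diffraction",
    "Threshold Value",
    "Percent of Image White",
]
FIELDS = ["Rank", "Catalog Index"] + METRICS


def rank_targets(records):
    """Sort records by an assumed criterion: displacement, then percent outside."""
    return sorted(
        records,
        key=lambda r: (r["Main Target Displacement"],
                       r["Percent Outside Diffraction"]),
    )


def write_spreadsheet(records, handle):
    """Write ranked records as CSV rows to an open text handle."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    for rank, record in enumerate(rank_targets(records), start=1):
        writer.writerow({"Rank": rank, **record})


def to_csv_text(records):
    """Convenience wrapper returning the ranked spreadsheet as a string."""
    buffer = io.StringIO()
    write_spreadsheet(records, buffer)
    return buffer.getvalue()
```

Keeping every metric as its own column lets a researcher re-sort or filter on whichever feature matters for a given follow-up study.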
Table 1
Debris-disk candidates generated with the computer-aided discovery approach.
# Star name/ID # Star name/ID # Star name/ID # Star name/ID # Star name/ID # Star name/ID
1 11 Com 63 HD 111232 125 HD 160691 187 HD 216437 249 HD 45364 311 HD 93083
2 14 And 64 HD 111998 126 HD 16175 188 HD 216536 250 HD 45652 312 HD 9446
3 16 Cyg B 65 HD 113337 127 HD 163607 189 HD 216770 251 HD 46375 313 HD 95086
4 18 Del 66 HD 114613 128 HD 16417 190 HD 217107 252 HD 47186 314 HD 95089
5 24 Sex 67 HD 114729 129 HD 164509 191 HD 217786 253 HD 4732 315 HD 95127
6 30 Ari B 68 HD 114783 130 HD 164595 192 HD 219415 254 HD 47536 316 HD 96063
7 4 UMa 69 HD 11506 131 HD 165155 193 HD 219828 255 HD 48265 317 HD 96127
8 42 Dra 70 HD 116029 132 HD 1666 194 HD 220074 256 HD 49674 318 HD 96167
9 47 UMa 71 HD 117207 133 HD 166724 195 HD 220689 257 HD 50499 319 HD 97658
10 51 Eri 72 HD 11755 134 HD 167042 196 HD 220773 258 HD 50554 320 HD 98219
11 51 Peg 73 HD 117618 135 HD 168443 197 HD 220842 259 HD 52265 321 HD 98649
12 6 Lyn 74 HD 118203 136 HD 1690 198 HD 221287 260 HD 5319 322 HD 99706
13 75 Cet 75 HD 11977 137 HD 169830 199 HD 222076 261 HD 5583 323 HIP 105854
14 8 UMi 76 HD 120084 138 HD 170469 200 HD 222155 262 HD 5608 324 HIP 107773
15 81 Cet 77 HD 121504 139 HD 17092 201 HD 222582 263 HD 564 325 HIP 116454
16 AB Pic 78 HD 12484 140 HD 171028 202 HD 224538 264 HD 5891 326 HIP 14810
17 BD+03 2562 79 HD 125612 141 HD 171238 203 HD 224693 265 HD 59686 A 327 HIP 57274
18 BD+15 2375 80 HD 12648 142 HD 17156 204 HD 23079 266 HD 60532 328 HIP 63242
19 BD+15 2940 81 HD 12661 143 HD 173416 205 HD 23127 267 HD 63454 329 HIP 65407
20 BD+20 1790 82 HD 126614 144 HD 175167 206 HD 23596 268 HD 65216 330 HIP 65426
21 BD+20 2457 83 HD 128356 145 HD 17674 207 HD 240210 269 HD 66141 331 HIP 65891
22 BD+20 274 84 HD 129445 146 HD 177565 208 HD 240237 270 HD 66428 332 HIP 67537
23 BD+48 738 85 HD 130322 147 HD 177830 209 HD 24040 271 HD 67087 333 HIP 67851
24 BD+49 828 86 HD 131496 148 HD 179079 210 HD 24064 272 HD 6718 334 HIP 70849
25 BD-06 1339 87 HD 131664 149 HD 179949 211 HD 25171 273 HD 68402 335 HIP 74890
26 BD-13 2130 88 HD 13189 150 HD 180314 212 HD 2638 274 HD 68988 336 HIP 78530
27 CT Cha 89 HD 132406 151 HD 180902 213 HD 27442 275 HD 70642 337 HIP 79431
28 DH Tau 90 HD 132563 152 HD 181342 214 HD 27631 276 HD 7199 338 HIP 8541
29 GJ 3470 91 HD 134987 153 HD 181433 215 HD 28185 277 HD 72659 339 HIP 91258
30 GJ 504 92 HD 136418 154 HD 183263 216 HD 28254 278 HD 73256 340 HIP 97233
31 GJ 676 A 93 HD 13908 155 HD 185269 217 HD 28678 279 HD 73267 341 HN Peg
32 GQ Lup 94 HD 13931 156 HD 187085 218 HD 29021 280 HD 73534 342 HR 2562
33 HAT-P-11 95 HD 139357 157 HD 187123 219 HD 290327 281 HD 74156 343 HR 8799
34 HAT-P-2 96 HD 14067 158 HD 18742 220 HD 2952 282 HD 7449 344 KELT-11
35 HD 100655 97 HD 141399 159 HD 188015 221 HD 30177 283 HD 75289 345 KELT-2 A
36 HD 100777 98 HD 141937 160 HD 189733 222 HD 30669 284 HD 75784 346 KELT-9
37 HD 10180 99 HD 142245 161 HD 190647 223 HD 30856 285 HD 75898 347 Kepler-21
38 HD 101930 100 HD 142415 162 HD 190984 224 HD 31253 286 HD 76700 348 Kepler-408
39 HD 102117 101 HD 143105 163 HD 191806 225 HD 32963 287 HD 7924 349 Kepler-409
40 HD 102195 102 HD 143361 164 HD 192263 226 HD 33142 288 HD 79498 350 Kepler-410 A
41 HD 102272 103 HD 145377 165 HD 192699 227 HD 33283 289 HD 80606 351 LkCa 15
42 HD 102329 104 HD 145457 166 HD 196050 228 HD 33564 290 HD 81040 352 NGC 2682 Sand 364
43 HD 102956 105 HD 145934 167 HD 196885 229 HD 33844 291 HD 81688 353 NGC 2682 Sand 978
44 HD 103197 106 HD 147513 168 HD 19994 230 HD 34445 292 HD 82886 354 ROXs 12
45 HD 103720 107 HD 147873 169 HD 200964 231 HD 35759 293 HD 82943 355 ROXs 42 B
46 HD 103774 108 HD 148427 170 HD 203030 232 HD 37605 294 HD 83443 356 TYC 3667-1280-1
47 HD 10442 109 HD 149026 171 HD 2039 233 HD 38283 295 HD 8535 357 TYC 4282-00605-1
48 HD 104985 110 HD 149143 172 HD 204313 234 HD 38529 296 HD 85390 358 WASP-18
49 HD 106252 111 HD 1502 173 HD 204941 235 HD 38801 297 HD 8574 359 WASP-33
50 HD 106270 112 HD 150706 174 HD 205739 236 HD 40307 298 HD 86081 360 WASP-7
51 HD 10647 113 HD 152581 175 HD 206610 237 HD 40979 299 HD 86226 361 WASP-8
52 HD 106906 114 HD 154857 176 HD 20782 238 HD 41004 A 300 HD 86264 362 bet Pic
53 HD 10697 115 HD 155233 177 HD 208487 239 HD 41004 B 301 HD 8673 363 kap And
54 HD 107148 116 HD 155358 178 HD 208527 240 HD 4113 302 HD 86950 364 ome Ser
55 HD 108147 117 HD 156279 179 HD 209458 241 HD 42012 303 HD 87646 365 omi CrB
56 HD 108341 118 HD 156411 180 HD 210702 242 HD 4203 304 HD 87883 366 tau Boo
57 HD 108863 119 HD 156668 181 HD 212301 243 HD 4208 305 HD 88133 367 xi Aql
58 HD 108874 120 HD 156846 182 HD 212771 244 HD 4313 306 HD 89307
59 HD 109246 121 HD 158038 183 HD 213240 245 HD 43197 307 HD 89744
60 HD 109271 122 HD 159243 184 HD 214823 246 HD 43691 308 HD 90156
61 HD 109749 123 HD 159868 185 HD 215497 247 HD 44219 309 HD 9174
62 HD 110014 124 HD 1605 186 HD 216435 248 HD 45350 310 HD 92788
Strasbourg, 2016). The resulting catalog contained approximately 2.5 million targets.

For a proof of concept, images of a randomly selected subset of these targets were downloaded in all available bands for the WISE, 2MASS, and DSS surveys. Even though the DSS survey catalog was not included in the initial cross reference, it was found to have imaged all targets in the subset catalog in at least one band. The image download is by far the rate-limiting factor in developing catalogs, and it is the main concern preventing scaling up to a larger subset or analyzing the entire constructed catalog.

Each target is ranked based on the following metrics:

1. The number of targets in the image recognized by the processing algorithm with centers within the diffraction-limited circles (“Number of Central Targets” column in Fig. 13)
2. The identifying coordinates of the target with the center closest to the center of the image (“Main Target Coordinates”)
3. The displacement of the center of the central target from the image center (“Main Target Displacement”)
4. The percentage of the main target that extends beyond the diffraction-limited circle (“Percent Outside Diffraction”)
5. The value returned by the thresholding algorithm in the initial processing (“Threshold Value”)
6. The percentage of original pixels with values equal to or above that threshold (“Percent of Image White”).

A number of other data are included in the returned analysis of each target for the purposes of identification, for example, the right ascension and declination of the object as well as the file path of its related unprocessed images and 0-indexed position in the input catalog. We generate a spreadsheet of data including all ranking metrics to allow researchers to select the features most important to their research and/or to condition on other constraints, such as identifying all candidates present in relatively noisy or quiet regions (determined by “Number of Central Targets”, see Fig. 13).

The ranked data approach has proven useful for (a) quick elimination of undesirable targets, and (b) metrics of fit somewhere between ideal candidates for follow-up and bad targets to exclude from future analysis. Ideally, it would be possible for researchers to identify the best candidates to look at in a given region of the sky to expedite follow-up confirmation.

5.2. Graphical user interface

We developed a graphical user interface to allow scientists a more hands-on overview of the analysis. Users can open catalogs, download images, and run the discovery analysis on selected targets of interest. Currently, the GUI analyzes a data set one target at a time, which allows users to easily and quickly view all of the existing information for a particular star without analyzing an entire catalog, providing a useful alternative to the large-scale processing approach. The GUI also has the unique ability to generate a display of the processed images associated with each stellar target. This functionality is useful for “debugging” and better understanding why certain candidates are chosen. The ability to return to the processed images is useful in further examination of anomalous data points as well as refinement of thresholds in the analysis process. A screenshot is shown in Fig. 13, which also illustrates a ranked list of candidates with their corresponding ranking parameters.

6. Conclusion and outlook

This article presents a study on a computer-aided approach for debris disk candidate search that leverages data from several astronomical databases and crowd-sourcing results. The case study presented here employs crowd-sourcing data from NASA’s Disk Detective project to train machine learning algorithms and achieve scalability on a larger data set, providing predictions of debris disk candidates from the list of all planet–host stars. Furthermore, the algorithmic approach also facilitates the easy creation of ranking lists with debris disk candidates, which can help human scientists query candidates with different properties and assess potential probabilities for detections in follow-up studies. To our knowledge, this is the first attempt to combine crowd sourcing and machine learning for debris disk search in infrared bands in the presented way.

With upcoming missions such as the Transiting Exoplanet Survey Satellite (TESS), the Wide-Field Infrared Survey Telescope (WFIRST), as well as the James Webb Space Telescope (JWST), the astronomical data volume will inevitably lead to a stronger need for automation. The case study presented in this work demonstrates that this hybrid approach has the potential to provide target candidates for these missions as well as to support astronomical discoveries from future data. Lastly, as more data is being collected and validated, better training sets for machine learning and additional human classification records are going to increase the fidelity of the approach presented in this paper.

Acknowledgments

We would like to thank Dr. Marc Kuchner for discussions about Disk Detective as well as for providing data and access to the Disk Detective Web site. We also acknowledge support from the National Science Foundation ACI-1442997 and NASA AIST14 NNX15AG84G for computer-aided discovery and the National Science Foundation Graduate Research Fellowship Grant No. 1122374. The results of this paper are based on projects completed during the 2015 graduate-level Astroinformatics course at MIT Department of Earth, Atmospheric and Planetary Sciences (EAPS) and the 2015 summer MIT Undergraduate Research Opportunities Program (UROP) supporting the work of Laura Eckman. This research has made use of the NASA Exoplanet Archive, which is operated by the California Institute of Technology, under contract with the National Aeronautics and Space Administration under the Exoplanet Exploration Program.

Appendix

See Table 1.

References

Aumann, H., Beichman, C., Gillett, F., De Jong, T., Houck, J., Low, F., Neugebauer, G., Walker, R., Wesselius, P., 1984. Discovery of a shell around Alpha Lyrae. Astrophys. J. 278, L23–L27.
Backman, D.E., Paresce, F., 1993a. Main-sequence stars with circumstellar solid material-the Vega phenomenon. In: Protostars and planets III, Vol. 1, pp. 1253–1304.
Backman, D., Paresce, F., 1993b. Protostars and Planets III, ed.
Beichman, C., Bryden, G., Stapelfeldt, K., Gautier, T., Grogan, K., Shao, M., Velusamy, T., Lawler, S., Blaylock, M., Rieke, G., et al., 2006. New debris disks around nearby main-sequence stars: impact on the direct detection of planets. Astrophys. J. 652 (2), 1674.
Chang, C.-C., Lin, C.-J., 2011. LIBSVM: a library for support vector machines. ACM Trans. Intell. Syst. Technol. (TIST) 2 (3), 27.
Chen, C.H., Sargent, B., Bohac, C., Kim, K., Leibensperger, E., Jura, M., Najita, J., Forrest, W., Watson, D., Sloan, G., et al., 2006. Spitzer IRS spectroscopy of IRAS-discovered debris disks. Astrophys. J. Suppl. Ser. 166 (1), 351.
Daniel, K., 2015. Disk detective: Crowdsourcing new planets. August 11. https://www.citizenscience.gov/2015/08/11/disk-detective/.
Dodson-Robinson, S.E., Beichman, C., Carpenter, J.M., Bryden, G., 2011. A Spitzer infrared spectrograph study of debris disks around planet-host stars. Astron. J. 141 (1), 11.
Glasbey, C.A., 1993. An analysis of histogram-based thresholding algorithms. CVGIP, Graph. Models Image Process. 55 (6), 532–537.
Greaves, J., Holland, W., Wyatt, M., Dent, W., Robson, E., Coulson, I., Jenness, T., Moriarty-Schieven, G., Davis, G., Butner, H., et al., 2005. Structure in the ε Eridani debris disk. Astrophys. J. Lett. 619 (2), L187.
Høg, E., Fabricius, C., Makarov, V.V., Urban, S., Corbin, T., Wycoff, G., Bastian, U., Schwekendiek, P., Wicenec, A., 2000. The Tycho-2 catalogue of the 2.5 million brightest stars. Astron. Astrophys. 355, L27–L30.
Janson, M., Brandt, T.D., Moro-Martín, A., Usuda, T., Thalmann, C., Carson, J.C., Goto, M., Currie, T., McElwain, M., Itoh, Y., et al., 2013. The SEEDS direct imaging survey for planets and scattered dust emission in debris disk systems. Astrophys. J. 773 (1), 73.
Kennedy, G.M., Wyatt, M.C., Bailey, V., Bryden, G., Danchi, W.C., Defrère, D., Haniff, C., Hinz, P.M., Lebreton, J., Mennesson, B., et al., 2015. EXO-zodi modeling for the large binocular telescope interferometer. Astrophys. J. 216 (2), 23.
Kóspál, Á., Ardila, D.R., Moór, A., Ábrahám, P., 2009. On the relationship between debris disks and planets. Astrophys. J. Lett. 700 (2), L73.
Kuchner, M., 2016. NASA Disk Detective. https://www.diskdetective.org, last accessed September 2016.
Kuchner, M.J., Silverberg, S.M., Bans, A.S., Bhattacharjee, S., Kenyon, S.J., Debes, J.H., Currie, T., Garcia, L., Jung, D., Lintott, C., et al., 2016. Disk detective: Discovery of new circumstellar disk candidates through citizen science. Astrophys. J. 830 (2), 84.
Kuchner, M.J., Silverberg, S., Bans, A., Team, D.D., 2015. Diskdetective.org: The first 1,000,000 classifications. In: American Astronomical Society Meeting Abstracts, Vol. 225.
Liseau, R., Eiroa, C., Fedele, D., Augereau, J.-C., Olofsson, G., González, B., Maldonado, J., Montesinos, B., Mora, A., Absil, O., et al., 2010. Resolving the cold debris disc around a planet-hosting star-PACS photometric imaging observations of q1 Eridani (HD 10647, HR 506). Astron. Astrophys. 518, L132.
Lisse, C., Beichman, C., Bryden, G., Wyatt, M., 2007. On the nature of the dust in the debris disk around HD 69830. Astrophys. J. 658 (1), 584.
Marois, C., Macintosh, B., Barman, T., Zuckerman, B., Song, I., Patience, J., Lafrenière, D., Doyon, R., 2008. Direct imaging of multiple planets orbiting the star HR 8799. Science 322 (5906), 1348–1352.
Moór, A., Ábrahám, P., Kóspál, Á., Szabó, G.M., Apai, D., Balog, Z., Csengeri, T., Grady, C., Henning, T., Juhász, A., et al., 2013. A resolved debris disk around the candidate planet-hosting star HD 95086. Astrophys. J. Lett. 775 (2), L51.
Morales, F.Y., Werner, M., Bryden, G., Plavchan, P., Stapelfeldt, K., Rieke, G., Su, K., Beichman, C., Chen, C., Grogan, K., et al., 2009. Spitzer mid-IR spectra of dust debris around A and late B type stars: asteroid belt analogs and power-law dust distributions. Astrophys. J. 699 (2), 1067.
NASA, 2017a. NASA exoplanet archive. http://exoplanetarchive.ipac.caltech.edu/docs/program_interfaces.html (last accessed 16.08.17).
NASA, 2017b. NASA/IPAC Infrared Science Archive. http://irsa.ipac.caltech.edu/ibe/index.html (last accessed 16.08.17).
Pankratius, V., Li, J., Gowanlock, M., Blair, D.M., Rude, C., Herring, T., Lind, F., Erickson, P.J., Lonsdale, C., 2016. Computer-aided discovery: Toward scientific insight generation with machine support. IEEE Intell. Syst. 31 (4), 3–10.
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, E., 2011. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830.
Raymond, S.N., Armitage, P.J., Moro-Martín, A., Booth, M., Wyatt, M.C., Armstrong, J.C., Mandell, A.M., Selsis, F., West, A.A., 2011. Debris disks as signposts of terrestrial planet formation. Astron. Astrophys. 530, A62.
Su, K., Rieke, G., Misselt, K., Stansberry, J., Moro-Martin, A., Stapelfeldt, K., Werner, M., Trilling, D., Bendo, G., Gordon, K., et al., 2005. The Vega debris disk: A surprise from Spitzer. Astrophys. J. 628 (1), 487.
University of Strasbourg, 2016. CDS X-match service. http://cdsxmatch.u-strasbg.fr/xmatch (last accessed 05.09.16).
van der Walt, S., Schönberger, J.L., Nunez-Iglesias, J., Boulogne, F., Warner, J.D., Yager, N., Gouillart, E., Yu, T., 2014. scikit-image: image processing in Python. PeerJ 2, e453.
Weinberger, A.J., Bryden, G., Kennedy, G.M., Roberge, A., Defrère, D., Hinz, P.M., Millan-Gabet, R., Rieke, G., Bailey, V.P., Danchi, W.C., et al., 2015. Target selection for the LBTI exozodi key science program. Astrophys. J. 216 (2), 24.
Wright, E.L., Eisenhardt, P.R., Mainzer, A.K., Ressler, M.E., Cutri, R.M., Jarrett, T., Kirkpatrick, J.D., Padgett, D., McMillan, R.S., Skrutskie, M., et al., 2010. The Wide-field Infrared Survey Explorer (WISE): mission description and initial on-orbit performance. Astron. J. 140 (6), 1868.