Ep. 637: Machine Learning in Astronomy

Computers are a big part of astronomy, but mostly they’ve been relegated to doing calculations. Recent developments in machine learning have changed everything, giving computers the ability to do jobs that humans could only do in the past.

Show Notes

Presbyopia (Mayo Clinic)

What is Machine Learning? (IBM)

What is Artificial Intelligence? (IBM)

Background On Bennu Mappers: Rocks All The Way Down (CosmoQuest)

Galaxy Zoo (Zooniverse)

Sloan Digital Sky Survey

Dogs, Wolves, Data Science, and Why Machines Must Learn Like Humans Do (Hackernoon)

What is Cloud Computing? (IBM)


What is Facial Recognition – Definition and Explanation (Kaspersky)

Explained: Neural networks (MIT)

A Beginner’s Guide to Generative Adversarial Networks (GANs) (Pathmind)

This amazing AI tool lets you create human faces from scratch (Fast Company)

Twitter’s Photo Crop Algorithm Favors White Faces and Women (Wired)

Legacy Survey of Space and Time (LSST) (Stanford University)

PDF: Machine Learning in Space Weather (NOAA)

How one woman is using machine learning to help NASA track asteroids (Google)

New Deep Learning Method Adds 301 Planets to Kepler’s Total Count (NASA JPL)

Using Machine Learning to Help Track Bolides (Really Bright Meteors) (SETI Institute)

Vera Rubin Observatory

Cisco Systems

BL Lacertae (AAVSO)

Hypothetical Planet X (NASA)

Dark Energy Survey

Fast Radio Bursts (Swinburne University)

MeerKAT radio telescope (SARAO)

Hubble Eyes Hanny’s Voorwerp (Universe Today)

Transcriptions provided by GMR Transcription Services

Frasier:                        AstronomyCast Episode 637. Machine Learning in Astronomy. Welcome to AstronomyCast, your weekly facts-based journey through the cosmos where we help you understand not only what we know but how we know what we know. I’m Frasier Cain, publisher of Universe Today, and with me as always, Dr. Pamela Gay, a senior scientist for the Planetary Science Institute, and the director of CosmoQuest. Hey Pamela! How you doing?

Dr. Pamela Gay:         I am doing well. I have given in to presbyopia, and I have gotten glasses, and –

Frasier:                        Yeah.

Dr. Pamela Gay:         – as someone put it, the trees, they now have leaves.

Frasier:                        I’m ready. I’m ready. It’s my time as well. There’s only so far my arms will let me –

Dr. Pamela Gay:         Yeah.

Frasier:                        – stretch my phone away from my face. Yeah. Yeah, getting old sucks.

Dr. Pamela Gay:         But presbyopia is such a cool word. So, if you’re gonna have something go wrong, may it at least have a really cool word to describe it.

Frasier:                        Right, yeah. If only all medical and I guess age conditions were like that.

Dr. Pamela Gay:         Yeah.

Frasier:                        Computers are a big part of Astronomy. But mostly, they’ve been relegated to doing calculations. But recent developments in machine learning have changed everything, giving computers the ability to do jobs that humans could only do in the past. I guess it’s not that surprising that computers, machine learning specifically, are starting to show up across the field of Astronomy. Computers have been tied in so closely to everything that astronomers do that it’s just a matter of time before I guess the artificial intelligence shows up to really run things.

Dr. Pamela Gay:         Well, and it’s getting to the point where the rate at which we’re acquiring data is growing exponentially.

Frasier:                        Yeah.

Dr. Pamela Gay:         And the number of astronomers in the field is not growing. And so, since we don’t have more astronomers to deal with all of the data, we’ve really got three choices: Students – again, not enough of them – the general public – and we do totally use the general public. Go listen to one of our episodes on citizen science. But again, not enough of them, ‘cause –

Frasier:                        Mm-hmm.

Dr. Pamela Gay:         – exponential growth. And this is where we have to turn to computers and hope the processing speeds continue to grow at a rate that allows us to keep up with that flux of data.

Frasier:                        Now, it’s funny, you are uniquely positioned to talk about this situation, because you have been at the forefront of citizen science of identifying things that computers can’t do that humans can do, and are happy to do, or will begrudgingly do if you let them know that the science is important, and that –

Dr. Pamela Gay:         Yeah.

Frasier:                        – these rocks really do need to be mapped.

Dr. Pamela Gay:         Hello!

Frasier:                        Hello rocks! But just this idea that there’s things that computers are really bad at. And so, –

Dr. Pamela Gay:         Yeah.

Frasier:                        – let’s organize a whole bunch of human beings to do the things that humans can still do that computers can’t do. Is that landscape starting to change from your perspective?

Dr. Pamela Gay:         It’s changing in terms of how we are interfacing with volunteers. Once upon a time back in the days that the Galaxy Zoo project entered the battlefield of human versus dataset, there was an issue where the Sloan Digital Sky Survey had so many hundreds of thousands of galaxies in it that there was no way poor Kevin Schawinski was going to as part of his dissertation get through hand marking all of the galaxies –

Frasier:                        Yeah.

Dr. Pamela Gay:         – without madness occurring.

Frasier:                        Right. It is a job for grad students. But still, –

Dr. Pamela Gay:         Yeah.

Frasier:                        – there weren’t enough of those.

Dr. Pamela Gay:         And at that point in time, software was nowhere near being capable of being trained to mark galaxies. This was about the point that one of my colleagues when she was working on her dissertation back in the days of Myspace and LiveJournal posted something along the lines of, “My software has discovered that wherever there are lizards, there are sailboats. Except sometimes, there are mountains.”

Because the training set for the software it turned out where they were marking lizard, lizard, lizard, all of the triangles in the training set were sailboats. Except when they got to other data sets. There were triangles that were mountains. And so, we don’t always know what we have used to train our machine learning algorithms. And they can make fabulous mistakes.

Frasier:                        Right.

Dr. Pamela Gay:         “Wolves are dogs that have snow around them” came the results of one algorithm.

Frasier:                        Right.

Dr. Pamela Gay:         And so, at that point in time, we couldn’t get software to say, “This is a galaxy, and this is what the galaxy is doing.” Now, luckily with the advent of better and better processing, and more importantly, the advent of cloud computing, we are able to spin up special systems that are designed to learn better, faster… not cheaper. They’re so expensive. But better and –

Frasier:                        Mm-hmm.

Dr. Pamela Gay:         – faster, –

Frasier:                        Yeah.

Dr. Pamela Gay:         – by taking the training sets from human beings, and saying, “Okay, based on this, I actually now have the capacity to figure out what galaxies are.” So now, instead of having strictly the human being saying, “This is a galaxy that is behaving in this way,” the software goes through and goes, “This one is shaped like this. This one is shaped like this. This one, low probability on my understanding it. Kick it back to the humans.”

Frasier:                        Mm-hmm.

Dr. Pamela Gay:         So, the humans are now the trainers and the fixers. And –

Frasier:                        Hmm.

Dr. Pamela Gay:         – this frees up the humans to work on solving more and more problems. And this is useful, because there are some problems like we had with Bennu, where the entire data set that you need to classify to in this case, find a safe place for OSIRIS-REx to boop an asteroid, the entire dataset wasn’t big enough to train an algorithm. So, the humans had to do the entire thing. But when the dataset is big enough, now the humans just train it, and then fix its mistakes.
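
The hand-off described above, where the software keeps the classifications it is confident in and kicks the rest back to the humans, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Classification` record, the `triage` function, the object IDs, and the 0.9 probability cutoff are all hypothetical and do not come from Galaxy Zoo, CosmoQuest, or any real pipeline.

```python
from dataclasses import dataclass


@dataclass
class Classification:
    """One model output: hypothetical fields, for illustration only."""
    object_id: str
    label: str
    probability: float  # the model's confidence in its own label


def triage(classifications, threshold=0.9):
    """Split model outputs into accepted labels and items for human review.

    The threshold is an assumed value; a real project would tune it.
    """
    accepted, needs_human = [], []
    for c in classifications:
        if c.probability >= threshold:
            accepted.append(c)
        else:
            needs_human.append(c)
    return accepted, needs_human


# Made-up example outputs from a galaxy-shape classifier.
results = [
    Classification("gal-001", "spiral", 0.97),
    Classification("gal-002", "elliptical", 0.93),
    Classification("gal-003", "spiral", 0.55),
]

accepted, needs_human = triage(results)
print("accepted:", [c.object_id for c in accepted])
print("kicked back to humans:", [c.object_id for c in needs_human])
```

In a loop like this, the human answers for the low-confidence items would plausibly be added to the training set, which is the "trainers and fixers" role described in the conversation.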

Frasier:                        So, let’s talk about fundamentally how… I guess how machine learning works, and then sort of how it’s used in Astronomy.

Dr. Pamela Gay:         Machine learning is very much a black box. Even the people who do machine learning, although some of the latest articles are hinting at new ways to look in at what the computers are making their choices off of. But the basic idea is there are algorithms that when you show them an image that has the thing that you are interested in in the image – so, you have to divide things up into stamps for the software still – so, you divide it up.

And the software then looks at your image, and says, “Based on the 10,000 to 1,000,000 images like this that I’ve looked at, I am going to” – say it’s doing facial recognition on a human – “look at the separation of the eyes, the eye-to-chin ratio, and look at all of these different things, and identify who is in the image, what is in the image.” And the places where we still run into trouble – and machine learning gets used for many other things. I’m focused very very narrowly on computer vision right now –

Frasier:                        Mm-hmm.

Dr. Pamela Gay:         – where the software still struggles is we can’t just give it a mosaic of the moon and say, “Go forth and find all of the craters.” We have to instead identify all of the craters through some other means, and then, once the craters are identified, try and get software that will correctly identify the edges of –

Frasier:                        Right.

Dr. Pamela Gay:         – craters. And we can still only get that at the 90-something percent accuracy level, which isn’t good enough for a lot of what –

Frasier:                        No.

Dr. Pamela Gay:         – we’d like to do.

Frasier:                        Yeah, and underlying the way these algorithms work – and I mean, there are whole podcasts that go into how machine learning algorithms –

Dr. Pamela Gay:         Yeah.

Frasier:                        – work – but generally, you’ve got a neural network – and again, that’s a whole other conversation –

Dr. Pamela Gay:         Mm-hmm.

Frasier:                        – that you are teaching it how to do various things. And so, you show it a picture, and you say, “This is a crater.” And then, you showed another picture and go, “This is a crater.” And then, you show it a million more, and go, “These are craters.” And that’s sort of one part to it. And then, the other part that sort of has really kind of taken off is this idea of an adversarial network.

Dr. Pamela Gay:         Yes.

Frasier:                        So, you have one computer whose job is to recognize craters, and then another computer whose job is to tell whether or not the first computer is doing a good job of recognizing craters. And so, one is good at generating something, and the other one is good at recognizing something. You can set those two against each other. And that’s how you get those human faces generated by artificial intelligence with these generative adversarial networks. But just in general, it’s this process. And as you say, if you’re lucky, it could be automated. But you’re never lucky, and it’s always a human being patiently explaining to a computer, –

Dr. Pamela Gay:         Mm-hmm.

Frasier:                        – “This is a crater. This is not a crater. This is a crater. Do you see the crater over here, computer? This is a crater.” So, humans are deeply involved in this process.

Dr. Pamela Gay:         And the reason that craters are something that it struggles with so incredibly much is because soils can be completely different colors from one place to another. You can have varying shadows. And the soil can be different textures. And computer vision has to be given a training set as diverse as the population it’s eventually going to be working on. Twitter ran into severe problems with their image algorithm where if you give Twitter too large of an image in one aspect ratio or another, it’s going to only show the person in your Twitter feed a section of that image. And they have a really good algorithm for figuring out where the face of a White person is.

Frasier:                        Right, yeah. So, it’s garbage in, garbage out?

Dr. Pamela Gay:         Yeah. And so, we have to be cognizant of good datasets. And so, to return to the lunar problem, if you go through and you do an excellent job marking every crater in an Apollo region, well it’s either gonna all be mare or all be regolith, which means it’s all gonna be dark soil or light soil. And that algorithm’s only gonna be able to work on that particular kind of the moon –

Frasier:                        Right.

Dr. Pamela Gay:         – during that particular lighting condition. So, getting a diverse enough training sample can be hard for some problems.

Frasier:                        Yeah.

Dr. Pamela Gay:         Other problems, it’s super easy.

Frasier:                        It’s funny, that concept of having a difficult time also changes even if you just change the parameters. If you take –

Dr. Pamela Gay:         Yes.

Frasier:                        – a picture of a dog, and you rotate it 90 degrees, so now it’s a dog on the side, the computer just goes like, “I don’t know what I’m looking at here.

Dr. Pamela Gay:         Right.

Frasier:                        What is this? Where is the dog? There’s no dogs in this picture.” And then, you rotate it back, and then like, “Oh, there’s the dog! I see it!” But those kinds of sort of image translations are being done now as a rudimentary part of this machine learning. So, they’re getting –

Dr. Pamela Gay:         Yes.

Frasier:                        – better and better. All of these issues that we’re talking about are getting better and better and better. And I think the reason we cued up this episode, this topic, is because I don’t know how it feels to you, but it feels to me like we are shifting from them being more of a pain than they are a help and us really wanting to have humans do the work, to a shifting into this hybrid model where they’re pretty good when you’ve got humans that know what they’re doing to train them, and now we’re getting some really amazing results out of these machine learning projects.
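
The rotated-dog problem mentioned above is commonly handled by adding rotated copies of each training image to the training set. A minimal sketch in plain Python follows, assuming images are small grids stored as lists of rows; the function names and the tiny example stamp are hypothetical, and a real project would use an image library instead.

```python
def rotate90(image):
    """Rotate a 2D grid (a list of rows) 90 degrees counterclockwise."""
    # Transposing and then reversing the row order gives a counterclockwise turn.
    return [list(row) for row in zip(*image)][::-1]


def augment_with_rotations(image):
    """Return copies of the image rotated by 0, 90, 180 and 270 degrees."""
    rotations = [[list(row) for row in image]]
    for _ in range(3):
        rotations.append(rotate90(rotations[-1]))
    return rotations


# A tiny 2x3 "image" whose pixel values make the rotation easy to follow.
stamp = [
    [0, 1, 2],
    [3, 4, 5],
]

augmented = augment_with_rotations(stamp)
for rotated in augmented:
    print(rotated)
```

Each rotated copy keeps the label of the original image, so the model sees the same dog, or the same crater, in four orientations.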

Dr. Pamela Gay:         And it’s that combination of they’re finally giving us really good results. And with LSST coming, it’s kind of like machine learning is our Obi-Wan Kenobi. It is our final hope.

Frasier:                        Yeah.

Dr. Pamela Gay:         And the diversity of problems that are being approached with it, we have not just the galaxies and the still-doesn’t-work-correctly craters, but we also have things like looking at light curves and finding exoplanets. We have figuring out comets and asteroids as transient phenomena, identifying supernovae. Space weather. We have this amazing suite of solar observatories coming online and online right now. And solar storms, as some of you in the past week have been able to see, generate amazing lightshows in our sky that can also do really bad things to our power grid and satellites in orbit.

Frasier:                        Right.

Dr. Pamela Gay:         And machine learning is helping us understand that space weather.

Frasier:                        Yeah, we had the auroras here last night. But they were hidden behind a cloud.

Dr. Pamela Gay:         Ugh.

Frasier:                        And so, we could see the green glow peeking through the clouds, and purple –

Dr. Pamela Gay:         That’s the worst.

Frasier:                        – stuff around it. Yeah, it was rough. So, let’s talk about this idea. So, you gave a great example. This idea of say, exoplanets, of –

Dr. Pamela Gay:         Mm-hmm.

Frasier:                        – recognizing exoplanets. And so, having a human being look at a light curve, having – we’ve always said that we can teach a 4-year-old how to recognize a crater, right?

Dr. Pamela Gay:         Yeah, yeah.

Frasier:                        But it’s really hard to get a 4-year-old to recognize the light curve of an exoplanet. So, how for example has machine learning been able to really make an impact on that kind of a research?

Dr. Pamela Gay:         The human eye is troubled by noise, and can both see false signals in noise and also miss actual signals in the noise. So, say you have data coming down from a star that has sunspots on it, other things that may be causing noise, and is a particular light, but it also has a planet that’s going around and around within that noise. Well, the computer will have the ability to see – there is always this exact same-looking dip over time that is repeating over and over at the exact same interval that while it may not be that different from the noise, it’s always there. And that’s –

Frasier:                        Mm-hmm.

Dr. Pamela Gay:         – a planet. It also just relieves all of the people who otherwise could be doing other science from having to look at data that isn’t perfectly clean. Classic algorithms were completely fine going, “The star is constant brightness. It dipped X percent, and then became –

Frasier:                        Yeah.

Dr. Pamela Gay:         – constant brightness.” Classic algorithms. I could write that code in two days. To deal with messier systems, to deal with multiple planets, to deal with noise, you needed humans, and they were still imperfect. And the algorithms are still imperfect. But between both of them, we’re gonna find everything.
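
The idea of a dip that is barely different from the noise but always comes back at the same interval can be illustrated with phase folding, a classic (not machine learning) technique swapped in here purely for illustration. The sketch below builds a synthetic light curve and checks which trial period stacks the dips on top of each other. Everything in it is an assumption made for the example: the period, depth, noise level, bin count, and trial periods are made up and are not taken from Kepler or from the deep learning method linked in the show notes.

```python
import random
import statistics


def fold_and_score(times, fluxes, period, n_bins=10):
    """Fold a light curve at a trial period and score the deepest phase bin.

    Returns the median of the per-bin mean fluxes minus the lowest bin mean.
    A repeating dip only piles up in one bin when the period is right.
    """
    bins = [[] for _ in range(n_bins)]
    for t, f in zip(times, fluxes):
        phase = (t % period) / period
        index = min(int(phase * n_bins), n_bins - 1)
        bins[index].append(f)
    bin_means = [statistics.mean(b) for b in bins if b]
    return statistics.median(bin_means) - min(bin_means)


# Synthetic data (all values assumed): a star of brightness 1.0 with noise,
# plus a dip of similar size to the noise that repeats every 50 time steps.
random.seed(1)
TRUE_PERIOD, DURATION, DEPTH, NOISE = 50, 5, 0.003, 0.002
times = list(range(2000))
fluxes = [
    1.0 + random.gauss(0.0, NOISE) - (DEPTH if t % TRUE_PERIOD < DURATION else 0.0)
    for t in times
]

# Try a handful of trial periods and keep the one with the deepest folded dip.
scores = {p: fold_and_score(times, fluxes, p) for p in (37, 43, 50, 61)}
best_period = max(scores, key=scores.get)
print("best trial period:", best_period)
```

A single dip here is about the same size as the noise, which is why it is hard to spot by eye, but averaging the 40 repeats at the correct period makes it stand out.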

Frasier:                        Now, you sort of introed into this video just talking about the enormous amounts of data that are coming online. And it’s really hard to overstate how significant this modern age of Astronomy is, these giant observatories connected with huge fiber optic cables to the internet, and will just be disgorging their data onto the internet by the exabyte.

Dr. Pamela Gay:         Yeah.

Frasier:                        Like the Vera Rubin telescope.

Dr. Pamela Gay:         Yeah. It’s Vera Rubin Observatory doing the LSST, which is the name of the survey. And so, I can still say LSST. It will –

Frasier:                        Mm-hmm.

Dr. Pamela Gay:         – always be LSST. So, LSST is going to return petabytes of data per day as it’s –

Frasier:                        Per day?

Dr. Pamela Gay:         Yeah, yeah. Like Cisco Systems is designing special systems for this scope.

Frasier:                        Yeah.

Dr. Pamela Gay:         And they’re not entirely sure of the order of magnitude, but between tens of thousands and millions of transient phenomena. These are things that flicker, flare, pulse, move or otherwise are not constant in the dataset. And that’s going to be comets, asteroids, which were one of the major motivations for building this scope, supernovae, BL Lac objects, which are galaxies that flicker in brightness, regular pulsating stars, cataclysmic variables, –

Frasier:                        Planet 9.

Dr. Pamela Gay:         Planet 9 is my deepest hope. I need that world to exist. I –

Frasier:                        Yeah, yeah.

Dr. Pamela Gay:         – need that world. And with the machine learning algorithm, we can feed the system, “Okay, here are 100,000 objects that we already know about that we have tortured people with the Sloan Digital Sky Survey, with the Dark Energy Survey, with –

Frasier:                        Right.

Dr. Pamela Gay:         – all of the precursor surveys. Here, computer, is what these known objects look like.”

Frasier:                        Yeah. In terms of spectroscopic information, in terms of visual information, in –

Dr. Pamela Gay:         Yeah.

Frasier:                        – terms of time domain, –

Dr. Pamela Gay:         Yeah.

Frasier:                        – how they move, how they change in brightness, all of that kind of information. Thanks, humans.”

Dr. Pamela Gay:         Thanks, humans. And then, one of the nice things about looking at the sky as opposed to looking at the surface of a planet is we’ve had for decades algorithms that go through and say, “Light, no light. Light, no light,” and can look at the distribution of the light. And all stars are point sources. And their light is smeared out in a way that is dictated by the optics, and if you’re using a planetary scope, by the atmosphere. All of the stars will have the exact same shape on your image. And anything that doesn’t have that shape, that’s going to be a galaxy, a nebula – occasionally quasars can get misclassified, but LSST is so big, I’m not really worried about that –

Frasier:                        Mm-hmm.

Dr. Pamela Gay:         – right now. And so, the software, the dumb software we’ve had for decades can go, “Light, no light,” and then go, “Star or not star,” and then the machine learning algorithms can look at all of the not stars and go, “Galaxy, planetary nebula, supernova remnant.” I’m waiting to see how they differentiate those two.

Frasier:                        Right. Asteroid, comet.

Dr. Pamela Gay:         And then, the things that it can’t classify within a given percentage of probability, those can get kicked out, and some of those will be completely new objects we’ve never seen before.

Frasier:                        Mm-hmm.

Dr. Pamela Gay:         And once we’ve seen 100 of them, then we train the machine learning algorithm more. LSST is going to have so much data that we are going to be able to train machine learning algorithms with a sufficiently large dataset that we are going to be classifying things we don’t even know about at this –

Frasier:                        Yeah.

Dr. Pamela Gay:         – moment while we’re recording this episode.

Frasier:                        It’s funny to think about that, that it’s gonna be recording just so much data about the sky that it will have a giant record of a class of object that we didn’t even know exists, and –

Dr. Pamela Gay:         Yes.

Frasier:                        – some astronomer will look through it and go, “Huh. What’s this?” And then, the algorithm will go, “Oh, you’re into those things, huh? We didn’t know what they are, but here’s 10,000 more.” The astronomer will be like, “Oh, thank you, computer! Thank you! This is my career now.” Because it’s –

Dr. Pamela Gay:         Yeah.

Frasier:                        – gonna be crazy. It’s just gonna be mental. And there’s no way they do this without artificial intelligence, without machine learning that is gonna be grinding through these images one after the other, and just identifying everything that’s going on inside of them. It would be hopeless.

Dr. Pamela Gay:         But the other side of this is the citizen scientists or grad students or undergrads or whatever human beings are training those algorithms, they’re just gonna be casually looking through objects we’ve never seen –

Frasier:                        Yeah.

Dr. Pamela Gay:         – before, and helping us figure out what they are as the wetware that goes to our software that allows our hardware to have its full potential.

Frasier:                        Oh yeah. So, what do you think is the limit? I mean, again, you run a citizen science organization getting citizens to help with astronomy. Where do you see the future going over the next couple of years as these computers continue making leaps and bounds, and the data that they have access to is getting more complete and more machine learning-friendly?

Dr. Pamela Gay:         I don’t see a limit, because over time, what’s going to happen is those things that are very short-lived in the sky that are fairly rare are going to take years to track down and figure out. We’re just starting to figure out what fast radio bursts are.

Frasier:                        Mm-hmm.

Dr. Pamela Gay:         We are just starting to figure out that there are these weird snakes that are visible to MeerKAT in the center of our galaxy. And it’s going to take… well, the entirety of the human future to see all of the different things our universe has to offer us in statistically significant numbers.

Frasier:                        I guess it’s the rarity, right? I mean, –

Dr. Pamela Gay:         Yeah.

Frasier:                        – we’ve been chasing Pokémon or something. Or opening up Magic cards. We’ve got all of the commons and the uncommons, and –

Dr. Pamela Gay:         And the mythic rares.

Frasier:                        – now we’re looking for the mythic rares. And they will be disgorged. But they’ll be these weird objects. And there’ll only be one every year. And astronomers –

Dr. Pamela Gay:         Yeah.

Frasier:                        – will figure out that, “Oh, that’s a situation where one star ate another star, and then they were consumed by a third star.”

Dr. Pamela Gay:         Yeah, yeah.

Frasier:                        And that’s how you get that really weird flash flash flash that comes together. But by analyzing those, we will make these fundamental understandings about how the universe works. “Oh, this might be why supermassive black holes got so big so early.”

Dr. Pamela Gay:         Mm-hmm.

Frasier:                        This might be how the first stars are able to form. This might be what the universe is really populated with in-between galaxies that the one-offs and the weirdos tell us about the science as a whole, which is fantastic.

Dr. Pamela Gay:         And the thing to keep in mind is 99 percent of the time that the software rejects something, it’s because you have a comet superimposed on top of a galaxy, and the software’s like, “I don’t know.” Or you have two galaxies superimposed on each other, or it’s gonna be something where it’s completely normal objects that are configured in a really weird way.

Frasier:                        Yeah.

Dr. Pamela Gay:         And human beings can look at that, and it’s like those optical illusion photos that it’s something normal, and your brain initially interprets it as weird. You can figure it out. But that small fraction of the time is the reason you keep doing this, ‘cause you might find… well, Hanny’s Voorwerp.

Frasier:                        So, don’t worry, humans. You still have a role in the field of citizen science Astronomy. But just don’t be surprised if your time is spent teaching a computer to do your job so that you can find even more interesting projects to teach even newer computers to do that job as well. I like it. I think –

Dr. Pamela Gay:         Yeah.

Frasier:                        – let’s let the computers do the digging through petabyte –

Dr. Pamela Gay:         The tedious part.

Frasier:                        – to find – yeah, the tedious work. Let’s do the fun part. Awesome. Well, thank you, Pamela!

Dr. Pamela Gay:         Thank you, Frasier! And now, I have to say thank you to so many of you out there! This show is made possible by you, and by the fact that you allow us to pay Rich, Nancy, Beth, Allie, all of the folks behind the scenes to herd us like the cats we are, and keep track of all of the craziness that we sometimes introduce into scheduling, and things like that.

This week, I wanna give a special thanks to Blixa the cat, Dean, Claudia Mastroianni, Connor, John Drake, Naila, Bart Flaherty, Brian Kilby, Corinne Dmitruk, Tim Gerrish, Arcticfox, Lew Zealand, Leigh Harborne, Mark Phillips, Kathleen Mattson, Bob the boodle cat, Chris Wheelwright, Jason Kardokus, Olivia Bryanne Zank, Ron Thorrsen, PAPA1062, Robert Hundl, Kim Barron, Vitaly, Paul Esposito, Jordan Turner, Arthur Latz-Hall, Frank Stuart, Ganesh Swaminathan, Bob Zatske, Ruben McCarthy, Geoff MacDonald, Iggy Hammick, Rebekkah, Gold, Kellianne and David Parker, Scott Briggs, Roland Warmerdam.

And if you too would like to be part of our amazing and vibrant Patreon community and potentially have me mispronounce your name, join us on patreon.com/astronomycast. And you can actually put in a pronunciation guide. All of you starting to do that, –

Frasier:                        Don’t!

Dr. Pamela Gay:         – I see you, I love you!

Frasier:                        Don’t do it! Don’t do it! I love it! All right.

Dr. Pamela Gay:         So, thank you!

Frasier:                        Thank you, everyone! We’ll see you next week!

Dr. Pamela Gay:         There are literally people putting in pronunciation guides, and they are my favorite humans. Buh-bye! AstronomyCast is a joint product of Universe Today and the Planetary Science Institute. AstronomyCast is released under a Creative Commons Attribution License. So, love it, share it, and remix it. But please, credit it to our hosts, Frasier Cain and Dr. Pamela Gay. You can get more information on today’s show topic on our website, astronomycast.com.

This episode was brought to you thanks to our generous patrons on Patreon. If you want to help keep this show going, please consider joining our community at patreon.com/astronomycast. Not only do you help us pay our producers a fair wage, you will also get special access to content right in your inbox, and invites to online events. We are so grateful to all of you who have joined our Patreon community already. Anyways, keep looking up! This has been AstronomyCast!
