"Uptime 15,364 days - The Computers of Voyager" by Aaron Cummings

Feb 27, 2020
Good afternoon, my name is Aaron Cummings. I'm absolutely thrilled to be here at Strange Loop to give a little talk about Voyager. Now, Voyager is a pair of twin space probes that launched in 1977. They are still out there returning data from the very edges of our solar system, the beginnings of interstellar space. You can go to the Jet Propulsion Laboratory's website, JPL's website, and actually see where Voyager is at this time. The Voyager spacecraft completed their initial goal, which was to visit Jupiter and Saturn. The mission was then extended for Voyager 2 to visit Uranus and Neptune, and now, 42 years after launch, they are still going. They still operate, we still receive telemetry, they still respond to commands. This is the longest-running mission that NASA has had.
It has been running for so long that NASA used to look like this. This was the logo; it was new when Voyager launched. So in this talk I'll talk about the origins of the Voyager mission, what things allowed these missions to be successful, what has contributed to the longevity of the spacecraft, and what comes next. This is really about engineering, systems and trade-offs. I'm not going to talk too much about the science of Voyager. As to why talk about Voyager: well, in the late 70s and early 80s, you would see pictures in the newspapers of what Voyager was showing us, and this is just incredible. I mean, this is what really captures the imagination.
I'm just a guy; my day job is programming and electronics. I have no connection to NASA or JPL or the Voyager program, so I don't speak for them. Any opinions that may appear in this talk are my own and not those of NASA or JPL or my employer. So, in early 1964, a JPL engineer, Gary Flandro, proposed a mission, a Grand Tour of the outer planets. This was made possible by an alignment of the four outer planets, Jupiter, Saturn, Uranus and Neptune. Not that they were in a straight line, but they were in a good position so that you could use a gravitational assist maneuver around one to get to the next, and this allowed you to reach Neptune in 12 years instead of, let's say, 30 if you had tried to go there in one go. So I think the first lesson is something about opportunity: if the planets literally align, you might want to take advantage of that situation. So let's build a deep space probe. We are in space because we want to learn about it, so we need some instruments. We need some way to power those instruments. We need to get the data back from the probe, so we need a communication system. We need navigation, propulsion, and stabilization to keep the thing pointed in the right direction. It needs to run for a decade or more without any maintenance, and we need to be able to control this thing in non-real time because of the speed of light. Space probes are very far away. Light seems like it's very fast, but I'm now going to invoke Admiral Grace Hopper for this.
One way to visualize the speed of light is to take something like a piece of wire. This cord is about a foot long, about 30 centimeters; this is a nanosecond. And to get from here to Voyager is many nanoseconds: on the order of 40 hours for a round trip to Voyager 1 and back these days. So there was a program at JPL to explore what it would take to have a deep space probe, and they called it the Thermoelectric Outer Planets Spacecraft, also known as TOPS. It had several novel technologies. One of them was the Self-Test And Repair computer, STAR. It had triple-redundant components, and there was a core that had fivefold redundancy and voting logic.
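The voting idea can be illustrated with a small sketch. This is only an illustration of majority voting across redundant units, not the STAR computer's actual logic; the function names `majority_vote` and `dissenters` are made up for the example.

```python
from collections import Counter

def majority_vote(readings):
    """Return the value reported by a strict majority of units, or None."""
    if not readings:
        return None
    value, count = Counter(readings).most_common(1)[0]
    return value if count > len(readings) / 2 else None

def dissenters(readings):
    """Return the indexes of units that disagree with the majority."""
    winner = majority_vote(readings)
    if winner is None:
        return []
    return [i for i, r in enumerate(readings) if r != winner]

# Three redundant units, one of which glitches.
print(majority_vote([42, 42, 17]))  # 42
print(dissenters([42, 42, 17]))     # [2]
```

With three units, one faulty reading is outvoted; with five, two can be. If no value reaches a strict majority, the sketch returns None rather than guessing.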
You know, best two out of three wins in case there is a glitch, and I guess if one of the computers lied it would get voted off the spaceship. You also need some way to generate power. You're so far from the Sun that you can't build a solar panel and take it into space that's big enough to generate the hundreds of watts you need. Chemical solutions, you know, fuel cells, might be fine for a short mission, but not for something that lasts a decade. So you contact the US Department of Energy and get something with a little more energy: plutonium. It's glowing because it's hot. Plutonium-238 specifically is an alpha emitter with a half-life of approximately 88 years. It is an excellent heat source: you get about half a watt per gram, continuously and spontaneously. So you take this plutonium and use it to create the MHW, the multi-hundred-watt radioisotope thermoelectric generator, which looks like this. I promise this is not a Doctor Who prop; this is a picture of the actual RTG unit that was used on Voyager. Inside you have plutonium spheres that are surrounded by a set of collectors. These are thermocouples, solid-state devices that take a temperature differential across a junction and convert it directly into DC electricity. Each Voyager had three of these units, enough to provide about 450 watts at launch, which ends up putting 13 and a half kilograms of plutonium on top of a rocket. What could go wrong?
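As a rough check on these figures, here is a minimal sketch using only the numbers quoted in the talk (88-year half-life, half a watt per gram, 13.5 kg, about 450 W at launch). It assumes electrical output simply scales with the remaining heat, which overstates the real output because it ignores thermocouple degradation.

```python
HALF_LIFE_YEARS = 88.0      # approximate figure quoted in the talk
WATTS_PER_GRAM = 0.5        # approximate thermal output of Pu-238
FUEL_GRAMS = 13_500         # 13.5 kg across the three RTGs
LAUNCH_POWER_W = 450.0      # approximate electrical power at launch

def decay_fraction(years, half_life=HALF_LIFE_YEARS):
    """Fraction of the original heat output remaining after `years`."""
    return 0.5 ** (years / half_life)

def heat_limited_power(years):
    """Upper bound on electrical power, assuming it tracks decay heat only."""
    return LAUNCH_POWER_W * decay_fraction(years)

thermal_at_launch = WATTS_PER_GRAM * FUEL_GRAMS      # 6750 W of heat
efficiency = LAUNCH_POWER_W / thermal_at_launch      # roughly 6.7 percent

print(f"thermal power at launch: {thermal_at_launch:.0f} W")
print(f"conversion efficiency:   {efficiency:.1%}")
print(f"fraction left at 42 yr:  {decay_fraction(42):.2f}")   # about 0.72
print(f"power bound at 42 yr:    {heat_limited_power(42):.0f} W")
```

The low conversion efficiency is why so much plutonium is needed for a few hundred watts of electricity.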
Well, there was an intensive safety program, because nobody wants this thing to land on their house. The RTG needs to survive any kind of launch accident. It also has a heat shield to survive re-entry into the atmosphere, and I just hope all of that actually works in case something goes wrong with one of these one day. The Grand Tour did manage to get presidential endorsement. President Nixon in March 1970 issued a statement on space missions and said that we should press ahead with bold exploration of the outer planets: over the next decade we will launch Grand Tour missions to study the mysterious outer planets of the solar system, Jupiter, Saturn, Uranus, Neptune and Pluto, with preparations for this program to begin in 1972.
But in December 1971, due to the cost of the Vietnam War and some other things that were happening in NASA's budget, including the space shuttle program that was just beginning, the Grand Tour was cut from NASA's budget. Which is the next lesson: CEO backing is no guarantee; that backing is sometimes not enough. And this could be the end of the story, except then I would have too much time for questions. So, with an apology to Paul Harvey, I will now tell you the rest of the story. The Grand Tour was an ambitious and extremely expensive program, so the engineers went back and within 10 days proposed a new program of reduced scope: a simpler space probe that builds on existing space-tested hardware and extends it to visit just two planets, Jupiter and Saturn. That was called the Mariner Jupiter-Saturn 1977 probe. I like to think of this as the MVP, the Minimum Viable Probe. The key to this is that it's cheap enough to get through Congress. So in May 1972 the funding is approved, and eventually the mission is renamed Voyager. It's stated that the Voyager mission will last five years. That's the operational life: the design life of the Voyager spacecraft was five years, so it should have expired in 1982. Here we are in 2019.
It's still working. How come? Ron Draper was a systems engineer at JPL, and he had this concept called the X Factor. It was a pretty broad concept that is effectively: don't make any engineering decision that could limit the useful life of the spacecraft; don't make a decision that will limit the freedom of further action. Or, as Donald Knuth would say, premature optimization is the root of all evil. So, the three R's of what makes a spacecraft last so long. First you have reliability. You need reliable components, tight design and manufacturing of the units, qualification of any component source that comes in, and a screening program. But sometimes components still fail, so you need redundancy in these components: spares available on the system to work around component failures. The Voyager program had redundant computers, redundant navigation, redundant communication, redundant thrusters. It even had redundant spacecraft: Voyager 1 and Voyager 2, on two separate trajectories with different risk profiles. The third R is reconfigurability. You want to be able to take advantage of redundancy if you need it, and you can't anticipate everything in advance, of course, but once the thing is launched you can't go out and fix it. So you need some way to remotely reconfigure your spacecraft, maybe to do things you hadn't anticipated beforehand. Now it's 1973. We're in the middle of designing the Voyager probes. We hadn't been to Jupiter yet, and there was a surprise waiting for us there that was discovered by Pioneer 10.
This is a simpler, earlier probe that passed by Jupiter in December 1973. What Pioneer showed us was that there were incredibly dense radiation belts, enough ionizing radiation around Jupiter to endanger spacecraft. The original Voyager design didn't have any radiation hardening, so a program to add shielding was started. A very careful selection of the components was made. The qualification of those components was reviewed to ensure that they were relatively immune to radiation, and intensive worst-case analysis of the circuits was also performed: you sacrifice performance to have additional design margin in case of any radiation damage.
The project remained on schedule, but it cost $150 million to do this radiation hardening. Fortunately, it was funded and we went ahead and finished. So here is the Voyager spacecraft, all together. There are 11 scientific instruments here, plus its power, navigation, communications and propulsion systems, and there are three computer systems that tie all this together. The systems have these cryptic acronyms, CCS, AACS and FDS, and I'll go through these one at a time. First, the Computer Command Subsystem. This system is always on, and this is actually the system that's in charge. It's linked directly to the power system, and it controls the sequencing of all the main functions of the spacecraft. Those main functions are things like the power system; thermal control, keeping the temperature inside not too hot and not too cold; the attitude and navigation commands, which are passed through the CCS; turning any instrument on or off; and the data system configuration for sending things back through the downlink, which is also managed through the CCS.
This is an interrupt-driven machine, driven by timer events and by events coming from other subsystems, and it will also respond to any commands coming over the uplink from mission control. The memory is divided into two parts. About half is fixed procedures that do not normally change: these are housekeeping items as well as any kind of built-in failsafe in case communications are lost and it can no longer receive commands. Now, this is all RAM, so everything is reconfigurable; there is nothing that is hardwired into the system. As far as uplinking procedures goes, during cruise that is usually about once a month. During a planetary encounter there is a lot more going on, and you might see this happen every 18, 24, perhaps 30 hours, when a new set of procedures would be loaded onto Voyager.
Voyager also has autonomous fault detection. It has fixed procedures in place to detect and respond to any faults that may occur: if something goes wrong, it starts swapping in those redundant components until things start working right again. In fact, some people have compared the CCS on Voyager to HAL. Fortunately for us, the CCS is not paranoid, is very good at listening to instructions, and is under no illusions that all its circuits are working perfectly. The hardware design of the CCS on Voyager is almost identical to an earlier design, already tested in space, that was used on the Viking orbiters, which were launched to Mars in 1975. The computer is a dual-redundant custom processor designed at the Jet Propulsion Laboratory. For redundancy there are two logic units, two output units, two memories, two power supplies; there are two of everything, and all of these systems can be cross-linked in the event of a failure. Architecturally, these machines have 18-bit words. Those 18-bit words are divided into a six-bit opcode and a 12-bit address; it is a direct-addressing machine. All data is kept as 18-bit two's-complement integers. There are 13 registers, including the accumulator, the program counter, a status register and so on; these are mainly there for interrupts, and there are 32 interrupts available to receive messages from the command system or from other subsystems in the spacecraft. It is built with bipolar TTL logic. It has an instruction cycle of 88 microseconds, so it runs approximately 11,000 instructions per second. To reduce the size of the bus and the number of components on board, memory access requires four or five cycles on this machine to get the data out of memory, because it only moves four bits at a time. The memory itself is made of what they call plated-wire memory.
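To make the word format concrete, here is a minimal sketch of splitting an 18-bit word into a six-bit opcode and a 12-bit address, and of reading an 18-bit word as a two's-complement integer. The talk gives only the field widths; placing the opcode in the high bits is an assumption made for this example, and the function names are hypothetical.

```python
WORD_BITS = 18
OPCODE_BITS = 6
ADDRESS_BITS = 12

def decode(word):
    """Split an 18-bit instruction word into (opcode, address).

    Assumes the opcode sits in the top 6 bits; the real field order
    is not stated in the talk.
    """
    if not 0 <= word < (1 << WORD_BITS):
        raise ValueError("word must fit in 18 bits")
    opcode = word >> ADDRESS_BITS
    address = word & ((1 << ADDRESS_BITS) - 1)
    return opcode, address

def to_signed(word):
    """Interpret an 18-bit word as a two's-complement integer."""
    if word & (1 << (WORD_BITS - 1)):
        return word - (1 << WORD_BITS)
    return word

print(decode(0b000011_000000000101))  # (3, 5)
print(to_signed(0x3FFFF))             # -1
print(1 << OPCODE_BITS, 1 << ADDRESS_BITS)  # 64 opcodes, 4096 addressable words
```

A 12-bit direct address reaches at most 4096 words, and six opcode bits allow at most 64 instructions.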
In principle, it is very similar to magnetic core memory, except there are no cores. It is actually a magnetic plating on the wire, and the bits are stored there as magnetic fields. It is non-volatile, reads are non-destructive, and it is largely immune to radiation. It is not fast, but that's not really a design requirement for this. As for programming Voyager, this is a pretty crude environment. There is no operating system. I think it was all done in assembly language. There are a couple of references that talk about Fortran; I haven't been able to convince myself that the CCS was actually programmed in Fortran, but maybe it was. The second system is the Attitude and Articulation Control Subsystem. This is a separate system that relies heavily on the same hardware as the CCS. This is the system that keeps the spacecraft pointing in the right direction and operates the instruments on the scan platform, this arm that hangs off the side and holds the imaging instruments. The CCS is really the what to do; the AACS handles the how to do it: how to operate the servos, how fast to move things, how to run the thrusters, and so on. The main job of the AACS is to take that giant dish, a three-and-a-half, four-meter dish on the front, and keep it pointed back at Earth so we can keep communicating with this thing. The only time it doesn't is during a mid-course correction, or maybe when they need to do something for imaging where the dish is in the way, and then it has to be able to recover from that so we can reestablish communications.
Now, the AACS is also two redundant computers; only one of them will be on at a time. As I said, the hardware was the same as the CCS, except there was one more component: a piece of logic inserted between the processor and the memory to give it indexed addressing modes, which gave it the ability to reduce some of the duplicate code that you might otherwise have on this processor. This processor has a number of interesting peripherals to help it stay pointed and take the actions that it needs to take. It's actually quite different from the systems on earlier probes, which did this completely with analog circuits. On Voyager we had a digital computer, which was a big advantage because now you can change things after launch. If there is some kind of problem where you need to improve the system, the capability is there to do it, including working around failures: if any of these peripherals fail, you want to be able to reprogram the machine so you can deal with that. The third main system is the Flight Data Subsystem. This is what actually manages the collection, formatting, transmission and storage of any data that is collected. The data in this case is really of three types: there is data that comes from the instruments themselves; there is imaging data, which gives us those great pictures; and then there is the so-called engineering data, which refers to the health of the spacecraft itself.
Now, this machine was not like the other two. It was a completely new design. They needed something that was fast enough to handle the volumes of data they were looking to process. This was based on CMOS, complementary metal-oxide semiconductor. This was the first time a CMOS computer went to space. This is a 16-bit-word, byte-serial machine with a cycle time of two and a half microseconds. It takes four cycles to fetch and then one more cycle to actually execute the instruction, so you get a new instruction every 12 and a half microseconds. That's 80,000 instructions per second that can be executed here.
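The instruction rates quoted in the talk follow directly from the cycle times. A small sketch of that arithmetic, using the 88-microsecond CCS instruction cycle mentioned earlier and the FDS figures above (the helper name is made up for the example):

```python
def instructions_per_second(cycle_time_us, cycles_per_instruction=1):
    """Instruction rate for a given cycle time in microseconds."""
    return 1_000_000 / (cycle_time_us * cycles_per_instruction)

# CCS: one instruction per 88-microsecond instruction cycle.
ccs_rate = instructions_per_second(88)

# FDS: 2.5-microsecond cycles, four to fetch plus one to execute.
fds_rate = instructions_per_second(2.5, cycles_per_instruction=4 + 1)

print(f"CCS: about {ccs_rate:,.0f} instructions per second")
print(f"FDS: {fds_rate:,.0f} instructions per second")
print(f"FDS is roughly {fds_rate / ccs_rate:.0f}x faster")
```

So the new CMOS design is roughly seven times faster than the CCS, which is the headroom the data handling needed.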
The memory is not plated-wire memory; it is CMOS memory in this system. It's faster, but it's volatile and very susceptible to radiation, so as part of the radiation hardening process, just a ton of shielding, not literally a ton, but a huge amount of shielding, was put around this system. Something I didn't know until I researched this: when Voyager is running and taking its images and its measurements during an encounter, it's returning most of that data over the downlink in real time, because the amount of storage on Voyager is so limited that you really want to send everything back as things are being measured. This means no retries, which means you can't really tolerate a lot of errors. But you're talking about something that's very far away. It's a noisy channel, and you need some way of protecting yourself against errors that arise due to noise or whatever. So there is a form of forward error correction, called Golay coding. This is basically a type of parity: every 12 bits of data are joined together with 12 bits of parity, which gives you the ability in each of those blocks to recover from three flipped bits. That brought the error rate down to an acceptable level. There is a penalty for that, which is doubling the amount of data you need to send, but statistically there is a trade-off: you can run a faster data rate because the total error rate is going to be low enough. Now, sometimes you need to record things, and they get recorded on a tape. Not that tape; it was an eight-track tape. Not that eight-track tape; it was this eight-track tape. So Voyager has this tape drive. If for some reason it can't transmit the data live, it dumps that data onto the tape. It could be because of some maneuver they have to make for imaging, or maybe you're behind a moon or a planet. This tape is about a thousand feet long and holds 500 megabits, which is enough for about a hundred full-resolution images from the Voyager imaging system. Then the tape is rewound and played back when we are back in contact with the ground stations.
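The Golay scheme described above, 12 data bits plus 12 parity bits with up to three flipped bits corrected per block, can be sketched in a few lines. This is a textbook extended Golay (24,12) code built from a standard generator polynomial with an overall parity bit added; the bit layout Voyager actually used is not given in the talk, and the brute-force nearest-codeword decoder is chosen for clarity rather than efficiency.

```python
GENERATOR = 0xC75  # x^11 + x^10 + x^6 + x^5 + x^4 + x^2 + 1

def weight(x):
    """Number of 1 bits in x."""
    return bin(x).count("1")

def encode(data12):
    """Encode 12 data bits into a 24-bit extended Golay codeword."""
    if not 0 <= data12 < 4096:
        raise ValueError("data must fit in 12 bits")
    remainder = data12 << 11
    for shift in range(11, -1, -1):        # polynomial division over GF(2)
        if remainder & (1 << (shift + 11)):
            remainder ^= GENERATOR << shift
    word23 = (data12 << 11) | remainder    # 12 data bits + 11 check bits
    return (word23 << 1) | (weight(word23) & 1)  # add overall parity bit

CODEBOOK = [encode(d) for d in range(4096)]

def decode(received24):
    """Return the data word whose codeword is nearest in Hamming distance."""
    return min(range(4096), key=lambda d: weight(CODEBOOK[d] ^ received24))

sent = encode(0xABC)
corrupted = sent ^ (1 << 0) ^ (1 << 7) ^ (1 << 23)  # flip three bits
print(hex(decode(corrupted)))  # 0xabc
```

Any two codewords differ in at least eight bit positions, so a block with up to three flipped bits is still closer to the codeword that was sent than to any other.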
It turns out there is a Voyager Zero: there were actually three spacecraft built, not just two. The first one delivered was an engineering prototype, and if you want to see it, it is now in the National Air and Space Museum. This engineering prototype was especially useful when you're trying to troubleshoot across three systems. I mean, there aren't many of these; you can't just go to Stack Overflow and ask what's going on with it. So you can work out the problem by seeing how it behaves on the other system, and this actually happened several times in the couple of months leading up to launch, after the flight spacecraft had been shipped. So we're finally at launch; the design's all done. Voyager 2 is launched first, on August 20, 1977. A couple of weeks later is the launch of Voyager 1, on September 5, 1977.
Voyager 1 is on a faster trajectory, and therefore it will reach Jupiter first. Now, after the launch, there were some problems at the Jet Propulsion Laboratory in the management of the mission. There was an incident, and I think it was because the engineers were being assigned to other tasks and were being distracted by the ongoing work on the Galileo program, which was a follow-on to Voyager. There was a command that was supposed to be sent to Voyager 2, because it needed to be talked to every week, and someone forgot to send that command. So the autonomous fault protection on the spacecraft said, "Well, I haven't heard from Earth in a while. I've timed out at 168 hours. Something must have gone wrong; run the command loss routine and switch to the other receiver." So it switches to the secondary receiver and turns it on. There is a shorted capacitor in the phase-locked loop of this receiver. The receiver still works, but not very well; it no longer tracks frequencies very well. So a command is then sent to Voyager to return to the primary receiver.
That receiver runs for 17 minutes, experiences a short circuit, and blows the fuses on its power supply. Now you need to wait another 168 hours for the spacecraft to hear you again, after the command loss procedure runs a second time and switches back to the secondary receiver. We are still running on this receiver, which doesn't hear very well and is very sensitive to the temperature in the spacecraft, Doppler shift and so on. So the Deep Space Network people, with the giant dishes that are used to communicate with these probes, have a procedure where they scan a series of frequencies to try to figure out where Voyager 2 is listening best. So I think the lesson here is: don't neglect your project after launch.
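The behavior in this story is a watchdog: if no command arrives within 168 hours, swap receivers and start counting again. Here is a minimal sketch of that idea. The class and method names are hypothetical, and the real command loss routine on the spacecraft does considerably more than swap a receiver.

```python
COMMAND_LOSS_TIMEOUT_HOURS = 168  # one week, as described in the talk

class CommandLossTimer:
    """Toy model of a command-loss watchdog with redundant receivers."""

    def __init__(self, receivers=("primary", "secondary")):
        self.receivers = list(receivers)
        self.active = 0                 # index of the receiver in use
        self.hours_since_command = 0

    @property
    def active_receiver(self):
        return self.receivers[self.active]

    def command_received(self):
        """Any valid command from the ground resets the timer."""
        self.hours_since_command = 0

    def tick(self, hours=1):
        """Advance time; swap receivers on timeout. Returns True on a swap."""
        self.hours_since_command += hours
        if self.hours_since_command >= COMMAND_LOSS_TIMEOUT_HOURS:
            self.active = (self.active + 1) % len(self.receivers)
            self.hours_since_command = 0
            return True
        return False

timer = CommandLossTimer()
timer.tick(167)
print(timer.active_receiver)  # primary
timer.tick(1)                 # nobody sent the weekly command
print(timer.active_receiver)  # secondary
```

The sketch also shows why the second failure was so costly: once the primary receiver was dead, the ground had to wait out a full timeout before the spacecraft swapped back to the receiver that could still hear.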
I think this was kind of a wake-up call to the JPL management, which got the project back on track, and by the time we got to Jupiter, things were actually in much better shape. So Jupiter, Saturn, and the moon Titan for Voyager 1: all those encounters were great. Same for Voyager 2. After the encounter with Titan, Voyager 1 was sent north, out of the ecliptic; it will never encounter another planet in our solar system. Voyager 2, though, based on the success of the program so far, managed to get more funding and continued on to Uranus and Neptune. But there is a looming problem as you get closer to Uranus: the further out you get, the weaker the signal becomes, and the weaker the signal, the less bandwidth you have. You only have one chance to capture this data, because you don't have much storage on the spacecraft. So if you want to get all the images you want and all the other science data, you need some way to increase the effective bandwidth. What was done there was to implement a data compression algorithm. This was not part of the original design at all. It is used to compress the images. Normally, only one of the two Flight Data Subsystem computers would be running; now we are using both in parallel, one to process the general science data and the other processing the image data and compressing it. You get about a two-to-one compression ratio. They also incorporated a Reed-Solomon encoder, which is a different type of forward error correction. Reed-Solomon has less overhead compared to Golay, about 15 percent overhead, but again, that was able to provide reliable enough communications to get enough images and enough data back from the spacecraft. Now, this is getting further and further away from us all the time.
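The overhead comparison can be put in numbers. The Golay figures come from the talk (12 parity bits for every 12 data bits). The talk gives only "about 15 percent" for Reed-Solomon; the (255, 223) code size used below is an assumption, chosen because it is a commonly cited deep-space configuration and matches that figure.

```python
def overhead_percent(data_symbols, total_symbols):
    """Parity overhead as a percentage of the data carried."""
    parity = total_symbols - data_symbols
    return 100 * parity / data_symbols

golay = overhead_percent(12, 24)          # 12 data bits in a 24-bit block
reed_solomon = overhead_percent(223, 255) # assumed RS(255, 223) code

print(f"Golay overhead:        {golay:.0f}%")
print(f"Reed-Solomon overhead: {reed_solomon:.1f}%")

# Share of the downlink left for actual data under each code.
print(f"Golay payload share:        {12 / 24:.0%}")
print(f"Reed-Solomon payload share: {223 / 255:.0%}")
```

Combined with the roughly two-to-one image compression, the lower overhead is what let more pictures fit through a weakening link.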
We're finding that Voyager is getting harder to communicate with, but we can adjust the data rates and keep in touch. The hydrazine fuel used for the thrusters is sufficient; that will probably last a couple more decades. What will really be the limiting factor on this probe is power. The radioisotope thermoelectric generator is suffering from two effects. One of them is the half-life of plutonium: right now about half of the first half-life of that plutonium has passed. The other thing that happens is that those thermocouples eventually degrade, and you get less power out of them. So what we've been able to do to keep everything working is turn off instruments and turn heaters on and off to try to manage the amount of power we have on the spacecraft. It's estimated that we'll only be able to operate this for probably the next five years or so, at which point the spacecraft will go silent. But they have already reached solar escape velocity, so they will not return; they will continue outward into the beyond, and maybe they will land somewhere in the galaxy. And on each Voyager is this golden record, a recording of images, sounds and greetings from Earth. Who knows, it could be found one of these days, and someone will know how to decipher it and receive a little message from us here on Earth. That's my talk. I have a couple of minutes for questions, so go ahead and shoot. Thank you.
Go ahead. Okay, the camera specs. This was a vidicon-tube-based camera, a vacuum-tube camera based on something that would have been in normal television cameras in the 60s or 70s. This was before digital sensors were really available. It's an 800 by 800 pixel square frame, so you get a 0.64-megapixel image, with eight-bit gray levels. There was no native color capability in the tube itself; instead, there was a filter wheel in front of the camera that rotated.
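The camera numbers line up with the tape recorder capacity mentioned earlier in the talk. A quick sketch of that arithmetic, assuming 8 bits per pixel and no per-image framing overhead (both are assumptions for this example):

```python
TAPE_CAPACITY_BITS = 500_000_000  # 500 megabits, as quoted in the talk

def image_bits(width=800, height=800, bits_per_pixel=8):
    """Raw size of one uncompressed frame in bits."""
    return width * height * bits_per_pixel

pixels = 800 * 800
images_on_tape = TAPE_CAPACITY_BITS // image_bits()

print(f"pixels per frame: {pixels:,}")          # 640,000, i.e. 0.64 megapixels
print(f"bits per frame:   {image_bits():,}")    # 5,120,000
print(f"frames per tape:  {images_on_tape}")    # 97
```

That works out to just under a hundred frames, consistent with the "about a hundred full-resolution images" figure.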
I think they had eight different filters: colors, plus some things like methane to try to make some images there. The next question was about how much of the computer redundancy has been used. I think, fortunately, not much. I'm not aware of a computer system having completely failed out there. There have been some issues with single-bit failures in some of the CMOS memories in the Flight Data Subsystem, and I know they've been programming something to try to work around those single-bit failures, but as far as a full wipeout, I'm not aware of one. Okay, the question here was about the team currently working on Voyager.
I saw a number, and this number is actually a couple of years old: I think there were around 14 people working on Voyager. I'm not sure if that includes support from the Deep Space Network that's running all the communications, but it's a pretty small team these days. Back there. Okay, that question was: are we expecting communications to fail or hardware to fail? As far as communications are concerned, the Flight Data Subsystem is actually configurable to run at lower and lower data rates, so I think we can at least keep getting things back. There's really no more high-speed data coming from Voyager, because the imaging platform was shut down 30 years ago. I think the only unit everyone is watching and trying to manage is the radioisotope thermoelectric generator and the power systems. Here. I have that on the first slide. Actually, can I go back there? You know, I'm not going to look for it. I think it's around 40 hours. Right there. I couldn't get much information on that. I know they have prototype systems in the lab, basically a copy of all these systems, but as far as a software engineering approach goes, I couldn't find much that was written on that, unfortunately. Right there. Okay, so the original mission, and of course these are about 1970 dollars, was between 750 and 900 million dollars. The reduced mission as originally funded was about 250 million. As far as affecting the hardware, it meant that things like the flagship computer, which the engineers thought was a really elegant solution, ended up being thrown overboard, and they ended up using other hardware they already had.
I think the final cost of the program eventually grew because the program had expanded, and it has probably cost around $900 million by the time we get to now. There was a question about the extension, you know, the Grand Tour. I mean, Uranus and Neptune: Voyager 2 went there, and we haven't been back since. I think the other really deep space mission was a couple of years ago, when New Horizons visited Pluto, and that was really fantastic to see because it was kind of like a rebirth of getting something you could never otherwise get. I hope that answers your question well enough. Thanks. Right there. I haven't put it together yet, but I'll try to post it on my GitHub.
I have a lot of things I've printed, like this; I just need to post all the URLs. I have a couple more. Here. Okay, this was a question about the physical failure mode of the thermocouples. I didn't try to unwind the semiconductor physics behind that, unfortunately, so I don't understand it yet. Okay, right here. I think it's every two months. This is the frequency of maintenance updates for Voyager. There is not much maintenance to do because the measurements are continuous and they're pretty static; it's really whatever management they want to do of the power system. But I think every two or three months they're doing an uplink. There was one over there. Ah, okay: are there alternatives to an RTG?
Something that will last a couple of decades and provide this amount of power for something that is so far from the Sun? I don't think there is one. I mean, even the more recent missions, Cassini and New Horizons, were also using RTGs, so unfortunately I don't think there's a better alternative yet. Here. The question was about time-sharing on the Deep Space Network. I know they have a schedule. The team working on Voyager, I imagine part of their job is to manage the schedule of when data will be transmitted, so I'm sure those resources are managed, but I know they can't just sit there and listen to it all the time. One quick one, right here. Is there any security on the uplink, or can anyone with a big antenna talk to it? Yeah, security. I couldn't find any information on the security, meaning either there is none or it is classified; I'm not sure which. I'm out of time. This has been great. Thank you very much.
