
Raymond Hettinger, Keynote on Concurrency, PyBay 2017

Jun 09, 2021
Hello, my name is Raymond, nice to meet you. I have a mission in life to train thousands and thousands of Python programmers. I've done it for probably about forty-eight weeks a year for the last six years, and I have trained a huge number of programmers. If you want to contact me for training or anything else, here is my email address, cleverly obfuscated, and my training company is Mutable Minds. For those of you who have access to Safari Books, I have a number of videos online that are also available for free, which is a great price. It is approximately twelve hours of training.
The only way my wife knows if I've done anything good here is if everyone tweets while I'm speaking. So go ahead and tweet her and tell her that Raymond H did something right. My talk is about concurrency, and I think it's an interesting topic; these topics are becoming more important over time. I have a lot of advice for you on how to use concurrency and what your options are, and I just want to share that with you. So that's my plan. By the way, I'll post all these slides right after this.
You'll have a link and you can have all of this, because it has a lot of notes. It's a pretty detailed and pretty technical keynote. Essentially this first part here is about our goal: why on earth would you want concurrency? And no keynote is complete without quoting Alex Martelli, so there is something I learned from Alex, who is here from time to time. We'll talk about the hated global interpreter lock, which I think is somewhat irrelevant and not something to be hated. Then we will have a little bit of a battle between threads versus processes, and a little battle between threads versus async, and the goal is that by the end of this talk you have a pretty good idea of when to use threads, when to use processes,
when to use async, and the advantages and disadvantages of each. Does that sound like a worthwhile goal? Very achievable. And then if you enjoy it, if you respond very well, I have examples and I can run code for you, including the incredibly dangerous live examples, where I have a bunch of code sets and a bunch of scripts, and nothing can go wrong with a live demo. Wish me luck. So I'll switch out of slide mode real quick. What we are going to do is look at some examples of concurrency using threads, multiple processes, and async. The idea is to familiarize you with each of them and show the rules and best practices for each, which raises the question of why concurrency at all. By the way, at the end we will have a very short question-and-answer session; what I would like to do is answer all your questions, and we will spend about a minute on it.
How is that possible? Concurrently: I'll have everyone ask their questions at the same time, and then I'll try to answer them all at the same time. How well does that work? We're going to saturate your CPU; okay, not your CPU, but the server itself. Those are also the limits of what concurrency can do for you. In the end, you have a certain number of clock cycles to go around, and you can spend all those clock cycles servicing requests; when the requests received exceed the number of clock cycles you have, concurrency can't help you anymore. It doesn't provide more processing power.
Concurrency is about taking advantage of the computing power you have. So why do concurrency? First, it improves perceived responsiveness. If two people want to ask a question and line up nicely behind the microphone, one person asks a question and then the other person asks a question; the second person feels like they have to wait for the first and gets no response. If you can take both questions at the same time, you get perceived responsiveness. Second, to improve speed: mainly, we get additional speed when using multiple processes. We're leveraging multiple cores, and that way we're actually throwing more clock cycles at the problem, and there are certain categories of problems that benefit from this. And lastly, there's another reason to think about concurrency.
I never thought about this idea until I read the book The Pragmatic Programmer. I thought that concurrency was a last resort, something you do when necessary, but in The Pragmatic Programmer the authors pointed out that the real world is concurrent. Things are happening right now: I could go check the news about something that is happening somewhere in the world simultaneously, and when you try to do a project you have to coordinate with many people, who are all working at the same time. The real world works this way, and we want our computing systems to model the real world as well as possible. So people who live in a single-process, single-threaded world, without asynchrony, are not modeling the real world at all; they are modeling a simplified world. Those were three good reasons for concurrency. Who wants concurrency now? Yeah, okay. So what did Uncle Alex have to teach me about scalability?
Years ago, he started working at a small company called Google that had a few computers and was apparently tasked with serving a lot of users simultaneously, and he came away with some thoughts about scalability. There are three types of problems in the world. There are problems that can be solved with one core. Those problems don't sound interesting, but a core is a thousand times more powerful than in the mid 1980s, so one core can do a ton. This didn't used to be a very large category, but in fact you can run TensorFlow on one machine and train it in a short period of time to read handwriting or do speech recognition and all that, and the TensorFlow demos make it look remarkably easy.
Isn't it amazing what you can do with just a single core? Yes, but the data is getting bigger and we want to serve more customers. There is another category of problem you can solve with 2 to 8 cores. Why 2 to 8? I have multiple cores on my machine, and they are hyper-threaded, so effectively I have 8 threads. Threadripper came out recently and it's pretty fresh, and everyone and his brother is going to run 12 cores, 16 cores, and 32 cores, so there are going to be a lot of cores coming. So Alex thought about this and said: well, you can use threads or you can use processes as long as your problem fits here. So let's say my machine has 8 cores as a limit.
I have a problem that only requires 7 cores of computing power. Can I use this machine? Can I use multiple threads and multiple processes? In fact I can, because I have enough computing power. But then Alex thought: I'm so close to the limits of my machine that if my problem grows by 20%, I'm suddenly out of this range and will need more than 8 cores. So his thinking is: if you have a problem in this range, you just happen to be lucky for the moment; when your luck runs out, you will really wish you had jumped to distributed processing, Google's method, with hundreds of thousands or millions of cores at the same time. By the way, do you have to work at Google to get access to that kind of computing power?
Not anymore. You can access AWS Lambda, which will let you run your program on a hundred thousand cores in parallel, and even Google will sell you access to that kind of computing power. So what Alex had to teach me is that the single-core space is very big; there are so many cool things you can do on one core. And the distributed problems are also a very rich set. His recommendation to me at one point was that you mainly don't want to work in the middle space, because you're only there temporarily: the problem is too hard for one core but can be done in fewer than 8, and as the problem grows a bit, it will completely outgrow the machine and you will need to scale. As time goes on, the middle category becomes less common and less relevant, but the data sets keep growing.
Even when you move to distributed processing, what you would like to do is make the most of the cores you have; you don't actually want to spread across 100,000 machines but only use 1/100th of their power. So this area is actually still interesting even when the problem is bigger than that; in fact, distributed processing basically does a lot of the 2-to-8-core work on a bunch of other machines. Do you all agree that category 2 is interesting, but you don't want to limit yourself to it? Okay. In every story there has to be a villain: Darth Vader, boom boom boom. Okay, who likes the global interpreter lock? Oh, me too. I really like the global interpreter lock, because if you are going to have to have locks, which would be better: one simple lock that covers all cases and clarifies the rest of your code, or thousands of small locks that can each deadlock individually and are expensive to acquire and release individually? So our friend Larry Hastings is working on a project called the Gilectomy to remove the global lock from the Python interpreter.
Do you think it is a difficult project? That is an incorrect hypothesis. It takes about a day of work to remove the GIL. He had the GIL removed on the first day; no problem. The problem then becomes all the locks you have to put in all the other places to get Python to work. It turns out that's not particularly difficult either, and a few days later he had done it, so the GIL was replaced by many other little locks on smaller data structures. Problem solved; any questions? Oh, locks are expensive to acquire and release. In fact, they are. So the good news is the GIL is gone and you have free threading, and it runs a dozen times slower than normal Python. So the GIL actually gives you something in return: you don't pay the full performance cost of all these individual lock acquisitions and releases.
It's actually a really good thing to have around. We don't all get free threading, but we have ways to work around that problem. If a single Python can't fully exploit all the cores with threads, why don't I run eight Pythons in parallel, each with their own threads? Then there is no problem: I'm leveraging all the other cores. Or you can combine threading and multiprocessing; there are several ways to do it. In fact, at some point most people just get over Python having a global interpreter lock, go ahead and saturate all eight cores to one hundred percent, and take maximum advantage of the machine. They just ignore the problem. There are many ways to ignore the global interpreter lock; it's no big deal to anyone except Larry Hastings. Larry hates the GIL; he hates it with a passion. No one else in this room hates it with that passion, but he does, because none of you are like him.
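The "run eight Pythons in parallel" workaround mentioned above can be sketched with the standard library's concurrent.futures. This is a minimal illustration, not Raymond's own demo code; the function names and workloads are invented:

```python
# Each worker in a ProcessPoolExecutor is a separate Python interpreter
# with its own GIL, so CPU-bound work really does spread across cores.
from concurrent.futures import ProcessPoolExecutor

def cpu_bound(n):
    """A CPU-bound task: sum of squares below n."""
    return sum(i * i for i in range(n))

def run_in_parallel(workloads):
    # map() farms each workload out to a separate process.
    with ProcessPoolExecutor() as pool:
        return list(pool.map(cpu_bound, workloads))

if __name__ == "__main__":          # guard needed on spawn-based platforms
    print(run_in_parallel([10, 100, 1000]))
```

The arguments and results cross process boundaries by pickling, which is the communication cost discussed later in the talk.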
Larry likes to game, and Larry likes to hang out with the gaming community, and Python is not popular in the gaming community. The gaming community is about: I have a really powerful computer with a lot of cores, and the more cores you can throw at a problem, the more likely you are to have smooth video and get a clean headshot, or whatever is important to a gamer. Gamers take a system and squeeze out every possible clock cycle, so they love threading, and in Python threading doesn't take advantage of your multiple cores; therefore Python is not popular in the gaming community, and that is a section of the world that is closed to us. His thinking is: if you can remove the GIL and keep the performance, we can get that part of the community back. So it's actually a noble endeavor and probably something worth exploring. In the meantime, the rest of us don't have that problem, and because I don't have that problem, that's my opinion: the GIL is not important to you. Do you agree? How many of you are blocked in your job every day because the GIL is in your way? So, oh my, if they removed the GIL, basically it almost never comes
up. Okay, there we go, another familiar face. So I'm not going to say that the global interpreter lock isn't a problem, but I would just say that it has some advantages and disadvantages, and that removing it currently has quite a few costs for us, so we have to learn to live with it for now. A little about human resources. You are engineers; you don't know anything about human resources. I know everything about human resources: I dated someone who worked in human resources, and I learned HR stuff the same way I learned how to hack a computer and bend it to my will.
There are other conferences where they learn how to hack you, and just as I know how to put a computer into an infinite loop, they know how to put a person into one. In fact, one day I was having a personnel problem and I thought about discussing it with HR, I mean my girlfriend, and I said: Ellen, this is my situation, what should I do? And she pulled out a little HR infinite-loop voodoo and hit me with this: Raymond, don't you know that your weakness is your strength and your strength is your weakness? That is not actionable. What do I do?
What do I do? Do I get stronger? Do I stay weak? What do they learn at those conferences? They are there right now learning more of these; I have a long list of them. Anyway, that has nothing to do with engineering, right? I would like to move on to threads versus processes. What is the strength of threads? The wonderful thing about threads is that they have shared state, and because we have shared state, it is easy for one thread to write to a chunk of memory and another thread to read it back without any communication overhead. Isn't that amazing? What is the weakness of threads? Shared state: because we have shared state, we now have race conditions. In fact, if you have a multithreaded program and you don't have a race condition, you probably didn't need threads to start with; the whole point of having threads is shared state with a cheap communication cost, and that shared state means you have to put locks on it. So in fact Ellen was right: your weakness is your strength and your strength is your weakness. The strength of threads, shared state, makes them run very fast.
The weakness of threads is that shared state makes them very, very difficult to get right. So let's talk about processes. What is the strength of processes? They are completely independent of each other; they don't have shared memory, and that makes it easy to kill a process without killing another process, which is really nice, and you don't have to put locks in, because they never step on each other and there are no race conditions between them. That is a wonderful strength of processes. What is the weakness of processes? Because they are independent, they don't have shared state, and because they don't have shared state, if two processes are going to talk to each other, they have to take the objects, pickle them, move them through a raw socket or some other means of transport, and unpickle them on the other side. So they have a huge communication cost compared to threads.
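The pickling round trip described above can be seen directly with the standard pickle module. This is a sketch of what multiprocessing does for you under the hood; the payload here is invented for illustration:

```python
# Objects crossing a process boundary must be serialized (pickled),
# shipped over a pipe or socket, and deserialized on the other side.
import pickle

payload = {"url": "https://example.com", "sizes": [1, 2, 3]}

wire_bytes = pickle.dumps(payload)   # what actually crosses the socket
restored = pickle.loads(wire_bytes)  # rebuilt on the receiving side

assert restored == payload           # equal in value...
assert restored is not payload       # ...but a distinct copy, not shared state
```

That copy step, on every message, is the communication cost that threads avoid by sharing memory directly.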
Ellen was right when it comes to threads and processes: your weakness is your strength, and your strength is your weakness. Who learned something new? Okay. Async: who's excited about async? No one used to be excited about async. Every once in a while one person read a book about Twisted, and a team used some Twisted to speed up code on their servers. At most Python conferences there was always a Twisted speaker or two and a Twisted book or two, and a team or two using Twisted or Tornado. Then Facebook open-sourced some things and there was a little more interest, and then Guido woke up one night and said: what do I want to do with the rest of my life?
I know, I want to live it asynchronously. And suddenly Guido became deeply interested in asynchronous I/O. Now, what's the logical thing to do if you want to program a computer? Well, you should go out and learn a computer programming language in order to do it. That sounds reasonable, and most reasonable people do that. But what if you are an unreasonable person? An unreasonable person looks around and says: there are hundreds of programming languages and none of them are right; they don't express ideas very clearly; I don't believe in them; nobody else has done it well.
That is a very unreasonable position, and in response to all these fools using other programming languages, you invent your own. Unreasonable people invent their own programming languages, and all progress depends on these unreasonable people. So what's the reasonable thing to do if you're interested in asynchronous programming? Well, the reasonable thing is to download Twisted or Tornado, which are among the many asynchronous packages out there that are well tested, well documented, and debugged. That is the reasonable thing to do. What would an unreasonable person do? Well, what did Guido do? He thought about async I/O from scratch, building it from first principles, and because he got interested in it, it has become a key part of Python. Asynchronous I/O is now part of Python 3.5 and 3.6, and it is starting to permeate the rest of the language, generating new keywords and new thinking about how to support these tools.
And because Guido became interested, suddenly everyone else woke up: oh, this must be great, I should do it too. Do you need async? Are you interested in something new? I also share that interest, and many people are interested in async but are not sure what the difference is from threads. So I'd like to give you a little model so you can talk about async. This will be it: if you know what's in the next few lines, about 200 words, you'll be the cool person at every Silicon Valley party. That's it: yeah, you know about threads versus async, and a lot of people will gather around; here, have a drink, tell me more. So this is what you need to become popular in Silicon
Valley. At the parties you'd say: John, tell me about threads. And John would say: you know, threads are preemptive, which means that the system decides for you when to switch tasks. This is great and very convenient because you don't need to add any explicit switching to your code. The code will just be running and suddenly there will be a task switch; you don't have to do it yourself, and the world is magically concurrent. Did I paint a really good picture? In fact, you basically get free concurrency with threads: just take something you were doing before, run it in a thread, and poof, now it's happening in parallel.
It's actually not much harder than that. Launching threads is almost trivially easy precisely because they're preemptive. Preemptive means you are right in the middle of doing something, and then the thread manager decides to switch to another thread, and later comes back and resumes yours. This is great because the programmer has to do very, very little to activate it. Is there a cost for this convenience? Yes: because you can be interrupted at any time, you have to assume that a task switch can happen at any time. So if you are trying to keep two things consistent with each other, say I will update this variable and that variable and they have to stay equal to each other, your problem is that if you update one and get preempted, the other may not be updated, and that leaves the system in an inconsistent state.
In fact, that is the reason for the global interpreter lock in Python. As you run your Python program, global state is constantly being updated: which task is running, what line number it is on, what opcode was most recently executed. At any time a thread switch could happen right in the middle of an update, and the system would be left in an inconsistent state. So what do we have to do? A stretch of code where this matters is called a critical section, and we have to protect it with locks or queues or some other type of synchronization tool. The idea is that if two things have to happen together, I acquire a lock that says no one else should be running right now, do the critical section, then release the lock and let other people run. So the challenge in multithreaded programs is to identify all the places where something bad can happen, where you can leave the system in an incoherent state, and put locks around them. How many of you have used a lock before and seen examples of it in books? Yes, and you may have seen it in an operating systems class in school. The problem with almost all the published examples I see on how to use locks is that they are too simple: you have a single little resource, you have one lock to acquire and one to release, and because the example is simple, it creates the illusion that locks are easy to use. But when you start putting them in larger systems outside of the OS, you'll find that if you add enough locks, it becomes incredibly difficult to reason about your code, to know whether it will ever deadlock, whether it will starve a process, or what not. The dining philosophers problem is an example of this: the simplest example of a problem that most people find difficult to solve correctly using locks. There are correct solutions; it's just that most people don't arrive at them easily. In other words, we've learned over time that it's tremendously difficult to get large multithreaded programs right, but at least if you do all the work, do the testing, and think carefully, it's possible to do it right.
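A minimal sketch of the critical-section pattern described above: two variables that must stay equal to each other, updated under one lock. The two-balances scenario is my invented example, not code from the talk:

```python
import threading

# Two pieces of state that must always agree with each other.
balance_a = 0
balance_b = 0
lock = threading.Lock()

def transfer(amount):
    global balance_a, balance_b
    with lock:              # acquire: no other thread runs this section now
        balance_a += amount
        balance_b += amount  # both updates finish before anyone can look
    # lock is released automatically when the with-block exits

threads = [threading.Thread(target=transfer, args=(1,)) for _ in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(balance_a, balance_b)  # always equal, thanks to the lock
```

Without the lock, a preemption between the two increments could let another thread observe the system mid-update.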
Don't do it easily, in other words, we've learned over time that it's tremendously difficult to fix large multi-threaded programs, but at least if you do all the work, do the testing, and think carefully, it's possible to do it right. That's good news and once you have it, you can retire properly. I made a big, correct multi-threaded program with lots of locks, someone else will maintain it for me and it will be fine in perpetuity. The problem is that locks don't lock anything. They're called locks, which is a really, really bad name for them, which is a lot, it's essentially a flag, a signal, and if someone else checks that flag or signal and says, oh, it's locked, I won't touch that resource, but it's just if they check, in fact the lock doesn't lock anything at all, if the law configures a lot to access the printer, what all the other threads are supposed to do is that every time they want to print, they have to acquire the lock .
If they forget to acquire the lock, can they still print? In fact, they can. So even if you have a large multithreaded program that is correct, it won't necessarily stay correct over time; the smallest little tweaks to the code can make it incorrect in ways that are difficult to see during code reviews. This technique is quite fragile, and most people with a lot of experience have developed a natural repulsion or aversion toward large multithreaded programs.
It's not that they don't know how to do it; they just know that it's a pretty difficult task, and getting it right once doesn't make it stay right in perpetuity. Now, what is the limit of threads? As always, to begin with you have a certain number of CPU cycles, but you can't use them all, because there is a task-switch cost and a synchronization cost. Every task switch consumes some CPU cycles, and every lock acquire and release costs CPU cycles too. What Larry discovered is that if you remove one big lock and put in a lot of small locks, the total cost goes up quite a bit, making Python much less performant. Who learned something new? So multithreading will never give you more hardware computing power than you started with.
You will always be somewhat worse off with threads: switching always consumes some of the power, so it never adds power to the system. The question is how much it consumes, and we can rank the weight of process switches versus threads versus lightweight threads and greenlets and all that. The main reason for the existence of tools like greenlets is that full task switches are quite expensive, so they essentially try to avoid paying the cost of the heavier task switching. It's fair to say that with threads you're trying to maximize the total CPU power, and you're going to waste some of that power
when you start using multithreading. So should you use threading? The answer is: yes, if you don't need 100% of the CPU power, threading is actually a pretty reasonable way to go. If, on the other hand, the cost of your threads eats up CPU power that you need to get back, there must be a better way, and that is async. The difference between async and threading is that async tasks switch cooperatively, meaning they don't interrupt you at an arbitrary time. What you do is continue with your work, and then when you come to a good stopping point, you go back to the async manager, the event loop, and say: you know what, you can let someone else execute now; I've tidied everything up, my entire state is consistent, and someone else can start working. To switch cooperatively you actually have to alter your code; unlike with threads, you have to add an explicit yield or await, or somehow cause a task switch. So what's the benefit? Because you control when you switch, you practically no longer need locks or other synchronization primitives. By the way, whenever I make a broad generalization, those things are not always true; in fact, sometimes in the asynchronous world we have the equivalent of locks and synchronization primitives, but in general a big advantage of async is many fewer locks. And another advantage of async: the cost of a task switch is incredibly low.
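A minimal asyncio sketch of cooperative switching: each task runs until it reaches an explicit await, where its state is consistent, and only then lets the event loop run someone else. The task names and step counts are invented:

```python
import asyncio

log = []

async def worker(name, steps):
    for i in range(steps):
        log.append(f"{name}:{i}")   # our state is consistent here...
        await asyncio.sleep(0)      # ...so we voluntarily yield to the loop

async def main():
    # Run two workers concurrently on one thread.
    await asyncio.gather(worker("a", 2), worker("b", 2))

asyncio.run(main())
print(log)   # the tasks interleave, but only at the explicit await points
```

Between awaits, a task can never be interrupted, which is why so little locking is needed.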
Who has ever written a Python function before? Who has called one? The process of calling that function is more expensive than a task switch in async. Who finds that remarkable? Great. That means asynchronous switches are cheap, cheap, cheap, because they essentially use generators. Under the hood, generators store all their state, and to turn a generator back on we just have to call into that generator and say go ahead. It takes less time to do that than to call a function, because a function has to build its state, building a new stack frame on each call, whereas a generator already has a stack frame and picks up where it left off. Is it fair to say that of all the task-switching techniques, this is the cheapest? Not just the cheapest: it is the cheapest by far. So if you need some concurrency and are choosing between threads and async, say you start with 100% of your CPU and a workload that needs 25%, leaving 75% of your CPU power free. Add threads, and the switching overhead costs you another 25% of that power, leaving you with 50%. But async consumes only 1% of your CPU power, leaving you with 74%. So do you have more cycles left if you use async?
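The cheapness comes from frame reuse, which you can observe directly: resuming a generator re-enters the same stack frame instead of building a new one each time. A tiny sketch, not from the talk:

```python
def counter():
    # A generator: its local state (n) lives in one stack frame that is
    # created once, then suspended and resumed on each next() call.
    n = 0
    while True:
        n += 1
        yield n

gen = counter()
frame = gen.gi_frame          # the frame is built once, up front
assert next(gen) == 1         # resuming just re-enters the existing frame
assert next(gen) == 2
print(gen.gi_frame is frame)  # same frame object across resumes
```

A plain function call, by contrast, must construct a fresh stack frame on every invocation, which is exactly the cost a task switch avoids here.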
So in terms of speed, asynchronous servers tend to blow threaded servers out of the water, and the comparison is: you can run hundreds of threads, but thousands or tens of thousands of asynchronous tasks per second, which is amazing. The asynchronous approach is very, very cheap, so one of the reasons it is popular is that it has such low overhead. Does everyone see why people are excited about async? It doesn't have locks, that's great, and since you don't have locks, it's a lot easier to get your code correct: you only switch when all your state is consistent, and you don't have to worry about arbitrary interrupts, so coding is much easier and the speed is faster.
And all that. So how many of you love async? It is easier to get right than threads, it is much faster and lighter, and it can handle huge volumes. Are there any disadvantages? Well, there is a small one: you have to say yield or await from time to time; you have to do the cooperation part, so you have to add a little bit to your code, but that's not very difficult. Any other drawback? Yes: everything you do must be non-blocking. You can't just read from a file; you need to start a task to read the file, submit that task, let it start reading, and when the data is available, come back and pick it up. So you can't even use f.read.
You have to use an asynchronous version of that read, and the same goes for pretty much everything that blocks, including sleep; you can't use the normal version. In fact you need a giant ecosystem of supporting tools, and this dramatically increases the learning curve. To start a thread, you say thread.start() and you're done. With async, you need to load an event loop of some kind, from curio or asyncio or Twisted or the like, you will need to change all your calls to non-blocking calls, and then put the async and await keywords in the right places. In other words, the learning curve here is enormous.
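To see why blocking calls must be replaced, compare time.sleep with its non-blocking counterpart asyncio.sleep: two tasks that each "wait" a tenth of a second finish together in about a tenth of a second, because each await hands control back to the event loop. A sketch; the delays are arbitrary:

```python
import asyncio
import time

async def fetch(delay):
    # time.sleep(delay) here would freeze the entire event loop;
    # asyncio.sleep is the non-blocking version that lets other tasks run.
    await asyncio.sleep(delay)
    return delay

async def main():
    start = time.perf_counter()
    results = await asyncio.gather(fetch(0.1), fetch(0.1))
    elapsed = time.perf_counter() - start
    return results, elapsed

results, elapsed = asyncio.run(main())
print(results, round(elapsed, 2))   # the two waits overlap instead of adding up
```

Swap in time.sleep and the same code would take twice as long, with one task stalling the other.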
I can teach people threading techniques, how to thread reliably, in just a few hours. I can teach people in a few hours to use multiprocessing correctly and get all the benefits. But I think it takes days to teach a person to use async correctly. That doesn't mean you can't cut and paste an asynchronous example and make it work, but if you're going to debug it, and certainly if you're going to get deep into it, you have to know a lot about the event loop itself. If you don't appreciate that, look at the documentation for asyncio in 3.6 and then look at concurrent.futures, and when you see the entire ecosystem you will realize: I thought I knew Python, but in fact there is twice as much Python as I knew; the other half is async, and it continues to grow. It is spreading its tentacles throughout the whole language, and anything in the language that doesn't fit well with async is about to get a parallel version that is async-aware. There are context managers in contextlib that don't play well with async, so we will get a whole new set of async-aware context managers. For many tools in Python you will get a non-asynchronous version and an asynchronous version, and in the end it could double the size of the language. So that is the small disadvantage, but there is also a good payoff. What is my opinion?
What will win in the future? I think it's very difficult to do threading well, and it's very expensive. And I think if the async ecosystem gets better and better, it will become easier to use; the best practices will become known, and once you learn those patterns you can get going pretty quickly. If you were at the talk last night, Łukasz probably told you that they had a lot of success with it at Facebook. There's some ramp-up time, but once people have crossed that on-ramp, they can do quite a bit. So here is the comparison.
Async maximizes your CPU utilization. Why is it better than threading? Lower overhead costs. Threading has the advantage that it works with existing code and tools. So if you have a lot of libraries and a lot of existing code and suddenly you want to be concurrent, which will you choose between threading and async? It's not meant to be a difficult question, but it is a test, and I'm the one being tested: if you don't get the right answer, it means I couldn't communicate a very important point. You have a lot of existing code that you wrote and existing libraries that you want to continue using, and you want to be concurrent. What do you use? Threading, that's it. Okay, because with async you would have to almost completely reorganize: every thing that blocks needs a non-blocking version. Some tools are being written to wrap blocking calls, run them elsewhere, and give them a sort of asynchronous feel, but those wrappers are pretty expensive and reintroduce all the disadvantages of threading, so the problem doesn't magically go away. In general, for a complex system, async is much easier to get right than threading; and yet threading requires very little restructuring, just put in some locks and signals and you're done, while async requires a huge ecosystem of event loops, futures, and non-blocking versions of everything. Did you learn something new? If you can explain this to someone else, and by the way I'm giving you these notes so you can go read this
Again, if you can tell this to someone else, you are now qualified to make decisions on your team: should we use async or threads? They're giving me more time and they haven't kicked me off the stage, so we're going to look at some code. What could go wrong? Nothing could go wrong, because I have the code on the slides and I can always fall back on a static demo. By the way, I haven't even tested the internet connection here, because I had problems with the Wi-Fi this morning; I'm connected now, so maybe that part of the demo will run. So I have two simple examples for you. One example: I have a global variable, a counter, and I print 'starting', loop ten times, increment the counter, print the count, print a little bar, and after printing that ten times I print 'ending'. That example looks correct, and there is the basic result.
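The little counter program he is describing might look like this (a sketch; the slide code isn't reproduced in the transcript, so the exact names and messages are my guesses):

```python
counter = 0

def main():
    global counter
    print('Starting up')
    for _ in range(10):
        counter += 1                      # increment the global count
        print(f'The count is {counter}')
        print('---------------')
    print('Finishing up')

main()
```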
The basic result is there, not very interesting. This is an easy program that high-school students should be able to write after their first few hours of Python training. It takes very little Python skill to write code like this: print 'starting', loop. This is a beginner problem. Easy: make a global variable, print 'starting', increment the count, print out the count, print 'ending'. How many of you consider this to be easy beginner code? Now, a little more advanced, there is another piece of code that says: take a list of websites, loop through those sites, open the website, read it, and once you have read the web page, get the length of the page in bytes, then print the URL and the size of the web page. What you get is the home-page sizes of all these websites, and you will learn that the Yahoo page is huge and the PyPI page is very, very small.
Well, I consider this to be beginner code too. Oh, you also have to teach the person about the packages: you know, in Python 3 we have to put a .request after urllib, so the import is a little more annoying in Python 3, although it is somewhat convenient that you can use the with statement to automatically close the URL and release the underlying socket. I think these are the skills needed to write this, and all of them can be taught to a person who doesn't know Python in less than an hour. How many of you agree with that?
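That sequential fetch loop might look like the sketch below (the site list is illustrative; the live loop is left commented out so you can run the function against any URL you like, including file:// URLs):

```python
import urllib.request

# A couple of example sites; swap in whatever list you like.
sites = [
    'https://www.yahoo.com/',
    'https://www.python.org/',
]

def site_size(url):
    """Return the number of bytes in the page at the given URL."""
    with urllib.request.urlopen(url) as page:   # `with` releases the socket
        return len(page.read())

# Uncomment to fetch the live sites sequentially:
# for url in sites:
#     print(url, site_size(url))
```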
This is some beginner-friendly Python code and it's easy to get right. Now, can they pay you a lot of money to write code that easy? No, obviously nobody can pay you a lot of money for this. My daughter is only in fourth grade and she can pound out that code after just an hour of Python training. You call yourself an engineer? My daughter can do it, and she's still coloring and watching cartoons. You can't get paid for this kind of stuff. I know what you're thinking: there must be a harder way. How can we make this hard so we get paid for it?
Well, there are three ways to make it hard: you can multithread it; in a worse way, you can thread it and lock it at the same time; or you can make it asynchronous. So let's do threading first. The script style is this, and here is the obvious result. The function style is to say: I am an advanced programmer, I'm going to take this part and factor it into a function, and it would look like this. Your worker has the single job of incrementing the counter and printing the count. All I did was take those three lines of code and put them into a function.
Very professional. The kids weren't doing that after one hour of training: I write code with functions that are reusable. And now I'm going multithreaded. Is multithreading easy or hard? It's easy to add to existing code, because you don't need to restructure it. All you have to do is change one little piece: instead of saying worker open-paren close-paren, all you need to do is point at the worker and start a new thread, and voilà, concurrency by changing just one line of code. Impressively easy. In fact, and by the way, I'm a professional, unlike these kids, before I ship the code I will test it thoroughly. So I will run it and test it to prove that the code is correct, and in fact it gets the answer I want.
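The threaded version he describes is roughly the sketch below. Note that I have added joins at the end so the program waits for its workers; per the transcript, the version on his slide omits them, which is itself one of the races he goes on to point out. The unsynchronized `counter += 1` is the other latent race:

```python
import threading

counter = 0

def worker():
    "My job is to increment the counter and print the current count."
    global counter
    counter += 1          # unsynchronized read-modify-write: a latent race
    print(f'The count is {counter}')
    print('---------------')

print('Starting up')
threads = [threading.Thread(target=worker) for _ in range(10)]
for t in threads:
    t.start()             # the one-line change: start a thread per worker
for t in threads:
    t.join()              # wait, so 'Finishing up' prints last
print('Finishing up')
```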
You're all thinking I'm cheating because it's on a slide, but no, I have an actual computer here, and we're going to run the plain function version and the threaded function version. So this is the code that just ran, which is a bit small on the screen; I can probably make this part bigger, here we go, okay. And the next version is the threaded one we just had on the slide; the part that's different is this part here, and I'll run it: the print starts, it counts up to ten, and it ends. So what I've shown you is that multithreading is easy and you can test to make sure your code is correct and ready to ship.
I'm hearing sound effects, like some of you don't believe me. Can you detect the race conditions? What is the race condition in this one? That's exactly it: almost everyone I've taught can instantly spot this one. Between the read of the count and the write of the count, another thread could run and update the count. You can have two reads, each updating the same count and writing it back, and the consequence is that we won't get to ten. Which begs the question: why didn't we see that effect when we tested it? The answer is that this happens so fast that a task switch is unlikely to occur between the read and the write, so it could run successfully a billion times before it fails. But there is indeed a bug there, and it will be quite hard to observe. The printing itself is also a race condition, because the main thread can print 'ending' and finish before the others have printed.
However, I would only see that in Python 2.7; I don't see it in Python 3.6. Does that mean the bug is gone? The race condition is still there. The task-switching logic changed in Python 3, so the switches land somewhere else whenever you print, thus taking a bug that used to be visible and making it invisible during testing. Great improvement: I don't like to see bugs, so problems that can't be solved, we just won't show them to you. Okay, actually, it's a problem that we tested this and it seemed to work correctly. Tests can't prove that code is correct, which is probably one of the most important lessons of all of multithreading. You need a technique that undoes some of the effects that made the bug invisible, and the technique is called fuzzing. The idea is that every call to fuzz puts in a random amount of sleep. You put a fuzz in pretty much all the places you would put a yield or an await in asynchronous code, but you have to put it everywhere, because you can't control when those task switches happen; the threads can switch at any time. So I put a fuzz between each step: fetching the previous counter, doing the increment, doing the print, doing the other print. I put a fuzz between each one, and a fuzz between the launch of the threads and the end. Fuzzing is a technique to amplify the problems so that they become visible, so if you're going to test threaded code, that's a perfectly reasonable way to do it. So this is the multithreaded version with the fuzz, and with the fuzz it should run a little slower. You'll see the start, and the count is one.
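A fuzzed version of the counter might look like this sketch (the split of `counter += 1` into an explicit read and write mirrors what actually happens at the bytecode level; the sleep amounts are my choice to keep the demo quick):

```python
import random
import threading
import time

FUZZ = True

def fuzz():
    "Sleep a random amount to amplify latent race conditions."
    if FUZZ:
        time.sleep(random.random() / 10)   # small sleeps keep the demo quick

counter = 0

def worker():
    global counter
    fuzz()
    oldcnt = counter          # read...
    fuzz()
    counter = oldcnt + 1      # ...write: another thread may have run in between
    fuzz()
    print(f'The count is {counter}')
    fuzz()

threads = [threading.Thread(target=worker) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(f'Final count: {counter} (should be 5, but often is not)')
```

Run it a few times: lost updates show up almost immediately, even though the logic is identical to the unfuzzed version.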
Oh, there are bugs everywhere. The count is three, and it came up twice. Keep in mind the code itself hasn't changed; all I've done is put in arbitrary time delays. These errors were always there; this result was always a possibility. So if someone tells you they tested their multithreaded code, does that make you feel safe? If they tell you 'I tested the multithreaded code,' would you let them fly your plane? So, everyone is excited about the Internet of Things except me. I am not excited about the Internet of Things, and here is the reason: I see errors in code all the time, and when there is an error in the code on my machine or on some website, the consequence for me is that the website looks weird or my shopping cart empties and there's something I need to redo. The consequence is basically nothing. When there's an error in the logic of my self-driving car, it is going to be bad for the person standing in front of that car.
So it's fair enough that, as the Internet of Things gets closer to us, I would like us to develop a growing aversion to multithreaded code, because it's so hard to get right. If someone is going to send me a self-driving car, I would prefer the car had been programmed with asynchronous code. It's profoundly better in terms of your ability to look at the code and see whether it's correct. Who learned something new? Well, I was talking to a couple of Mozilla people yesterday. Where are my Mozilla people? Over there. Well, tell me if you know this person. This is a pretty famous photo floating around the internet: a pretty tall person at a standing desk, and right above his head there's a little sign that says, roughly, you must be this tall to write multithreaded code. Every office needs that sign posted at about that height, because a lot of people think they are qualified to write multithreaded code, and they are not.
Are there ways to solve the problem? Can you write it correctly? Yes, you can be more careful and use atomic message queues. We have one built into Python, the queue module, but there are many atomic message queues; effectively, your email is an atomic message queue. So we can fix the code, and fix it with queues. I've put some tips here for you, Raymond's rules of coding. I won't spend much time on them because you can read them later, and I have an example. I will say that one category of problems is: step A and step B must happen sequentially. What is the solution? Put them both in the same thread, and then they will be sequential again. Something easy. Another is the concept of a barrier. A barrier is: you have multiple parallel threads launched, and you want to make sure they are all complete before continuing. There is a simple way to do it: you join all the threads. Keep in mind that every multithreading problem has a corollary in the real world. You have five programmers, each working on a different part of your website, and you can't publish it until everything is ready. So I launch five threads, programmer one, two, three, four, five, and they all start working, and then I join: tell me when you're done. Now wait, now wait; it turns out join won't return until the answer is yes. Join, join, join, and after five joins I know the website is ready and I can publish it. So publishing follows five joins. Something simple, and this is called a barrier. What about daemon threads? What does that mean?
A daemon thread is a thread that is never supposed to end. It is a service worker: every time you ask it to do a task, it goes and does something for you, and it never finishes; it waits. Your office printer typically runs a daemon: the printer never turns off, it just waits for someone to send it a print job, and when the print job finishes it doesn't shut down, it never exits. So for daemon threads, you can't join them, because they never end; the printer never turns off. So what do you do?
You join the message queue instead of the thread itself. Who designed that API? The person standing in front of you on stage. That one was me. Before that, the only way to know whether all the tasks were done was to use a poison pill: a message that says, I asked you to do ten things, and the eleventh thing is that I want you to send me a notification that you've done everything else. So you had to have two ways to send messages, an in queue and an out queue, and you'd send a poison pill in and wait for it to come back out. Now you can use join on the queue, which says: wait for all the tasks to be done. Okay, Raymond's rule number four: sometimes you need global variables to communicate between functions. You may think that global variables are terrible, but in fact one of the reasons for using threads is that you can use global, shared state. The catch is that something that works in a single-threaded program can be a disaster in multithreaded code. You can wrap locks around it, but there is a better way in the threading module: you can mark it as thread-local, and that means each thread has its own copy of that global variable. The decimal module does this, so when you set up a different context, it is only for the current thread and will not change the context of any other thread.
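That thread-local trick can be sketched in a few lines (the `observed` dict and the `prec` attribute are illustrative, echoing what decimal does with its per-thread context):

```python
import threading

context = threading.local()           # each thread sees its own attributes
observed = {}

def worker(name, precision):
    context.prec = precision          # set *this thread's* copy
    observed[name] = context.prec     # no other thread can have clobbered it

threads = [threading.Thread(target=worker, args=('a', 10)),
           threading.Thread(target=worker, args=('b', 50))]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(observed)
```

Even though both workers assign to the same global `context`, each sees only its own `prec`.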
Almost anything that can potentially be parallelized and has global state should be wrapped in a thread-local, and this is an important thing, because people don't seem to understand it. It comes up all the time, and you'll see it's a highly upvoted question on Stack Overflow. Now, there are some mass murderers in this room. There are some of you who hate threads and want to kill them. People always ask me: how do I kill a thread? You call me in for a consult and say, I want to kill a thread, and I say there is no public API to kill a thread, because it's a bad thing to do. Now, if you really, really want to kill a thread, and I tell you I'm not going to tell you how to do it, what do you do? You look it up on Stack Overflow, and they'll show you how: you can use the ctypes module to reach into the internals of the Python C API and make a call that kills the thread instantly. By the way, if we wanted you to do this, we would have put in a function to do it. In fact, that capability was offered and then taken away. You know why? Because people used it. They went around mass-murdering threads. So why shouldn't you?
If you ever want to kill a thread, remember that the reason for using threads is that you have shared state, and if you have shared state, you have race conditions, and you manage those race conditions with a lock. Whenever you want to modify the state, you acquire the lock, you modify it, and you release. What happens if the thread gets killed between the acquire and the release? When you kill a thread, you have no idea whether it holds a lock or not. If it acquired a lock and you kill it, every other thread that ever waits on that lock is instantly deadlocked. It's a mess. So I keep a log of all the consulting calls I get, and there is a pattern.
On January 15th I get a call from a friend at Cisco: hey Raymond, how can I kill a thread? Don't go killing threads; there is no API for that, don't do it. On March 15th I get another call: we have a problem where we're getting deadlocks. I look back at my log and I know the cause. So should you ever be in the thread-killing business? No. If you want threads to die, you have to plan it in advance: you have the thread periodically check a message queue or a global variable that says, I don't want you to do your stuff anymore, and then the thread itself can release its own locks and exit gracefully. It takes some extra planning to do this. That said, in the context of that particular call, they didn't have that option, because it was a large system where the threads were written by other, less experienced programmers, and when one of those threads had a bug, they wanted to kill it. The problem is that doing so could deadlock the entire system.
What is the solution? Don't use threads for this kind of thing. Use processes. We like processes because you can kill them. Fair enough. Who learned something new? All right, applying the five rules here, I implemented an atomic-message-queue version, leaving the fuzz in, and you will see that the fuzz is still there, there are still time delays, but it gets the correct answer every time. That is just the application of the five rules to this code. I'm giving you the code, so there's no real reason to study it now; I will show you the clean version of the code.
Here's the clean version of the code: I take out the fuzzing, and you'll see it's actually not that complicated. What I did was take the counter and isolate it in its own daemon thread. Now there is a counter manager; the other threads never update the counter themselves, they just send a message to the counter manager, hey, I want you to update, and the atomic message queue handles them one at a time. We're isolating the resource, which was Raymond's rule number two. Rule number one was: if you want things to happen consecutively, put them in the same thread. So right after the increment we send a request to print, which ensures that those two things happen sequentially. The printer is in its own daemon thread, and we communicate with it via a message queue: it gets one print job at a time and prints it. Now, whenever anything wants to print, it doesn't print directly; it sends a message to the printer queue that says 'print this', just like you do with the real printers in your office. You never access the printer directly, you always send a print job, and it does the jobs atomically, one at a time. Once we launch the workers, we wait for the workers to finish with joins, which was rule three. After all the workers say we're finished, we say we're finishing up. And by the way, we have to do one more thing: yes, because the workers have launched other tasks, we have to wait for those to complete as well, which you also do with a join on the queue. If you do anything less than this, your code is wrong.
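Pieced together from that description, the clean queued version looks roughly like the sketch below. The variable names are my guesses; the structure, two daemon manager threads plus joins on both the workers and the queues, is what the talk describes:

```python
import queue
import threading

counter = 0
counter_queue = queue.Queue()
print_queue = queue.Queue()

def counter_manager():
    "I am the only thread allowed to touch the counter (isolate the resource)."
    global counter
    while True:
        increment = counter_queue.get()
        counter += increment
        # Sequential steps belong in the same thread: increment, then request a print.
        print_queue.put(f'The count is {counter}')
        counter_queue.task_done()

def print_manager():
    "I am the only thread allowed to call print(); one job at a time."
    while True:
        print(print_queue.get())
        print_queue.task_done()

threading.Thread(target=counter_manager, daemon=True).start()
threading.Thread(target=print_manager, daemon=True).start()

def worker():
    "My job is just to request an increment."
    counter_queue.put(1)

print_queue.put('Starting up')
workers = [threading.Thread(target=worker) for _ in range(10)]
for w in workers:
    w.start()
for w in workers:
    w.join()              # barrier: wait for the workers themselves...
counter_queue.join()      # ...and for the work they launched
print_queue.put('Finishing up')
print_queue.join()        # let the printer drain before we exit
```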
So did we just take an easy problem that had a solution in about six lines of code and make it genuinely hard? In fact, we did. The original code looks very, very small, but the new, correct code, even cleaned up, is much, much, much longer. Careful threading, even without the fuzzing, is bigger and requires more skill to do. That said, is it correct? It's beautiful, and it's much simpler with queues than the alternative. I know what you're thinking: there must be a worse way. There is another way to solve race conditions, and that is with locks, which were invented to solve race conditions, and people tend to reach for locks first, because when you read about race conditions the first thing someone shows you is a lock. You should never show anyone a lock unless they are writing operating systems, because locks are hard to use in larger systems. So let me show you the same thing using locks and some threads. I'll bring up the version with locks, the clean version without fuzzing, and you'll see it's still very short. There are still a couple of joins in there, but instead of message queues we're using locks: there is a print lock, and every print acquires the lock, prints atomically, and then releases the lock.
That is all done in the context of holding the counter lock. Most people who learn about locks and try this problem don't nest the one lock under the other, and they leave a race condition in their code. And because most of the engineers I teach are already practicing engineers, real coders solving real problems, and almost all of them make that mistake, that tells me people don't reason correctly about locks, even for simple problems. But it does solve the problem, and I can demonstrate it here by running the clean lock version of the threading code, and you get the correct answer every time.
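That lock-based version, roughly (again a sketch, not the slide's exact code; the key detail from the talk is that the print lock is nested inside the counter lock so the increment and the print of that value stay consistent):

```python
import threading

counter = 0
counter_lock = threading.Lock()
print_lock = threading.Lock()

def worker():
    global counter
    with counter_lock:                     # guard the read-modify-write
        counter += 1
        with print_lock:                   # nested: print *this* value atomically
            print(f'The count is {counter}')
            print('---------------')

with print_lock:
    print('Starting up')
workers = [threading.Thread(target=worker) for _ in range(10)]
for w in workers:
    w.start()
for w in workers:
    w.join()
with print_lock:
    print('Finishing up')
```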
So is it possible to write correct, clean code with locks? In fact it is; this is possible with a little training. How many of you like the locking approach? Well, even if you think it would be reasonable if you came across code like this, there's still something not to like. It's beautiful, the with statement makes it beautiful, and it's not that difficult to reason about once you understand it correctly, but there is something else wrong with it. What was our reason for starting threads to begin with? What is the topic of this talk? Concurrency. Here is the last of my rules: if you put enough locks in a program, you eventually throw away the concurrency, and it actually runs sequentially.
This program is now completely deterministic and runs the same way every time. It is actually isomorphic to the original program, so it is slower and more complex than the original, but with none of the advantages. People don't realize this at the outset. They think the problem will be simple, they thread it, they get a race condition, they start placing locks, they still have errors, they place more locks, and by the time they've got all the locks in, they realize: hey, we're actually slower than where we started. Am I a big fan of locks?
I am NOT. Note that some locks don't actually lock anything; they are just flags and can be ignored. Even though there is a print lock, anything can still call print anywhere in the program. Locks are low-level primitives and are hard to reason about; message queues are much easier to reason about. And the more locks you acquire, the more you lose the advantages of concurrency, the more sequential your program becomes. That is the tradeoff with locks. I know what you're thinking: there must be a better way. The better way is multiprocessing. So this is the script we showed before that loops through all the websites.
In fact, I'm curious whether it runs at the moment; whether it runs depends mainly on whether the internet connection here is working. So let me try it. No; even though it says it's connected, there's no internet, so I won't demonstrate the program live. If I had run it, it would take about 25 seconds to go through all these sites, get all their sizes, and print them. It takes as long as it takes; the code is correct and not broken. That said, you will get a complaint from a user that makes it sound like it is broken, and the words they will use are: it hangs. What does 'hangs' mean? Hang doesn't mean broken; hang means it takes a long time to do something when you want faster results. I have this bug in my house all the time.
I'm trying to get my son ready for school, and it hangs. Go put your shoes on: it hangs. Grab your backpack: it hangs. 'It hangs' means it takes a lot longer to respond than you expected. So I improve my code by moving the body into a function that gets one site's size. The good news is that I've used a professional programming technique and it's reusable. The bad news: it still hangs. So we ask whether any of this is parallelizable. Oh, and you only get the benefits of concurrency on the parts that are parallelizable, and the important point is that not everything is parallelizable.
Some things are inherently sequential. The classic example is making a baby. It takes nine months to make a baby. Put five workers on the task; it doesn't take one-fifth the time. You can't put nine workers on the task and have a baby in a month. So you hit a point of diminishing returns very quickly for additional workers. But then there are things that are parallelizable, like mowing the lawn: if two people mow the lawn, it takes about half the time. Not exactly half, because there is some overhead and coordination between the two, but real improvement is possible. Most problems are on a sliding scale between making babies and mowing the lawn, and this is quantified in something called Amdahl's law.
If you want to pass your coding interviews, be sure to mention Amdahl's law every time someone mentions concurrency. I've stated it here, but basically it says that some part of a task benefits from running in parallel and some part is inherently sequential, and Amdahl's law gives you a formula that accounts for both and tells you the maximum potential benefit. So in this case I look at what is happening here and ask: can I parallelize it? Well, what is it doing internally? It makes a DNS request for the URL, then it has to get the response; then it acquires a socket, then it makes a TCP connection, then it sends an HTTP request, then it gets the response and all the packets, and then it counts all the characters on the web page. Is that sequential or parallelizable?
The notes say it's mostly not parallelizable, because you can't establish a TCP connection until you know the IP address, you can't know the IP address until you've sent a DNS request, you can't get the results until you've sent the request, and you can't count the characters on the page until you have them. That said, you can count the characters in parallel: as each packet arrives, count the characters inside it. So there's a little bit of parallelization available here, but this task itself is, I would say, 95% baby-making and only 5% parallelizable, so it's basically not worth all the work it would take to parallelize it.
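Amdahl's law itself is one line: with a fraction p of the work parallelizable across n workers, the maximum speedup is 1 / ((1 - p) + p / n). A quick sketch makes the 95%-sequential point concrete:

```python
def amdahl_speedup(p, n):
    """Max speedup when a fraction p of the work parallelizes across n workers."""
    return 1 / ((1 - p) + p / n)

# A job that is only 5% parallelizable barely benefits, even with 100 workers:
print(round(amdahl_speedup(0.05, 100), 2))   # 1.05
# A perfectly parallel job scales linearly:
print(amdahl_speedup(1.0, 10))               # 10.0
```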
That said, what if you want a hundred babies? Well, you can't do it in less than nine months, but with 50 workers you can do it in 18 months, and with a hundred workers you can do it in nine months. In fact, that's what we're going to do here. We actually want a dozen different babies, so we use multiprocessing or a thread pool, and this reduces the total time from about 25 seconds to about two seconds. The speed at which it runs now is the creation speed of the slowest baby. It turns out the slowest website here, Ars Technica, is the one that takes the longest to load, and it determines the total execution time; your program can't finish until it gets that response. So our speed is now capped by something external: the rate at which the data can be sent to us. This code is still quite simple and beautiful, and it's also very easy to get right. Do you like multiprocessing?
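The pool pattern he's describing can be sketched without the network at all. Here `make_baby` is a stand-in I've invented for one slow, I/O-bound job such as fetching one site; with ten workers, the total time collapses to roughly the duration of the slowest single job:

```python
from multiprocessing.pool import ThreadPool
import time

def make_baby(n):
    "Stand-in for one slow, I/O-bound job, like fetching one site."
    time.sleep(0.2)                       # pretend this is network latency
    return f'baby {n}'

start = time.monotonic()
with ThreadPool(10) as pool:              # ten workers
    babies = pool.map(make_baby, range(10))
elapsed = time.monotonic() - start
print(f'{len(babies)} babies in about {elapsed:.1f}s')
```

Sequentially this would take about two seconds; pooled, it takes roughly the 0.2 seconds of the slowest job plus overhead.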
And the one I saved for last. I'm almost out of time. Oh, I have two things. One is combining threads and forks. Let me cover this very quickly: we get bug reports about this all the time. I've linked to one of them if you'd like to take a look, and it's basically code that works like this; this was a summary of the bug report once condensed. Someone said: I ran this and it hangs. They have a ThreadPoolExecutor and they are using multiprocessing, so they are using the two techniques together, and it hangs, it never finishes: Python is broken.
I'll quote Uncle Timmy, Tim Peters, and I should put it in quotes: remember, those of you who mix threads and forks are living in a state of sin and deserve whatever happens to you. I'm a little softer about it. I would say that if you are going to mix threading and forking, there is a general rule: fork before you thread, not after. The problem is that if you thread first, then when you fork, the child process inherits copies of the parent's locks, including locks held by threads that don't even exist in the child, and you can deadlock. Yeah, ah, okay, yeah, I said that one backwards out loud; no forking after you thread.
I read it; the words are good on the page. I said it wrong out loud, but I don't need to fix the slides, the slides are correct. Thanks, five dollars. And hey, someone was in the workshop yesterday; how was it? How did people even know? I didn't tell people I was going to be at the workshop, just like I didn't tell them what was in the keynote, and people came anyway. Was it worth showing up? Well, ten dollars now. By the way, on the main page is my contact information if you want me to do training for you, and I have free training videos for those of you who have Safari online. So, they're about to kick me
offstage, but I'd like to briefly talk about this mountain of code I created for you: an async server. I don't think an example like this exists anywhere on the web. There are very, very complex event loops, but there is nothing that shows, from first principles, how to use async. The part at the top is a server I wrote from scratch; I made my own miniature version of Twisted. Let's focus not on that but on the user's business logic at the bottom. The users have an announcement, a function that prints, and they want it to run every 15 seconds.
They would also like to run a server that accepts connections from multiple clients, where a person can switch into an uppercase or title-case mode, and every line they send comes back uppercased or title-cased, and we want to handle multiple users. How do we make it asynchronous? Well, we say async here, and everywhere we would have blocked we use a non-blocking version of readline, we put in an await, and that's it. Otherwise this code looks almost exactly like the single-threaded, single-process version. It's pretty easy to write and pretty easy to get right, so this is working code for those who want to experiment from first principles with writing their own async and await. Here we go: on the top left is the code we just saw. Notice in the server that I set the socket to non-blocking mode and I'm using select between sessions. At the bottom left I'll start the server; okay, now it's waiting on localhost, and at the top right I'll go talk to it.
I'll telnet in on localhost 9600. Okay, on the bottom left it says it received a connection, and here is the message: we are starting in uppercase mode. I type 'i love python' and it answers in all caps. Meanwhile, below, someone else telnets into localhost; I'm simulating several clients here, we won't even pretend it's actually multiple machines. That one is in uppercase mode too, and it says 'well, ruby is good too' and gets its own response. Each one got its own response. But at the top I can send 'title' and it will switch to a title-case mode, and I give it a big Texas howdy; note the capital H with the other letters lowercase, while a howdy at the bottom still comes back all uppercase. In other words, each user has their own private state, and this was easy to write in a single-threaded style: all I had to do was switch to non-blocking, put async everywhere, and put an await in instead of blocking, but I can't use the normal readline, I have to use the non-blocking one provided. Above, the reactor, the event loop, is essentially an infinite loop that says: if something comes in on any socket, fire a callback; and also check whether there is an event scheduled in the heap of timed events, and if so, run that task.
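His handwritten server isn't reproduced in this transcript, but the same upper-casing behavior can be sketched with the standard asyncio library instead of a handwritten select loop. Each connected client gets its own coroutine, so per-client state falls out naturally (the welcome banner text and port choice are my own):

```python
import asyncio

async def handle(reader, writer):
    "One coroutine per client; each client's state lives right here."
    writer.write(b'<welcome: starting in upper case mode>\n')
    await writer.drain()
    while (line := await reader.readline()):
        writer.write(line.upper())        # echo each line back, uppercased
        await writer.drain()
    writer.close()

async def demo():
    # Port 0 lets the OS pick a free port, so the demo never collides.
    server = await asyncio.start_server(handle, '127.0.0.1', 0)
    port = server.sockets[0].getsockname()[1]
    reader, writer = await asyncio.open_connection('127.0.0.1', port)
    await reader.readline()               # consume the welcome banner
    writer.write(b'i love python\n')
    await writer.drain()
    reply = await reader.readline()
    writer.close()
    server.close()
    await server.wait_closed()
    return reply

reply = asyncio.run(demo())
print(reply)                              # b'I LOVE PYTHON\n'
```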
That handwritten event-loop code looks very simple, and the real versions look much the same; the Tornado code looks almost exactly like this, and the heart of almost every event loop is code almost identical to this. The part that is different, where I have been simplistic here, is the handling of all the different types of futures and the error-handling callbacks, and for that asyncio uses concurrent.futures. So asyncio is basically all of this code on steroids: dozens and dozens of non-blocking things plus concurrent futures. So, to finish: thank you, thank you very much for inviting me to your conference.
