
Python Tutorial: CSV Module - How to Read, Parse, and Write CSV Files

May 31, 2021
Hello, how is everyone doing? In this video, let's see how to read, parse, and write CSV files. If you don't know what CSV files are, CSV stands for comma-separated values. Basically, CSV files let us put some data into a plain text file and use some type of delimiter, usually a comma, to separate the different fields. I have a sample CSV file here that we can work with, and if we look at it we can see how these files are generally structured. It may look like a mess, but it's not really intended to be read directly.
This is just how the data is stored, and then we can use our programs to parse out the information that we want. We can see that the top line here has our fields, and the fields in this file are first name, last name, and email, which tells us what information we should expect to see on each line. So if I go to the next line, we can see that John is the first name, then a comma, then the last name, then another comma, and then this long email here is the email. That's why they're called comma-separated values, and whatever separates two values is called a delimiter. The comma is a common delimiter, but you can use almost anything, so sometimes you'll see files with values delimited by tabs or dashes or things like that.

But they're all referred to as CSV files. Now let's see what it looks like to read, parse, and write CSV files. I have a file here called parse_csv.py, and inside this file we're going to import csv. Now, you may have looked at that data and wondered why we don't just use the string split method on every line of the file to parse the data. You could do that, but the csv module makes parsing these files much easier. For example, if someone put a comma in their name for some reason, we wouldn't want to split on that. On top of that, the csv module handles newlines and all that stuff, so it's a lot easier to parse the information we want without writing something complicated from scratch.
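To make the splitting problem concrete, here is a minimal sketch comparing str.split with csv.reader on a single row. The row itself is a made-up example (it is not from the sample file), chosen so that the name field contains a comma and is therefore quoted.

```python
import csv

# Hypothetical row: the name field contains a comma, so it is wrapped in quotes.
raw_line = '"Doe, John",john.doe@example.com'

# Naive splitting cuts the quoted name into two pieces.
naive_fields = raw_line.split(",")
print(naive_fields)   # ['"Doe', ' John"', 'john.doe@example.com']

# csv.reader accepts any iterable of lines and understands the quoting.
parsed_fields = next(csv.reader([raw_line]))
print(parsed_fields)  # ['Doe, John', 'john.doe@example.com']
```

The split version returns three fragments with stray quote characters, while the csv module returns the two fields that were actually intended.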
Just like with any other file, we'll use a context manager to open the CSV file for reading. We'll say with open and pass the name of that file we were just looking at, names.csv, which is in the same directory as the script I'm currently writing. We want to read this file, so we'll put an 'r' in there as the second argument, and for the variable name we'll say as csv_file. To read this file, we can say csv_reader. That can be any variable name you like, but that's the one I like. We set it equal to csv.reader and pass csv_file to that reader function. In the background, the reader uses something called a dialect, which has some preset parameters for what it expects the format of our CSV file to be. By default it expects values to be separated by commas, plus a few other things we'll look at in a moment, but since our CSV file is pretty simple, we don't need to pass any additional arguments right now.
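As a small aside, the preset parameters can be inspected directly. This sketch assumes the default dialect is the one the csv module registers under the name "excel", and simply prints a few of its settings.

```python
import csv

# Look up the dialect that csv.reader and csv.writer use by default.
default_dialect = csv.get_dialect("excel")

print(repr(default_dialect.delimiter))       # ','
print(repr(default_dialect.quotechar))       # '"'
print(repr(default_dialect.lineterminator))  # '\r\n'
```

Any of these can be overridden per reader or writer with a keyword argument, which is how the delimiter gets changed later on.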
So the csv_reader variable we just created is something we'll need to iterate over. For example, if we print it as is, print(csv_reader), and run that, we can see that right now it's just an object in memory. So we need to loop over all of the lines in the reader and see what we get: we can say for line in csv_reader and then print each line. Let's run that so we can see it better. Each line we're printing is a list of all the values, and the first value in the list is the first name.
The second value in the list is the last name, and the email is the third value. If we scroll to the top, we can see that our first line is the field names, which tells us that the first value is the first name, the second value is the last name, and the third value is the email. So if we access these by index, the first name would be index 0, the last name index 1, and the email index 2. If we just wanted to print all the emails, then on this line here we could print index 2 of each line. If we run that, we can see that we now get all the emails printed out.

Now, if you don't want that first line of field names and just want the values, we can skip that first line. If anyone has seen my video on generators, you'll know we can step over a value of an iterable by calling next. Calling next returns the next value, which we could capture in a variable if we wanted to. But if we just want to step over the value, we can go up here, before our loop, and say next(csv_reader), and that will step over that first line. Then, when we loop, it should start at the second line, which is the first person in the list. So if we run this again and scroll to the top, we can see that John Doe is now the first value. Okay, now let's see how we can write to a CSV file.
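Before moving on to writing, here is a minimal sketch of the reading steps so far. To keep it self-contained, an io.StringIO object stands in for the opened names.csv, and the rows and the field names first_name, last_name, and email are illustrative assumptions rather than the actual sample data.

```python
import csv
import io

# io.StringIO stands in for the opened names.csv; the rows are hypothetical.
csv_file = io.StringIO(
    "first_name,last_name,email\n"
    "John,Doe,john-doe@example.com\n"
    "Mary,Smith-Robinson,mary.sr@example.com\n"
)

# First pass: every line, header included, comes back as a list of strings.
all_rows = []
for line in csv.reader(csv_file):
    all_rows.append(line)
print(all_rows[0])  # ['first_name', 'last_name', 'email']

# Second pass: rewind, step over the header with next(), then use index 2.
csv_file.seek(0)
csv_reader = csv.reader(csv_file)
next(csv_reader)  # skips the field-name line; the return value is ignored here

emails = []
for line in csv_reader:
    emails.append(line[2])  # index 2 is the email column
print(emails)  # ['john-doe@example.com', 'mary.sr@example.com']
```

With a real file, the StringIO object would be replaced by the csv_file variable from the with open block.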
We can do this with any list of values, but since we already have lists of values here from our original CSV file, let's go ahead and use those. Let's say we wanted to save these same values to a new CSV file, but using dashes instead of commas for the delimiter. A dash probably isn't a great delimiter, but I want to show you something that happens when we do this. First, we'll want to write the field-name headers to the new file, so let's remove the next statement where we were skipping them. Now, above our loop, we're going to open a new file for writing: we'll say with open, and we'll call this file new_names.csv.
We want to open this for writing, so the second argument is 'w'. Then we'll say as, and we'll call this variable new_file. To write to this file, we're going to use a CSV writer, so we can say csv_writer. Again, that can be any variable name, but that one makes sense to me. We'll set it equal to csv.writer and pass new_file to that writer function. If we left it like that, it would just write the same comma-separated file that we currently have, but if we want to use dashes as our delimiter, we need to pass that in as an argument.
So it'll be the second argument to that writer function: we can say delimiter equals, and then we'll use a dash. Now we want to write every line of our original CSV file to this new file, so let's indent our for loop so that it's inside the context manager for this new file. For each line in csv_reader, which is our original file, we want to write it to the new file. We can do that by saying csv_writer.writerow, and the row we want to write is that line from the original reader.
Real quick, before we run this, let's go over it one more time. We open the original file for reading, create this csv_reader variable, and use the csv.reader function to read that original CSV file. Then we open a new file for writing called new_names.csv, create a csv_writer variable, and use the csv module's writer function to create a writer on that new file with a dash as the delimiter. Then, for each line of the original CSV data, we write that line out to the new file.
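The recap above can be sketched as follows. So that the example runs on its own, it first writes a small stand-in names.csv into a temporary directory; the rows and field names are assumptions. It also passes newline="" to open, which the csv documentation recommends, although the walkthrough itself only mentions the mode argument.

```python
import csv
import os
import tempfile

work_dir = tempfile.mkdtemp()
src_path = os.path.join(work_dir, "names.csv")
dst_path = os.path.join(work_dir, "new_names.csv")

# Hypothetical sample data standing in for the original file.
with open(src_path, "w", newline="") as sample_file:
    sample_file.write(
        "first_name,last_name,email\n"
        "John,Doe,john-doe@example.com\n"
        "Mary,Smith-Robinson,mary.sr@example.com\n"
    )

with open(src_path, "r", newline="") as csv_file:
    csv_reader = csv.reader(csv_file)

    # The new file is opened inside the first context manager,
    # so the reader is still usable while we write.
    with open(dst_path, "w", newline="") as new_file:
        csv_writer = csv.writer(new_file, delimiter="-")

        for line in csv_reader:
            csv_writer.writerow(line)  # header line is copied too
```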
So if we run this, we don't get any output down here at the bottom, but it should have created this new file called new_names.csv, so let me go ahead and open that. We can see that in this new file, dashes are now used instead of commas for the delimiter. That makes it quite hard to read, but I wanted to show you what it did with two of our values here. In our first row, the email actually contained a dash, and we can see that our CSV writer knew to put quotes around the email, since it contains that delimiter.
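A quick way to see this quoting is to write a couple of rows into an in-memory buffer and look at the raw text. This is only a sketch: the rows are invented, and io.StringIO is used in place of a real file.

```python
import csv
import io

buffer = io.StringIO(newline="")
csv_writer = csv.writer(buffer, delimiter="-")

# Hypothetical rows: one with a dash in the email, one with a dash in the last name.
csv_writer.writerow(["John", "Doe", "john-doe@example.com"])
csv_writer.writerow(["Mary", "Smith-Robinson", "mary@example.com"])

written_lines = buffer.getvalue().splitlines()
print(written_lines[0])  # John-Doe-"john-doe@example.com"
print(written_lines[1])  # Mary-"Smith-Robinson"-mary@example.com

# Reading back with the same delimiter recovers the original values.
buffer.seek(0)
round_trip = list(csv.reader(buffer, delimiter="-"))
print(round_trip[0])     # ['John', 'Doe', 'john-doe@example.com']
```

Only the values that contain the delimiter get quoted; everything else is written as is.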
That's so that when the CSV is read back in, the reader will know that the email is one complete value and shouldn't be split on the dash within the email itself. In the same way, we can see that our second person has a hyphenated last name, Smith-Robinson, so again the CSV writer knew to put quotes around the last name so that a reader can tell the difference between the delimiters and values that simply contain dashes. Now that we've seen how that works, let's change the delimiter for the new file to something a little more common. Aside from commas, tabs are very common delimiters.
So let's use a tab instead. In Python, a tab can be represented with a backslash t, '\t'. If we run this again and then reopen the new_names.csv file, we can see that all of the values are now separated by tabs, which is much easier to read. Just like we passed the delimiter to our writer, if we wanted to read in that tab-delimited file, we could also pass the delimiter argument to the reader. Real quick, let me show you what it would look like if we tried to read a CSV file with the wrong delimiter.
Let me copy part of this here where we're reading in the file, and I'm going to comment out everything else for now. Instead of reading the original names.csv file, we're going to read in the new tab-delimited file we just created, which is new_names.csv. Now let's say we forgot to specify the tab delimiter and tried to read this as is. Let's print out the lines we get from this reader: we'll say for line in csv_reader and print each line. We can see that each line only has one value. It didn't split the values on the tab because it was expecting commas.
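Here is a small sketch of this mismatch, together with the fix of passing the delimiter explicitly. The tab-delimited text is a hypothetical stand-in for the contents of new_names.csv.

```python
import csv
import io

# Hypothetical tab-delimited content standing in for new_names.csv.
tab_text = (
    "first_name\tlast_name\temail\n"
    "John\tDoe\tjohn-doe@example.com\n"
)

# Default dialect expects commas, so each line comes back as one single value.
wrong = list(csv.reader(io.StringIO(tab_text)))
print(wrong[1])  # ['John\tDoe\tjohn-doe@example.com']

# Passing delimiter='\t' splits the fields as intended.
right = list(csv.reader(io.StringIO(tab_text), delimiter="\t"))
print(right[1])  # ['John', 'Doe', 'john-doe@example.com']
```

Note that the wrong delimiter does not raise an error; the reader just returns one-element lists, so this kind of mistake is easy to miss.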
So instead, we need to explicitly pass in that we want the delimiter to be a tab. I'm going to pass that to the reader function here and say delimiter equals '\t' for the tab. If I run it again, we can see that we now get the correct parsing. Okay, so now I'm going to delete these lines and uncomment what we had before. The way we've been working with CSV files so far, using the reader and the writer, is probably the most common way to work with CSV data, since those are the first things that show up in the Python documentation.
But my preferred method is to work with CSV data using the dictionary reader and dictionary writer, DictReader and DictWriter. So let's take a look at those, and I'll explain why I prefer them to the regular reader and writer. Okay, first let's look at the dictionary reader. To use it, we just replace the normal reader function here with DictReader. Now let's print out the lines we get with this: I'll say for line in csv_reader and simply print each line. At first glance this may look a little more complicated.
Each of the values is now an ordered dictionary, and if we scroll to the top, we can see that the first line no longer contains the field names; it starts right away with the first person. The reason is that the field names are now the keys for each of these values. I like this because it makes it much easier to parse out the information we want. For example, remember that when we used the normal reader and wanted to print the email address, we printed index 2 of our line. To anyone reading your code, it isn't obvious what index 2 is, so they'd have to go into the CSV file to find that out. But now that we have those fields as keys of our dictionary, we can get the email just by saying we want the email key of that line, line['email']. So we just access that key, and if we run it again, we can see that we get all the emails. Okay, now let's see how to use the dictionary writer. I'm going to delete this loop and uncomment the rest of the code here.
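For reference, the DictReader usage just described looks roughly like this. As before, io.StringIO stands in for the opened file, and the rows and the key names first_name, last_name, and email are assumptions.

```python
import csv
import io

# io.StringIO stands in for the opened names.csv; the rows are hypothetical.
csv_file = io.StringIO(
    "first_name,last_name,email\n"
    "John,Doe,john-doe@example.com\n"
    "Mary,Smith-Robinson,mary.sr@example.com\n"
)

# DictReader takes its keys from the first line, so no header row is yielded.
csv_reader = csv.DictReader(csv_file)

emails = []
for line in csv_reader:
    emails.append(line["email"])  # access by field name instead of index 2
print(emails)  # ['john-doe@example.com', 'mary.sr@example.com']
```

On Python 3.8 and later, each row comes back as a regular dict rather than an OrderedDict, but accessing values by key works the same way.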
We didn't really need to change anything else for the dictionary reader, but with the dictionary writer we actually have to provide the field names for our file. So, one line above our writer here, I'm going to create a list of the field names, called fieldnames. Then, instead of the writer function, we'll use DictWriter. One thing we need to change here is that, after the file we're writing to, we also need to pass in those field names, so I'll say fieldnames equals fieldnames. Okay, now we're ready to write the data. With the dictionary writer, you have the option of whether or not to write the headers, that is, the field names, as the first row.
If we want those headers, which I do most of the time, we can say csv_writer.writeheader(), and that will write the field names as the first line. Once the header is written, we can loop over the lines of the original file just like we did before and say csv_writer.writerow and pass in that line, so all of that stays the same. If we run this and then look at our new_names.csv file, we can see that it still works. Like I said before, the reason I like working with the dictionary reader and writer is that it's more obvious what you're doing.
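A minimal sketch of the DictReader and DictWriter pair described above. In-memory buffers stand in for names.csv and new_names.csv, and the rows and field names are assumptions made for illustration.

```python
import csv
import io

# Hypothetical source data standing in for names.csv.
csv_file = io.StringIO(
    "first_name,last_name,email\n"
    "John,Doe,john-doe@example.com\n"
    "Mary,Smith-Robinson,mary.sr@example.com\n"
)
csv_reader = csv.DictReader(csv_file)

# In-memory stand-in for new_names.csv.
new_file = io.StringIO(newline="")
fieldnames = ["first_name", "last_name", "email"]
csv_writer = csv.DictWriter(new_file, fieldnames=fieldnames, delimiter="\t")

csv_writer.writeheader()       # optional: writes the field names as the first row
for line in csv_reader:
    csv_writer.writerow(line)  # each line is a dict keyed by field name

output_lines = new_file.getvalue().splitlines()
print(output_lines[0])  # tab-separated header
```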
So let's say, for example, that in our new CSV file I only wanted the first and last names and wanted to leave out the email. With the regular reader and writer, we'd be modifying the indexes of those lists, and as I mentioned before, when you look at an index it isn't obvious what value it's supposed to hold. But with our dictionary writer, we can just remove email from the field names up here, and before we write each line inside our loop, we can remove the email key and value. One way to do that is to delete it, so we can say del line['email']. Now when it writes that row, it will only write the first name and last name, and the email will no longer exist.
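Dropping the email column might look like the sketch below, again with in-memory buffers and assumed rows and field names. The del is needed as well as the shorter fieldnames list, because with its default settings DictWriter raises a ValueError if a row contains a key that is not in fieldnames.

```python
import csv
import io

# Hypothetical source data standing in for names.csv.
csv_file = io.StringIO(
    "first_name,last_name,email\n"
    "John,Doe,john-doe@example.com\n"
    "Mary,Smith-Robinson,mary.sr@example.com\n"
)
csv_reader = csv.DictReader(csv_file)

new_file = io.StringIO(newline="")
fieldnames = ["first_name", "last_name"]  # email removed from the field names
csv_writer = csv.DictWriter(new_file, fieldnames=fieldnames, delimiter="\t")

csv_writer.writeheader()
for line in csv_reader:
    del line["email"]          # drop the key so the row matches fieldnames
    csv_writer.writerow(line)

output_lines = new_file.getvalue().splitlines()
print(output_lines)
```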
So if we save that and run it, and then open the new_names.csv file, we can see that we now have a tab-delimited file of just first names and last names, and the email is no longer there. Now, there are several ways we could have written this row. We could delete the email key from the line, like we did here, or we could create a new dictionary with just the first and last name keys and pass that to the writerow method. Use whichever way works best for you; in this case, I think it was easier to just delete the email key. Okay, so I think that's going to do it for this video.
I hope you now have a pretty clear idea of how to read, parse, and write CSV files. If anyone has any questions about what we covered in this video, feel free to ask in the comments section below and I'll do my best to answer them. If you enjoy these tutorials and would like to support them, there are several ways to do that. The easiest way is to simply like the video and give it a thumbs up. It's also a huge help to share these videos with anyone you think would find them useful. And if you have the means, you can contribute through Patreon; there's a link to that page in the description section below. Be sure to subscribe for future videos, and thank you all for watching.
