Python Tutorial: Real World Example - Parsing Names From a CSV to an HTML List
Hey there, how's it going everybody? In this video, we're going to be taking a look at a real-world problem that I ran into, and we'll walk through how to write a Python script to solve it. I've done videos like this before and everyone seemed to find them useful. The difference between these videos and my normal videos is that I'm not going to go into as much step-by-step detail; I'm just going to walk through how I came up with a solution, and you can follow along.

So here's what I want the script to do. Some of you may not know this, but for anyone who contributes to this channel through Patreon, I list everyone on my website's contributors page as a small way of saying thanks. The problem that I'm running into, and it's a great problem to have, is that the contributors are getting up into numbers where it's hard to keep track of who I've added to the site and who I haven't. So I want to automate this process with
Python so that I don't ever miss anyone. Luckily, Patreon provides a downloadable CSV file of all the contributors, which will make it easy to automate this process. If you don't know what CSV files are, CSV stands for comma-separated values. Basically, CSV files allow us to put data into a plain text file and use some type of delimiter, which is usually a comma, to separate the different fields. So in this video we'll be learning how to use the csv module to parse the CSV file, count the contributors, and then put their names into an HTML unordered list that I can drop into my website. So let's go ahead and get started.

Now, first of all, I don't want to expose anyone's information here, so the CSV file I'm going to be using for this video takes out all of everyone's personal information, and it just has fake
names instead of the real names. But other than the names being fake, this is almost identical to the file that I downloaded from Patreon. So I'm going to open up this file; it's called patron.csv. When I first opened up this file, a couple of things popped out at me. First of all, our first row is our headers, and we can see that the information in this file is going to be first name, last name, email, pledge, lifetime, status, country, and start. Now I
really don't know what all those fields mean, but I'm only concerned with the first name and last name, so that's okay. Also, I noticed that there are a couple of lines here after the header that aren't actual data. It's just a line explaining that the people below this line are the ones who've said that they don't mind being listed on the website as a contributor, and then it looks like the actual people start on line five here. Now, on Patreon you can also opt out of rewards, so there's likely a line in the CSV file that is a cutoff for people who said that they only want to contribute but don't want to be listed on the website. And actually, if we look down here at line 35, we can see that cutoff point, where it says that the people listed below this point do not want the reward and don't want to be
listed on the website. Okay, so now we have a basic idea of the data that we want to capture, so let's go ahead and start coding this.

In the same directory I have a blank file here called parse_csv.py, and I'm going to open that up. The first thing I'm going to do is import the csv module. You may have looked at that CSV file and thought, hey, that doesn't look difficult to parse, so why not just use the split method on each line of the file to get that information? It's true that you could do that, but the csv module just makes parsing these files so much easier. For example, if someone put a comma in their name for some reason, then we wouldn't want to split on that. The csv module will also handle newlines and everything like that, and it just takes all the guesswork out of working with things like this. So we're going to use the csv module.

Okay, now I know that my end goal is to output an HTML unordered
list, so I'm going to create an html_output variable and set it to an empty string for now, and we'll populate that as we go. I also know that I want to capture all the names of everyone that I want to add to that output, so I'm going to create an empty list called names. Okay, so now let's
open up our CSV file just like we would open up any other file. We're going to use a context manager here, so we're going to say with open, and this is called patron.csv, and we want to read this file, so we're going to pass in an 'r' there, and we'll just call this data_file.

Now I'm going to show you two different ways to parse the CSV file. First I'll show you the most common, and then I'll show you my preferred method. The first way we'll do this is with a CSV reader, so I will say csv_data equals csv.reader, and now we're going to pass in that data_file. And actually, let me make this text a little bit bigger here just so everyone can see as we're going along. I think that's better.

That reader function should have parsed the CSV file and put the data into our csv_data variable, so let's print out what we have so far to make sure that it looks right. I'm going to come down a couple of lines here and just print out csv_data and run that. Okay, so right now we just get this csv reader object. Now, you may have been expecting all of our CSV data. The data is there, but this object is an iterable and behaves like a generator, which means that we have to loop over it to get each line. You can either do that line by line, or you can just convert it to a
list and get all that data at once. If we converted this to a list and printed it out, then we can see that it prints out a lot of information here, and it's not the easiest to read, but it looks like our data, so it's a good start. Now let's actually print it out line by line so we can see this a little bit better. To do this we can say for line in csv_data, and now we're just going to print out each line and run that. When we run that, we can see that it is a lot easier to read. If we scroll up to the top and look at the first two lines, we can see that the first line has the headers, and we
really don't need those other than to know at which index each field is located: the first name is at index 0 and the last name is at index 1. It looks like the second line is the line telling us that these are the names we want to put on the website, and then the third line is the first person, with the name John Doe. So we really don't need these first two lines here; we just want to get the names of the people. If anyone has seen my video on generators, we can actually step over values in an iterable by calling next. So let's call next on our csv_data twice before the loop that prints each line. We'll just say next(csv_data), then we will copy that and paste it in again. We don't need to capture the output from these in any variables; we just want to throw them away. Now if we run this and scroll back up to the top, we can see that our first line is the first person, John Doe. Okay, so great. Now let's remove our print statement. Well, actually, before we do that, it's not obvious why
we're running next twice on the csv_data here. It's important to comment non-obvious stuff like this while we're going along, not only for other people but for yourself also, because you can come back to this code in a few weeks and have no idea why we ran these two lines here. So let's just go ahead and make a comment that says we don't want headers or first line of bad data.

Okay, so within our loop here we're looping over every person in the CSV file. Remember that the first name is index 0 and the last name is index 1, so let's go ahead and add each name to our names list that we created at the top. To do this we'll say names.append. We want to append a string of the first name, a space, and the last name, and to do this I'm just going to use an f-string with these braces for placeholders. We'll say line and then index 0 for the first name, then a space, then another pair of braces for the placeholder, and then index 1 for the last name. Like I've said, if you've never seen a string with an f in the front like this, this is called an f-string, and they're new to
Python 3.6, so if you're not using 3.6 or later, then this isn't going to work for you and you'll have to use a regular string format. I'm really liking these f-strings so far; basically it's a much simpler way of doing string formatting. If you'd like to see more about them, you can watch my video on strings, where I go more in depth into all the different ways to format strings. But basically all we're saying here is that we want a string with the value at index 0 of the line, which is the first name, then a space, and then the value at index 1 of the line, which is the last name.

Okay, now that we've appended those, let's print out all the
names that were appended to that list. Since that's a global variable, we can print that outside of our context manager, all the way down here at the bottom. So we're going to go down to the bottom and we'll say for name in
names, and then we'll just print out the name. Okay, so this is looking good. It looks like we have the first names and the last names. Now, if we scroll through our names here, we can see that one kind of sticks out, and this is that no reward value. If you remember, there are people who opted out and didn't want to be included, so every name after this no reward value here shouldn't be added to our list. Well, if we look back at our original CSV file here, we can see that this no reward line has a comma after no reward, so this should get parsed as a first name. So let's add in a check for a first name of no reward, and then we'll break out of our loop as soon as we see that value. Within our loop here, we will say if index 0, which should be the first name, is equal to no reward, then we are just going to break out of that loop. Now, before I run this, we should note that the name before no reward
over here in our file is Maggie Jefferson, so when we rerun this, this will hopefully be the last name in our names list. So go ahead and rerun this, and when we rerun that, we can see that this fake name down here of Maggie Jefferson is the last name in our list, so that works. Okay, so now that we've tested to make sure that our names are right, we can go ahead and just remove that list where we're printing, or rather that loop where we're printing out all the
names. Now we're pretty close to being done here, so the hard part is over. We just need to get these names into an HTML unordered list so that I can drop them into the site. First, on the site I'd like to list how many supporters there are, so we'll first add that to our html_output with paragraph tags, and to count how many there are, we can just use the length of our
list. So I'm going to say html_output plus equals, because we want to append to this, and then I'm just going to use another format string here to put in these values. Okay, so like I said, I used another f-string here, and we're using these HTML paragraph tags that we're adding in. The only Python data that I'm adding in is the length of the list, so when this gets printed out, it should substitute the actual number there. Just to make sure, let's go ahead and print out this html_output, so print html_output, and I'll run that. Okay, so apparently there are 30 people in that list. So now let's
create our HTML unordered list with each name. We'll add an unordered list to our html_output, and I'll do this above where we're printing that output. I'll say html_output plus equals and then an unordered list. Now, I'm going to put a new line there first; an unordered list is this ul tag. That new line that I added in will just make it a lot easier to read when we actually print this out. So now let's loop through all of our
names and add each one to an HTML list item. If you aren't familiar with HTML, then don't worry about it too much; it's more about the process of automating this that we're after here. So right here we'll say for name in names, and now we want to add each one to that html_output, so we'll say html_output plus equals. Okay, and this is another f-string here. First we're putting in a new line with a backslash n, and then we're putting in a tab with the backslash t, and then the list item is this li tag here, and then we're putting the name; this is the Python variable that we're using, and it's going to substitute this out. And then, in HTML, these forward slashes close out an HTML element, so we're closing out that
list item. Okay, and now after we have all those list items there, outside our loop, we're going to have to close off the entire list altogether. So we'll say html_output plus equals, and we'll close off that list with a forward slash ul, and let's not forget to put in a new line there at the beginning just to clean up how this prints out. Okay, so now let's print this out and see what everything looks like. I will erase our output here. Okay, so this is looking good. At this point I think that this is the exact output that we wanted, so we could be done, but I wanted to show you one more thing.

I told you that I'd show you one more way to parse the CSV file that I prefer over the reader, and what I prefer is the dictionary reader. We can use this by saying DictReader. The difference between the reader and the dictionary reader is that the dictionary reader turns each line into a dictionary instead of a
list, and the dictionary has each field as a key and then the data as the values. So let me just loop through and print out these values so you can see what this looks like. First let me just comment out these lines where we're doing all of our looping and everything, and I'm also going to comment out the HTML output for now, and we're going to see what the output of this dictionary reader looks like. To do this we'll say for item in csv_data, and then we will just print out each item. So let's run that. Okay, now at first glance this looks a little messy, especially since my text is so large here, but each of these lines is an ordered dictionary. The first line with the field names is no longer here; those are now being used as keys for the dictionaries. So
the first throwaway value is still here, which is the description of the reward line, so this first ordered dictionary here is that first line. If we look at our second item, this is our first person, because we can see this is that John Doe person. So now, instead of accessing index 0 for the first name and index 1 for the last name, we can access those directly through the first name and last name keys, and I think that is a lot more readable for anyone looking at your code.

Now, to get this working again, we're going to get rid of this loop where we just printed everything out, and we'll uncomment all of the logic here. Now that the headers are no longer included in the output, we only want to skip over that one first value, so we only need one call to next, and we can fix our comment here and just say we don't want first line of bad data. And instead of using an index where we access the items, we can now use the keys of first name and last name. So now we're going to say if the first name is equal to no reward, and then we want to append the first name and the last name. Okay, and now let's go down here and uncomment this HTML output and see if this works. I'll go ahead and rerun this, and if we scroll up we have Maggie Jefferson here at the bottom, and if we scroll up further we can see that there are still 30 contributors and John Doe is our first one, so that seems to be correct. Okay, so it looks like our results are good. It took a little
while to write this script, but now it's going to save a ton of time by automating this in the future, and it will also prevent me from making any mistakes. One reason that I like to show you all these quick scripts that automate a repetitive task is just to show how you can save a lot of time by writing a very simple script. I mean, this script here is only 26 lines, and it's going to save us a lot of time. I also want to show that you don't need to overthink these one-off scripts too much. I could probably come into the script and add error checking and also some kind of object-oriented approach to this, but for what I want to use the script for, I really don't need it to be overly complicated. So if you have some problems that you think you can automate, then just give it a shot, and don't think too much about how performant or clean everything is. A great way to learn is just by doing and experimenting.

Now, if any of you are interested in a more detailed look at parsing CSV files and writing CSV files, I am putting together a video specifically on reading and writing CSV files that I'm going to record very soon, so be on the lookout for that.

Okay, so I think that is going to do it for this video. I hope that you all found it useful, and if anyone has any questions about what we covered in this video, then feel free to ask in the comment section below and I'll do my best to answer those. If you enjoyed these videos and would like to support them, there are several ways you can do that. The easiest way is to simply like the video and give it a thumbs up. It's also a huge help to share these videos with anyone who you think would find them useful. And if you have the means, you can contribute through Patreon, and there's a link to that page in the description section below. Be sure to subscribe for future videos, and thank you all for watching.
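
For anyone following along without the screen, here is a minimal sketch of the kind of script the video walks through, showing both the csv.reader and the csv.DictReader approaches. It is not the exact code from the video. The sample data, the header spellings (FirstName, LastName), the exact "No Reward" marker text, the wording of the paragraph tag, and the function names are all assumptions made for illustration.

```python
import csv
import io

# Hypothetical stand-in for the Patreon export. The header spellings, the
# description row, the marker text, and every name here are assumptions.
SAMPLE_CSV = """\
FirstName,LastName,Email,Pledge,Lifetime,Status,Country,Start
Reward: these patrons are happy to be listed,,,,,,,
John,Doe,john@example.com,5.00,20.00,Ok,,2017-05-01
Maggie,Jefferson,maggie@example.com,5.00,10.00,Ok,,2017-06-01
No Reward,,,,,,,
Jane,Smith,jane@example.com,1.00,3.00,Ok,,2017-07-01
"""

MARKER = "No Reward"  # assumed text of the cutoff row


def names_with_reader(data_file):
    """Collect public contributor names by index using csv.reader."""
    csv_data = csv.reader(data_file)
    # We don't want the headers or the first line of bad data.
    next(csv_data)
    next(csv_data)
    names = []
    for line in csv_data:
        # Everyone after the marker row opted out of being listed.
        if line[0] == MARKER:
            break
        names.append(f"{line[0]} {line[1]}")
    return names


def names_with_dict_reader(data_file):
    """Collect the same names by key using csv.DictReader."""
    csv_data = csv.DictReader(data_file)
    # The headers become keys, so only the description row is skipped.
    next(csv_data)
    names = []
    for line in csv_data:
        if line["FirstName"] == MARKER:
            break
        names.append(f"{line['FirstName']} {line['LastName']}")
    return names


def build_html(names):
    """Build a count paragraph followed by an unordered list of names."""
    html_output = f"<p>There are currently {len(names)} public contributors.</p>"
    html_output += "\n<ul>"
    for name in names:
        html_output += f"\n\t<li>{name}</li>"
    html_output += "\n</ul>"
    return html_output


if __name__ == "__main__":
    # With a real export you would open the file instead of using StringIO.
    print(build_html(names_with_dict_reader(io.StringIO(SAMPLE_CSV))))
```

Splitting the work into small functions that accept any file-like object is a choice made here so the sketch can run against in-memory sample data; the video describes a flat script that opens the file directly. On Python 3.8 and later, DictReader rows are plain dict objects rather than ordered dictionaries, but the key-based access works the same way.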