How to Edit Excel Sheets with Python using openpyxl

How to Edit Excel Sheets with Python using openpyxl

Let's say your boss gave you the task to compare names of employees in different regions on separate excel sheets and tells you to state on the first excel sheet on a separate column whether or not the employee is working at another branch by the end of the day. Easy right? It is definitely, but he actually gave you 750 names to compare and return by the end of the week, doesn't sound all that easy anymore right? And of course, we're humans and are prone to mistakes. A mistake might happen if we perform it manually and do not automate it.

In this article, you'll learn how to solve this excel sheet task of 1 week, in a few minutes with Python's openpyxl library (a beautiful lifesaver 😍).

πŸ“š Prerequisites

  • Python and PIP installed

  • A basic understanding of python and of programming

If you don't have Python and PIP installed use this link and this link respectively for them.

πŸ›  Setup

Step 1: Run cmd as an administrator, and enter

pip install openpyxl

Command prompt image showing installed openpyxl library

Once you're done, the command prompt should look like the one above.

Step 2: Open a python editor of your choice and create a new python file in any directory and type import openpyxl to import the package. I'll be using VSCode for this and naming my file automateExcel.py.

python file with openpyxl importation

Now that you have everything set up, time to delve into the meat of the matter.

✍🏾 Using Openpyxl

Now, to complete the given task above, we'll be using two similar examples and we'll edit those documents, then you can use that same concept to solve the task above and any other occurrence. Here, we have the two excel files (located in our current directory) that we'll be comparing with more than 100 names in each of them and with 'company-in-nigeria' and 'company-in-rwanda' as their respective names.

company in Nigeria excel sheet

company in Rwanda excel sheet

We want to compare the two sheets, stating on the 'Present in Foreign Department' column whether or not the employee name exists elsewhere.

To do that, let's assign the two file paths to two separate variables.

Copy and paste this code to your editor.

firstFile = r"company-in-nigeria.xlsx"
secondFile = r"company-in-rwanda.xlsx"

The r in front of the windows file path denotes a raw un-escaped string, without it, the string would not be read and would produce an error. Check here for more info.

Next, use the load_workbook() method on openpyxl to open our excel sheets, storing the return value for both files in two other variables i.e

wbObj1 = openpyxl.load_workbook(firstFile)    
wbObj2 = openpyxl.load_workbook(secondFile)

Then, we read the available sheets using the worksheet list from the objects, we opened above.

wbk1 = wbObj1.worksheets[0]
wbk2 = wbObj2.worksheets[0]

Since there is only one sheet available, we obtain it via indexing as shown above, and since it's the first sheet, our index is 0. We would need this to be able to edit later on.

Now, let's create a names list to contain the names in our second file (company-in-rwanda) which we'll use to compare with the names in the first file. The code below appends all the names in the first column of the second file, starting from the second row (skipping the header) till the end.

names = []
for myRow in range(2, wbk2.max_row + 1):
    names.append(wbk2.cell(row=myRow, column=1).value.lower())
wbObj2.close

Use the max_row property on the current or selected worksheet to get the number of rows present, I'm adding + 1 because obtaining cell values is not index-based. To obtain a particular cell value from the sheet, we call the cell method from the worksheet, and in our case is the second excel sheet (wbk2) which accepts two parameters, row, which is the row number of the current sheet and in this case, starts counting from the second row (skipping the title), and column which is the column number of the current sheet, in this case, is the first column (Names). And we obtain the value of the row and column with the value() method. We make all names lowercase with the lower() method before we append it to perform a non-case sensitive search. And finally, we close the file with the workbook object (wbObj2) set since we won't be needing it anymore.

Next, we would need to compare names in both files and write the results of our comparison on the first file - company-in-nigeria.

for rowNum in range(2, wbk1.max_row + 1):
    if wbk1.cell(row=rowNum, column=1).value.lower() in names:
        wbk1.cell(row=rowNum, column=2).value = "Present"
    else:
        wbk1.cell(row=rowNum, column=2).value = "Not present"
wbObj1.save(firstFile)
wbObj1.close

Note: Close all opened excel files you'll be using or referencing as you won't be able to perform operations on them while it's opened.

Once we've made all our non-case sensitive comparisons and stated whether the names were present or not, we need to save our edited file with the save() method from the workbook object, passing in the file we edited, which in this case is firstFile as shown above (Don't forget to run your code!). Here is the result.

gif for result of code

Check out the full code here. Use this same method for your 750 names in different excel files from earlier and you're good to go!

πŸŽ‰Congratulations

Congratulations! πŸ‘πŸ‘ You just finished a task of ~ 1 week in less than 10 minutes! Go ahead and take the remaining time to watch anim...I mean be more productive!

πŸ“– Read more

Thank you for reading and have a nice day! 😊

Β