Merge multiple CSV files with various headers into a single file in Python

Mathi Maheswaran
Aug 31, 2022

When performing data analysis with CSV files, we often have to deal with large data sets that are spread across several files, each with its own set of headers. In these situations, it helps to combine all of the data into a single CSV file first. The examples below show how to merge CSV files using Python.

Prerequisites

Download and install the latest version of Python if it is not already on your computer.

Download Link: https://www.python.org/downloads/

After a successful installation, we must install the pandas package.

Install pandas

Open your terminal and install pandas using the command below.

pip install pandas

Now let us say you have several CSV files, each with a different set of headers. To merge them, the header names must first be consolidated into one common list.

Example

First CSV

Second CSV

Third CSV
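The screenshots of the three files are not reproduced here. As a rough stand-in, the sketch below writes three small CSV files whose headers only partly overlap. The file names, the extra `employees` column, and every value are made up for illustration; only the six company column names come from the header list used in this article.

```python
import csv
import os
import tempfile

# Hypothetical sample data: the values, the file names, and the extra
# "employees" column are invented purely for illustration.
SAMPLE_FILES = {
    "first.csv": [
        {"CompanyType": "Private", "companySize": "11-50", "domain": "example.com"},
    ],
    "second.csv": [
        {"domain": "example.org", "founded": "2015", "tagLine": "Hello", "employees": "40"},
    ],
    "third.csv": [
        {"companySize": "51-200", "website": "https://example.net", "founded": "2010"},
    ],
}


def write_sample_files(folder):
    """Write the sample CSV files into `folder` and return their paths."""
    paths = []
    for name, rows in SAMPLE_FILES.items():
        file_path = os.path.join(folder, name)
        with open(file_path, "w", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
            writer.writeheader()
            writer.writerows(rows)
        paths.append(file_path)
    return paths


# Write the files to a temporary folder and print each header row.
sample_folder = tempfile.mkdtemp()
for file_path in write_sample_files(sample_folder):
    with open(file_path, newline="", encoding="utf-8") as f:
        print(os.path.basename(file_path), "->", f.readline().strip())
```

Each file has a different header row, which is the situation the rest of this post deals with.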

For instance, suppose you need to keep only the header values listed below in the merged file.

CompanyType, companySize, domain, founded, tagLine, website

Import all required packages.

from csv import DictReader
import glob
import os
import pandas as pd

Get the CSV folder path and list the CSV files in it.

path = os.getcwd()
csv_files = glob.glob(os.path.join(path, "*.csv"))

Define the output CSV header.

modifiedHeaders = ['CompanyType', 'companySize', 'domain', 'founded', 'tagLine', 'website']

Read the CSV files from the specified folder and write them into a single file.
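The original code for this step is not shown here, so what follows is only a minimal sketch of how it might look with the packages imported above. The function name `merge_csv_files` and its parameters are my own assumptions, not part of any library. It reads each file row by row with `DictReader`, keeps only the wanted columns, and writes the result with pandas.

```python
from csv import DictReader

import pandas as pd


def merge_csv_files(csv_files, headers, output_path):
    """Merge `csv_files` into `output_path`, keeping only `headers`.

    Hypothetical helper written for this article, not a library function.
    """
    rows = []
    for csv_file in csv_files:
        with open(csv_file, newline="", encoding="utf-8") as f:
            for row in DictReader(f):
                # Keep the wanted columns only; a column that this file
                # does not have is filled with an empty string.
                rows.append({h: row.get(h, "") for h in headers})
    # Passing columns=headers fixes the column order of the output.
    merged = pd.DataFrame(rows, columns=headers)
    merged.to_csv(output_path, index=False)
    return merged
```

With the variables defined earlier, this could be called as `merge_csv_files(csv_files, modifiedHeaders, "formatted.csv")`. If the output file is written to the same folder as the inputs, a second run would pick it up through `*.csv`, so it is probably worth removing it from `csv_files` first.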

Complete Script

The final output is written to the formatted.csv file, which contains only the chosen headers.
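The full script is not reproduced here. One way the whole flow could be put together with pandas alone is sketched below, assuming all input files sit in one folder. The function name `merge_folder` and the `output_name` parameter are hypothetical; `pd.read_csv`, `DataFrame.reindex`, `pd.concat`, and `DataFrame.to_csv` are standard pandas calls.

```python
import glob
import os

import pandas as pd


def merge_folder(folder, headers, output_name="formatted.csv"):
    """Merge every CSV in `folder` into `output_name`, keeping `headers`.

    Hypothetical helper written for this article, not a library function.
    """
    output_path = os.path.join(folder, output_name)
    # Sort for a stable order and skip the output of an earlier run.
    csv_files = sorted(
        f for f in glob.glob(os.path.join(folder, "*.csv"))
        if os.path.abspath(f) != os.path.abspath(output_path)
    )
    frames = []
    for csv_file in csv_files:
        # dtype=str keeps values as text, so a year is not turned
        # into a float when other rows have no value for it.
        frame = pd.read_csv(csv_file, dtype=str)
        # reindex keeps only the wanted columns, in order, and adds
        # any column this file lacks as an empty one.
        frames.append(frame.reindex(columns=headers))
    if frames:
        merged = pd.concat(frames, ignore_index=True)
    else:
        # No input files: write a file that has only the header row.
        merged = pd.DataFrame(columns=headers)
    merged.to_csv(output_path, index=False)
    return merged
```

Calling `merge_folder(os.getcwd(), modifiedHeaders)` would then cover the current folder. Columns that are not in the header list are dropped, and missing ones are left empty in the output.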

Conclusion

I hope this is a great help to you and saves you a lot of time. Please follow my page and leave comments on my posts.

Thank you !!
