Merge multiple CSV files with various headers into a single file in Python
Data analysis often involves data spread across several CSV files, and those files do not always share the same headers. In these situations, we need to combine all of the data into a single CSV file. The examples below show how to merge CSV files using Python.
Prerequisites
Download and install the latest version of Python if it is not already on your computer.
Download Link: https://www.python.org/downloads/
After a successful installation, we must install the pandas package.
Install pandas
Open your terminal and install pandas using the command below.
pip install pandas
Now suppose you have several CSV files whose headers differ from one file to the next. The header names must then be consolidated into one common set.
Example
First CSV
Second CSV
Third CSV
For instance, suppose you need to extract the header values listed below.
CompanyType, companySize, domain, founded, tagLine, website
Import all required packages.
from csv import DictReader
import glob
import os
import pandas as pd
Get the path of the folder that contains the CSV files.
path = os.getcwd()
csv_files = glob.glob(os.path.join(path, "*.csv"))
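One thing to watch: if the merged file is written into the same folder, the `*.csv` pattern will also match it the next time the script runs. The sketch below filters it out. The helper name `find_input_csvs` is hypothetical, and it assumes the output file is named `formatted.csv`.

```python
import glob
import os


def find_input_csvs(folder, output_name="formatted.csv"):
    """Return the CSV files in `folder`, skipping the merged output file."""
    pattern = os.path.join(folder, "*.csv")
    # sorted() makes the merge order predictable across runs
    return sorted(
        f for f in glob.glob(pattern)
        if os.path.basename(f) != output_name
    )
```

Calling `find_input_csvs(os.getcwd())` returns the same kind of list as `csv_files` above, minus the output file.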
Define the output CSV header.
modifiedHeaders = ['CompanyType', 'companySize', 'domain', 'founded', 'tagLine', 'website']
Read the CSV files from the specified folder and write them into a single file.
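A minimal sketch of this step using `DictReader` together with `csv.DictWriter` from the standard library. The function name `merge_csv_files`, the UTF-8 encoding, and the choice to leave missing columns empty are assumptions made for illustration.

```python
import csv
from csv import DictReader


def merge_csv_files(csv_files, headers, output_path):
    """Write the chosen columns from every input file into one CSV."""
    with open(output_path, "w", newline="", encoding="utf-8") as out:
        # restval fills columns a file lacks; extrasaction drops unwanted ones
        writer = csv.DictWriter(
            out, fieldnames=headers, restval="", extrasaction="ignore"
        )
        writer.writeheader()
        for file_path in csv_files:
            with open(file_path, newline="", encoding="utf-8") as src:
                # Each row is a dict keyed by that file's own header names
                for row in DictReader(src):
                    writer.writerow(row)
    return output_path
```

Because each row is matched to the output columns by name, the input files can list their columns in any order.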
Complete Script
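One possible way to put the steps together with pandas is sketched below. It is an illustration under stated assumptions, not a definitive implementation: the function name `merge_with_pandas` is hypothetical, values are read as strings so they are written back unchanged, and the output file is skipped when the folder is scanned.

```python
import glob
import os

import pandas as pd

modifiedHeaders = ['CompanyType', 'companySize', 'domain', 'founded', 'tagLine', 'website']


def merge_with_pandas(folder, headers=modifiedHeaders, output_name="formatted.csv"):
    """Merge every CSV in `folder` into `output_name`, keeping only `headers`."""
    output_path = os.path.join(folder, output_name)
    # Skip the output file so a second run does not merge it into itself
    csv_files = sorted(
        f for f in glob.glob(os.path.join(folder, "*.csv"))
        if os.path.abspath(f) != os.path.abspath(output_path)
    )
    if not csv_files:
        return None  # nothing to merge

    # dtype=str keeps every value exactly as it is written in the file
    frames = [pd.read_csv(f, dtype=str) for f in csv_files]

    # concat builds the union of all columns; reindex then keeps only the
    # chosen headers, in order, leaving missing values empty
    merged = pd.concat(frames, ignore_index=True).reindex(columns=headers)
    merged.to_csv(output_path, index=False)
    return output_path
```

To run it against the current folder, as in the steps above, call `merge_with_pandas(os.getcwd())` at the bottom of the script.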
The final output, containing only the chosen headers, is written to the formatted.csv file.
Conclusion
I hope this helps and saves you a lot of time. Please follow my page and leave comments on my posts.
Thank you!