It is very common that many software applications today allow the reports be stored in Excel format, and Excel doesn’t support large volume of records. Also, the ability to do data analysis (data comparison, validation, and etc) is limited in Excel. In order to overcome these limitations, this article is going to explain how to import multiple Excel files into a single SQL database table and perform data analysis easily and quickly.
An example scenario:
A company wants to generate monthly payroll, using daily attendance files. Also, the company wants to generate monthly or yearly reports using daily report data.
SQL Server has an Import wizard, but it can import only a single file into a single SQL table. This article covers various steps involved in importing multiple files into a single SQL table.
Pre Requisites:
1.Filename should be in mm-dd-yyyy-xxx.xlsx format.
2.All files should be stored in same folder.
3.Number of columns should be same for all files.
4.Header must be same for all files (for example, January month has 31 days; so there should be 31 files, and Header or column names must be same for all 31 files)
Optional
1.Filename can be included in a separate column.
2.Serial number can be included.
Advantages:
- Accurate and reliable result.
- Less time consumption
- Customized time frame can be selected for the import.
Now, the actual steps for importing multiple files are explained below with necessary screenshots:
The following section lists all the DML and DDL statements used:
1.Creating staging and processing tables.
2.Creating a stored procedure.
3.Executing the stored procedure
4.Running the Query (1,2,3,4)
/*—————————————————————–
1a. Create a table for getting file names
—————————————————————–*/
if exists (SELECT * FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME = '[FileNames]') DROP TABLE [FileNames]; CREATE TABLE [dbo].[FileNames]( [ID] [int] IDENTITY(1,1) NOT NULL, [name] [varchar](200) NULL ) ON [PRIMARY] if exists (SELECT * FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME = '[MultipleXLtoSQL_stage]') DROP TABLE MultipleXLtoSQL_stage; CREATE TABLE [dbo].[MultipleXLtoSQL_stage]( [Sno] [float] NULL, [EmpID] [float] NULL, [EmpName] [nvarchar](255) NULL, [Checkin] [datetime] NULL, [Checkout] [datetime] NULL, [Working hours] [float] NULL, [Status] [nvarchar](255) NULL ) ON [PRIMARY] if exists (SELECT * FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME = '[MultipleXLtoSQL]') DROP TABLE MultipleXLtoSQL; CREATE TABLE [dbo].[MultipleXLtoSQL]( [Sno] [float] NULL, [EmpID] [float] NULL, [EmpName] [nvarchar](255) NULL, [Checkin] [datetime] NULL, [Checkout] [datetime] NULL, [Working hours] [float] NULL, [Status] [nvarchar](255) NULL, [File_name] [varchar](50) NULL, [date] [date] NULL ) ON [PRIMARY]
/*—————————————————————–
2a. Create a Stored Procedure for getting the file count
—————————————————————–*/
IF EXISTS (SELECT
*
FROM sys.objects
WHERE object_id = OBJECT_ID(N'[dbo].[usp_ImportMultipleFiles]’) AND type IN (N’P’, N’PC’))
DROP PROCEDURE [dbo].[usp_ImportMultipleFiles]
CREATE procedure [dbo].[usp_ImportMultipleFiles] @filepath varchar(500),
@pattern varchar(100), @TableName varchar(128) = null
as
SET QUOTED_IDENTIFIER OFF
declare @query varchar(1000)
declare @max1 int
declare @count1 int
Declare @filename varchar(100)
SET @count1 = 0
DROP TABLE [FileNames]
create table #x (name varchar(200))
SET @query = ‘master.dbo.xp_cmdshell “dir ‘ + @filepath + @pattern + ‘ /b”‘
INSERT #x EXEC (@query)
DELETE FROM #x
WHERE name IS NULL
SELECT
IDENTITY(int, 1, 1) AS ID,
name INTO [FileNames]
FROM #x
DROP TABLE #x
/*—————————————————————————-
2b. Create a Stored Procedure for inserting the excel files one by one
—————————————————————————-*/
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[Article_InsertMultiplexlFile]') AND type IN (N'P', N'PC')) DROP PROCEDURE [Article_InsertMultiplexlFile]; Create procedure [dbo].[Article_InsertMultiplexlFile] (@filepath varchar(max),@table_name varchar(50)=NULL) as Begin declare @v_filepath varchar(max),@v_delete varchar(500), @v_closebracket varchar(10),@max1 int,@count1 int,@filename varchar(100),@v_message varchar(50),@v_Date date, @v_filename varchar(48),@v_sheetname varchar(500) ; SET @count1 = 0; SET @v_closebracket = ')'; SET @v_sheetname = 'Sheet1' EXEC usp_ImportMultipleFiles @filepath, '*.x*' SET @max1 = (SELECT MAX(ID) FROM [FileNames]) --print @max1 --print @count1 While @count1 <= @max1 Begin SET @count1 = @count1 + 1 SET @filename = NULL SET @filename = (SELECT name FROM [FileNames] WHERE [id] = @count1) IF @filename is not null Begin Begin Try SET @v_filepath = 'INSERT INTO ' + @table_name + ' SELECT * FROM OPENROWSET(' + '''' + 'Microsoft.ACE.OLEDB.12.0' + '''' + ',' + '''' + 'Excel 12.0;Database=' + @filepath + @filename + '''' + ',' + '''' + 'SELECT * FROM [' + @v_sheetname + '$]''' + @v_closebracket EXEC (@v_filepath) End Try BEGIN CATCH SELECT 'ERROR WITH Filename @filename = ' + @filename + ' ' + ERROR_MESSAGE() AS Error_Description END CATCH End --End if SET @v_date = CAST(SUBSTRING(@filename, 1, 10) AS date) INSERT INTO MultipleXLtoSQL ([Date], [Sno], [EmpID], [EmpName], [Checkin], [Checkout], [Working hours], [Status], [File_name]) SELECT @v_date, [Sno], [EmpID], [EmpName], [Checkin], [Checkout], [Working hours], [Status], @filename FROM MultipleXLtoSQL_stage Truncate table MultipleXLtoSQL_stage End --While End
/*—————————————————————————-
- Execute the Stored Procedure (Give the folder path)
—————————————————————————-*/
EXEC [dbo].[Article_InsertMultiplexlFile] 'D:\MultipleExcel2SQL\ArticleInputFiles\', 'MultipleXLtoSQL_stage'
/*—————————————————————————-
4a. To see how many records were imported and from which file (Query 1)
—————————————————————————-*/
SELECT FILE_NAME, COUNT(*) No_of_Records FROM MultipleXLtoSQL GROUP BY FILE_NAME;
/*—————————————————————————-
4b. To see all the records from table MultipleXLtoSQL (Query 2)
—————————————————————————-*/
SELECT [Date], [Sno], [EmpID], [EmpName], [Working hours], [Status], [File_name] FROM MultipleXLtoSQL;
/*—————————————————————————-
4c. To see total number of present and absent days (Query 3)
—————————————————————————-*/
SELECT Empname, COUNT(status) PRESENT_DAYS, 0 ABSENT_DAYS FROM MultipleXLtoSQL WHERE status = 'Present' GROUP BY Empname UNION SELECT Empname, 0, COUNT(status) ABSENT_DAYS FROM MultipleXLtoSQL WHERE status = 'Absent' GROUP BY Empname
/*—————————————————————————-
4d. To see the details of a selected employee (Query 4)
—————————————————————————-*/
SELECT [Date], [Sno], [EmpID], [EmpName], [Working hours], [Status], [File_name] FROM MultipleXLtoSQL WHERE Empname = 'A'
/*—————————————————————————-
–To access the files in folders
—————————————————————————-*/
/* SET ANSI_PADDING OFF GO EXEC sp_configure 'show advanced options', 1 reconfigure with override GO EXEC sp_configure 'xp_cmdshell', 1 reconfigure with override GO */
Summary
As SQL DB server doesn’t allow multiple Excel files imported into a single database table, as explained above, we can build a reliable, simple, and custom solution to automatically import multiple Excel files into a single database table which can be used for reporting or data analysis.