I recently had to import a list of names from an excel spreadsheet into a database--a faily routine task. One of the challenges
of this particular import was that in the spreadsheet, the names were stored in a last-name-first format. IE: "DOE, JOHN C". In
the database, however, there were separate fields for first, middle and last name.
In the past, when confronted with this sort of problem, I had written a simple VB.NET program to massage the data prior to importing
it into SQL Server. This time, I decided to see if I could split the name column into its first, middle and last components using
SQL string functions. It turns out that I could, and I share the resulting SQL with you here:
The INTMIN function is used to determine the smallest of two integers.
CREATE function dbo.INTMIN(@iValue1 int,@iValue2 int) returns int
as
begin
declare @iResult int
if @iValue1<@iValue2
set @iResult= @iValue1
else
set @iResult= @iValue2
return @iResult
end
The Capitalize function is used to capitalize a name (IE: "JOE" becomes "Joe")
create function dbo.Capitalize(@name varchar(255)) returns varchar(255)
as
begin
declare @result varchar(50)
set @result=upper(substring(@name,1,1)) +
lower(substring(@name,2,datalength(@name)-1))
return @result
end
The LastNameFromLNFName function returns the last name component of a last-name-first string. (IE: "DOE, JOHN JACOB" becomes "DOE")
create function dbo.LastNameFromLNFName(@name varchar(255)) returns varchar(255)
as
begin
return substring(@name,1,DBO.INTMIN(charindex(' ',@name,1), charindex(',',@name,1))-1)
end
The FirstNameFromLNFName function returns the first name component of a last-name-first string. (IE: "DOE, JOHN JACOB" becomes "JOHN")
create function dbo.FirstNameFromLNFName(@name varchar(255)) returns varchar(255)
as
begin
return substring(@name,
(DBO.INTMIN(charindex(' ',@name,1), charindex(',',@name,1)))+2,
(charindex(' ',@name+' ',DBO.INTMIN(charindex(' ',@name,1), charindex(',',@name,1))+2))
-
((DBO.INTMIN(charindex(' ',@name,1), charindex(',',@name,1)))+2)
)
end
The MiddleNameFromLNFName function returns the middle name component of a last-name-first string. (IE: "DOE, JOHN JACOB" becomes "JACOB")
The function accomodates two-part middle names like "SMITH, MARY PEGGY SUE".
There is a known problem with modifiers such as "II" and "JR". Fortunately I didn't have these in my source data. If I did, I'd probably create a
new function to find and return them, and modifiy the MiddleNameFromLDFName function to not return certain strings ("JR",
"SR", "II", etc.)
create function dbo.MiddleNameFromLNFName(@name varchar(255)) returns varchar(255)
as
begin
return
substring(
@name,
charindex(' ',
@name+' ',
charindex(' ',
@name,
DBO.INTMIN(
charindex(' ',@name,1),
charindex(',',@name,1)
)+1
)+2
),
datalength(@name)-
charindex(' ',
@name,
charindex(' ',
@name,
DBO.INTMIN(
charindex(' ',@name,1),
charindex(',',@name,1)
)+1
)+2
)+1
)
end