Tuesday, September 6, 2011

Clean up alphanumeric strings by splitting digits and letters into separate columns using T-SQL


This is one of the challenges taken from BeyondRelational.


Below is the problem statement.

The challenge is to read an alphanumeric string and create two output columns one containing all the digits found in the input string and the other with all the letters of the English Alphabet.


Sample Input Data


Id
AlphanumericData
1
I would like to invite my readers to participate in a TSQL Challenge.
2
You will receive 12 records
3
1234567

Expected Output


Id
AlphanumericData
Numbers
Alphabets
1
I would like to invite my readers to participate in a TSQL Challenge.
No Number Found
IwouldliketoinvitemyreaderstoparticipateinaTSQLChallenge
2
You will receive 12 records
12
Youwillreceiverecords
3
1234567
1234567
No Alphabets Found


Script to generate the data


DECLARE @tblAlphanumeric TABLE(Id INT IDENTITY,AlphaNumericData VARCHAR(1000))

INSERT INTO @tblAlphanumeric (AlphaNumericData)
SELECT 'I would like to invite my readers to participate in a TSQL Challenge.'
UNION ALL
SELECT 'You will receive 12 records'
UNION ALL
SELECT '1234567'

SELECT * FROM @tblAlphanumeric

Also, below are the other Restrictions or notes that should be considered.


  • The program should run in SQL SERVER 2005+.
  • Column names should respect the desired output shown
  • Output must be sorted in Ascending order of Id
  • Ignore those records where there is no data
  • If the alphanumeric data does not contain any numeric record, then the Number field should display "No Number Found".
  • If the alphanumeric data does not contain any alphabetic record, then the Alphabets field should display "No Alphabets Found".
  • The program has to be done by a single query and should begin either with a SELECT or WITH statement with no variables, temporary table, table variables permitted.
  • You cannot use RBAR, cursors, loops etc. in your program. 


  • Solution


    ;WITH DATA
    AS
    (
     SELECT Id
       , AlphaNumericData
       , CASE WHEN AlphaNumericData IS NULL
              OR ISNULL(AlphaNumericData,'') = '' THEN 0
            ELSE CHARINDEX(' ',AlphaNumericData,0)
            END AS char_index
       , CASE WHEN AlphaNumericData IS NULL
                OR ISNULL(AlphaNumericData,'') = '' THEN NULL
              WHEN CHARINDEX(' ',AlphaNumericData,0) <> 0
                THEN SUBSTRING(AlphaNumericData,CHARINDEX(' ',AlphaNumericData,0) +1 , LEN(AlphaNumericData))
              END AS truncatedData             
       , CONVERT(VARCHAR(max),
              CASE WHEN CHARINDEX(' ',AlphaNumericData,0) = 0
                    AND ISNULL(AlphaNumericData,'') <> ''
                    AND ISNUMERIC(AlphaNumericData) = 1
                      THEN AlphaNumericData
                    WHEN ISNUMERIC(SUBSTRING(AlphaNumericData,0,CHARINDEX(' ',AlphaNumericData,0))) = 1
                      THEN SUBSTRING(AlphaNumericData,0,CHARINDEX(' ',AlphaNumericData,0))
                   ELSE ''   
                   END
                 )AS Numeric
       , CONVERT(VARCHAR(max),             
              CASE WHEN CHARINDEX(' ',AlphaNumericData,0) = 0
                    AND ISNULL(AlphaNumericData,'') <> ''
                    AND ISNUMERIC(AlphaNumericData) <> 1
                    THEN AlphaNumericData
                   WHEN ISNUMERIC(SUBSTRING(AlphaNumericData,0,CHARINDEX(' ',AlphaNumericData,0))) <> 1
                    THEN SUBSTRING(AlphaNumericData,0,CHARINDEX(' ',AlphaNumericData,0))
                   ELSE ''   
                                                                               END
                 )AS Alphabets
      FROM @tblAlphanumeric
        
    UNION ALL

    SELECT d.Id
       , d.AlphaNumericData
       , CASE WHEN d.truncatedData IS NULL
                                                      OR ISNULL(truncatedData,'') = '' THEN 0
            ELSE CHARINDEX(' ',truncatedData,0)
                                                    END AS char_index
       , CASE WHEN d.truncatedData IS NULL
              OR ISNULL(d.truncatedData,'') = '' THEN NULL
            WHEN CHARINDEX(' ',d.truncatedData,0) <> 0
              THEN SUBSTRING(d.truncatedData,CHARINDEX(' ',d.truncatedData,0) +1 , LEN(d.truncatedData))
                                                    END AS truncatedData 
       , CONVERT(VARCHAR(max),d.Numeric+
                                                                                                    (CASE WHEN CHARINDEX(' ',d.truncatedData,0) = 0
                                                                                                           AND ISNULL(d.truncatedData,'') <> ''
                                                                                                           AND ISNUMERIC(d.truncatedData) = 1
                           THEN d.truncatedData
                                                                                                                              WHEN ISNUMERIC(SUBSTRING(d.truncatedData,0,CHARINDEX(' ',d.truncatedData,0))) = 1
                           THEN SUBSTRING(d.truncatedData,0,CHARINDEX(' ',d.truncatedData,0))
                                                                                                          ELSE ''   
                                                                                                                              END)) AS Numeric 
       , CONVERT(VARCHAR(max),d.Alphabets+
                                                                            (CASE WHEN CHARINDEX(' ',d.truncatedData,0) = 0
                                                                                                       AND ISNULL(d.truncatedData,'') <> ''
                                                                                                       AND ISNUMERIC(d.truncatedData) <> 1
                       THEN d.truncatedData
                      WHEN ISNUMERIC(SUBSTRING(d.truncatedData,0,CHARINDEX(' ',d.truncatedData,0))) <> 1
                       THEN SUBSTRING(d.truncatedData,0,CHARINDEX(' ',d.truncatedData,0))
                                                                                                      ELSE ''   
                                                                                                      END)) AS Alphabets          
    FROM DATA d
    WHERE d.truncatedData IS NOT NULL
        
    )
    select d.Id, d.AlphaNumericData,
         CASE WHEN Numeric = '' THEN 'No Number Found'
                  ELSE Numeric
            END AS Numbers,
         CASE WHEN Alphabets = '' THEN 'No Alphabets Found'
                  ELSE Alphabets
            END AS Alphabets
    from DATA d
    where truncatedData is null
    order by d.Id


    You can find other solutions for the same problem at BeyondRelational

    No comments:

    Post a Comment