This is one of the challenges taken from BeyondRelational.
Below is the problem statement.
The challenge is all about merging the data based on certain conditions.
Sample Input Data
ID             Name
-------------  ----------------------
1              Deepak Kumar Goyal
2              Niladri Biswas
2              Pratik Shaw
3              Sumi Girijan
3              Suresh Beldar
3              Jeeva Baby
The challenge is that if the Id’s are exactly two in number (e.g. count of ID = 2 is exactly two ) then the names should be concatenated with 'OR'. When Ids are more than two (e.g. Id # 3 has a count of 3) , the values should be concatenated with 'AND'
Expected Output
ID      Name
------- --------------------------------------------------
1       Deepak Kumar Goyal
2       Niladri Biswas OR Pratik Shaw
3       Sumi Girijan AND Suresh Beldar AND Jeeva Baby
Script to generate the data
DECLARE @t TABLE(
	ID INT,
	NAME VARCHAR(MAX)
)
INSERT INTO @t (ID,[NAME]) SELECT 1, 'Deepak Kumar Goyal'
INSERT INTO @t (ID,[NAME]) SELECT 2, 'Niladri Biswas'
INSERT INTO @t (ID,[NAME]) SELECT 2, 'Pratik Shaw'
INSERT INTO @t (ID,[NAME]) SELECT 3, 'Sumi Girijan'
INSERT INTO @t (ID,[NAME]) SELECT 3, 'Suresh Beldar'
INSERT INTO @t (ID,[NAME]) SELECT 3, 'Jeeva Baby'
Also, below are the other Restrictions or notes that should be considered.
Solution
- The CTE 'DistinctNames' gets the distinct Ids and Names excluding the Names that are blank or NULL.
- Next what I do is get the count of all the distinct Names by Id which will help me to figure out whether to add AND or OR in between Names.
- Then, we add row number to the data which is obtained by joining the above two data sets.
- The last CTE is the recursive CTE which is used to traverse data row by row for each Id.
;WITH DistinctNames
AS
(
      SELECT DISTINCT t.ID,T.NAME
      FROM @t t
      WHERE t.NAME IS NOT NULL AND ISNULL(t.NAME,'') <> ''                   
),
GetTotals
AS
(
      SELECT ID,COUNT(Distinct NAME) AS total FROM DistinctNames
      GROUP BY ID
),
Result
AS
(    
      SELECT d.ID,d.NAME, g.total, ROW_NUMBER() OVER(PARTITION BY d.ID ORDER BY d.ID) as row_num
      FROM DistinctNames d
            INNER JOIN GetTotals g
                  ON d.ID = g.ID
),
Result2
AS
(
      SELECT ID, NAME, row_num
      FROM Result
      WHERE row_num = 1
      UNION ALL
      SELECT r2.ID,
               CASE WHEN r.total = 2 THEN r2.NAME + ' AND '+r.NAME     
                        WHEN r.total > 2 THEN r2.NAME + ' OR '+r.NAME
                  END as NAME,
                  r.row_num                                      
      FROM Result2 r2
            INNER JOIN Result r
                  ON r2.ID = r.ID AND (r2.row_num + 1) =  r.row_num
)
SELECT r.ID,r.NAME FROM Result2 r
WHERE r.row_num =
(
      SELECT MAX(r2.row_num) FROM Result2 r2 WHERE r2.ID = r.ID
)
ORDER BY ID
You can find other solutions for the same problem at BeyondRelational
 
No comments:
Post a Comment