identifying broken numeric sequences using SQL – Part 1 (Islands)

Sometimes a sequential chain of numbers gets broken, and sometimes you may want to identify where that break in sequence happens. Well if you’re looking for solution, here is the quick and dirty on how to get it done. In this 2 part series we’ll explore how to identify gaps and islands in your data.

A data island refers to a block of sequential numbers separated by a gap.

DataIsland1_01

Here is some SQL to get our scenario set up.

USE tempdb
GO
if object_id('tempdb..[GapIslandDemo]','U') is not null
     drop table [GapIslandDemo]

CREATE TABLE GapIslandDemo (mov_number INT NOT NULL, SeqNo INT NOT NULL);

INSERT INTO dbo.GapIslandDemo
SELECT 1, 1             
UNION ALL SELECT 1, 2  
UNION ALL SELECT 1, 3
UNION ALL SELECT 1, 5
UNION ALL SELECT 1, 6
UNION ALL SELECT 1, 7
UNION ALL SELECT 1, 8
UNION ALL SELECT 1, 9
UNION ALL SELECT 1, 10
UNION ALL SELECT 1, 12
UNION ALL SELECT 1, 20
UNION ALL SELECT 1, 21
UNION ALL SELECT 1, 25
UNION ALL SELECT 1, 26
UNION ALL SELECT 2, 2  
UNION ALL SELECT 2, 3
UNION ALL SELECT 2, 4
UNION ALL SELECT 2, 5
UNION ALL SELECT 2, 6
UNION ALL SELECT 2, 7
UNION ALL SELECT 2, 8
UNION ALL SELECT 2, 9
UNION ALL SELECT 2, 10
UNION ALL SELECT 2, 12
UNION ALL SELECT 2, 20
UNION ALL SELECT 2, 21
UNION ALL SELECT 2, 25
UNION ALL SELECT 2, 26;  

SELECT mov_number, SeqNo 
FROM dbo.GapIslandDemo 
ORDER BY mov_number,SeqNo;
 
--Base query used in derived table
SELECT mov_number
    ,SeqNo
    ,rn = SeqNo - ROW_NUMBER() OVER (
        PARTITION BY mov_number ORDER BY SeqNo
        )
    ,ROW_NUMBER() OVER (
        PARTITION BY mov_number ORDER BY SeqNo
        ) [RN_Partition]
FROM dbo.GapIslandDemo;

We’ll use the ROW_NUMBER() function to create an incrementing column (RN_Partition), which we can then use as a comparison value to identify our data islands.

DataIsland1_02

--Island
SELECT mov_number
    ,StartSeqNo = MIN(SeqNo)
    ,EndSeqNo = MAX(SeqNo)
    ,'Islands' [Topic]
FROM (
    SELECT mov_number
        ,SeqNo
        ,rn = SeqNo - ROW_NUMBER() OVER (
            PARTITION BY mov_number ORDER BY SeqNo
            )
        ,ROW_NUMBER() OVER (
            PARTITION BY mov_number ORDER BY SeqNo
            ) [RN_Partition]
    FROM dbo.GapIslandDemo
    ) a
GROUP BY mov_number
    ,rn
ORDER BY mov_number
    ,rn;

DataIsland1_03

Posted in SQL.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s