Fun with Self-Joins

In the course of doing a data migration yesterday, I was presented with an interesting challenge: I had a junction table creating a many-to-many relationship between people and counties. I needed to create a similar table, but with an extra column: an integer value, starting at zero, which ordered the counties for each person. E.g., for the (person, county) values { (1,1), (1,2), (2,1), (3,1) }, the resulting (person, county, position) values would need to be { (1,1,0), (1,2,1), (2,1,0), (3,1,0) } or { (1,1,1), (1,2,0), (2,1,0), (3,1,0) }. The straightforward approach, in my opinion, is procedural; select the distinct person values from the source table, then do queries for that person’s counties and just increment a variable to get the position value. Unfortunately, this would also be slow, and require more lines of SQL code, which means more possibility for error. So, I wondered if there was a non-procedural way to do it. After a few minutes of thought, I came up with the following:


SELECT j1.person, j1.county, COUNT(*)-1
  FROM junction_table j1
  INNER JOIN junction_table j2
    ON j1.person = j2.person
     AND j2.county < = j1.county   GROUP BY j1.person, j1.county;

It seems to have worked. 🙂

Leave a Reply