Many-to-many relationship

“A junction table, sometimes also known as a “Bridge Table”, “Join Table”, “Map Table”, or “Link Table”, is a table that contains common fields from two tables. It is on the many side of a one-to-many relationship with the other two tables. Junction tables are employed when dealing with many-to-many relationships in a database.” (http://en.wikipedia.org/wiki/Junction_table)

As described in the definition, mapping tables are used when there is a many-to-many relationship between two tables. In this case, a third mapping table is used to map the relation between those two tables.

But I have seen this in one of my employer products where they were using mapping tables in lot of places. The most common and used was in financial application, where data was very crucial. I’ll explain this with an example for education domain. Let’s take classic example of students and classes. A student can be in only one class and in one class there can be multiple students. Therefore, the relationship between student and class is many-to-one. Or from other side relationship between class and student is one-to-many.

So, ideally we will have a table structure with two tables say, STUDENT and CLASS. CLASS will have a PK say CLASSNUM and fields for other class details. Similarly STUDENT will have a PK say STUDENTNUM and fields for other student details. One more field that STUDENT will have is FK to CLASSNUM field.

This is the most obvious design of the scenario. But in the above mentioned application this was not the case. We had a mapping table between STUDENT and CLASS. Yes! You have read it correctly. We had a STUDENTCLASSDETAILS table between STUDENT and CLASS, having a composite key of STUDENTNUM and CLASSNUM. But what is the reason for a mapping table here? What I have heard, (since I was not the designer of that DB at that time) that this mapping table has been introduced to improve performance and resolve locking issues.

Well, let’s discuss how it can improve performance. Now consider the scenario where there are no relationships maintained in the database i.e. no referential integrity (Read my other blog on not having referential integrity – http://mjawaid.wordpress.com/2009/04/01/referential-integrity). So there are no FKs in any table. Now we have a STUDENT with STUDENTNUM (PK), other fields, and CLASSNUM (note that it is not FK, and is not indexed either). CLASS has CLASSNUM (PK), and other fields. STUDENTCLASSDETAILS has STUDENTNUM and CLASSNUM as PK i.e. composite key and these are not FKs.

Now if two users are performing any operation on same record then first transaction locks some records in the STUDENT table then another transaction will not be able to read that students information. In this case the STUDENTCLASSDETAILS will allow us to read the information, at least student number and his class, since this information is most accessible. This way the second transaction will not wait for the other transaction resulting in improved performance and first transaction will acquire the lock successfully.

Now let’s see what are the problems with this approach?

First is the maintenance. When moving student from one class to another (when promoting to next level) two tables have to be updated, resulting in performance hit. Second, mapping table allows student to be inserted in multiple classes and we have witnessed this, resulting in loss of data integrity which is very crucial in financial applications and one can lost a big amount of money just due to silly mistakes. I’ll quote a statement from another forum here:

The designer of an application has a fiduciary responsibility to his employer/client and needs to ensure that data is as acurate as possible. To not enforce referential integrity is to tempt fate. Employees get fired for building systems that contain bad data leading to bad business decisions. Consultants get sued.

Pasted from <http://www.access-programmers.co.uk/forums/archive/index.php/t-33531.html>


Although the person made a comment on not having referential integrity but the underlined statement has the crux of the topic. So the conclusion is when designing an application or database, also consider other scenarios and pros and cons of the approach being adopted, not only one scenario. In other words if one approach is solving your problem then also consider what other problems we can face due to it, and prepare for those as well.

One Response

Add a Comment

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.