I am building an application that uses JSON / XML files to persist data, which is why I indicated “outside of SQL” in the title.
I understand that one benefit of join tables is that they make querying easier with SQL syntax. Since I am using JSON as my storage, I do not have that benefit.
But are there any other benefits to using a separate join table to express a many-to-many relationship? The specific relationship I want to model is one entity’s dependency on another. I could do this by just having a “dependencies” field on each entity, which would be an array of the IDs of its dependencies.
This approach seems simpler to me than a separate table / entity to track the relation. Am I missing something?
Feel free to ask for more context.
The primary benefit of storing your relationships in a separate place is that it gives you a single point of entry for scans and alterations, so you do not have to walk every entry of one of the larger entity types. For example, “how many users have favorited movie X” is a query on one smaller table (a compact list of ID pairs, which is also likely to be friendlier to caches on modern processors) rather than a pass over all favorites of all users. And “movie x2 is deleted so let’s remove all references to it” is again a change to a single table.
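As a rough illustration of those two operations against a JSON store, here is a minimal Python sketch. The layout (a top-level "favorites" list of ID pairs) and the names count_favorites and delete_movie are assumptions made up for this example, not something from the question.

```python
import json

# Hypothetical JSON document; the entity and field names are illustrative only.
store = json.loads("""
{
  "users":  [{"id": "u1"}, {"id": "u2"}, {"id": "u3"}],
  "movies": [{"id": "m1"}, {"id": "m2"}],
  "favorites": [
    {"user_id": "u1", "movie_id": "m1"},
    {"user_id": "u2", "movie_id": "m1"},
    {"user_id": "u2", "movie_id": "m2"},
    {"user_id": "u3", "movie_id": "m2"}
  ]
}
""")

def count_favorites(store, movie_id):
    """Count users who favorited a movie by scanning only the join list."""
    return sum(1 for row in store["favorites"] if row["movie_id"] == movie_id)

def delete_movie(store, movie_id):
    """Remove a movie and every reference to it; user records are not rewritten."""
    store["movies"] = [m for m in store["movies"] if m["id"] != movie_id]
    store["favorites"] = [r for r in store["favorites"] if r["movie_id"] != movie_id]

m1_count = count_favorites(store, "m1")
delete_movie(store, "m2")
```

Both functions touch only the "favorites" list (plus the movie itself on delete). With an embedded array on each user, the same delete would mean opening and rewriting every user record that mentions the movie.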
Another benefit, regardless of language, is normalization. You can keep your entities distinct and operate on either one without touching the other. This matters more the more relationships you have between instances of the two entities. You could get away with a JSON array of movie IDs on each entity rather than storing the joins separately, but that is still less efficient than a third relationship table.
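Applied to the dependency case from the question, pulling the embedded arrays out into their own list might look like the sketch below. The field names "entity_id" and "depends_on" and the helpers split_relations and dependents_of are hypothetical, chosen only for this example.

```python
# Hypothetical records using the embedded "dependencies" array from the question.
entities = [
    {"id": "a", "dependencies": ["b", "c"]},
    {"id": "b", "dependencies": ["c"]},
    {"id": "c", "dependencies": []},
]

def split_relations(entities):
    """Return (entities without the embedded field, list of dependency pairs)."""
    plain = [{k: v for k, v in e.items() if k != "dependencies"} for e in entities]
    pairs = [
        {"entity_id": e["id"], "depends_on": dep}
        for e in entities
        for dep in e.get("dependencies", [])
    ]
    return plain, pairs

def dependents_of(pairs, target_id):
    """Reverse lookup: which entities depend on target_id?"""
    return [p["entity_id"] for p in pairs if p["depends_on"] == target_id]

plain, pairs = split_relations(entities)
dependents = dependents_of(pairs, "c")
```

The reverse question ("what depends on c?") is answered from the pair list alone. With only the embedded arrays, it would require reading every entity record.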
The biggest win for design is normalization. Store entities separately, and updates or scans will require significantly less rewriting. There are degrees of normalization, each with its own benefits and trade-offs.