Why does CodeQL use a relational database instead of a graph database? #10385
-
CodeQL certainly doesn't compile to SQL anymore. I've heard anecdotal stories about the issues with compiling to SQL in the past. One of them is that typical SQL workloads tend not to contain many recursive queries, so SQL engines tend not to be very efficient at handling them. However, recursive queries are the bread and butter of CodeQL! So we've spent a lot of engineering effort on optimizing such queries. In fact, very few of us were actually working on QL back in 2007, so it's difficult to say why a relational model was chosen. A couple of reasons come to mind, though:
I am not aware of the ProgQuery work, but it certainly looks interesting :)
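To make the point about recursive queries concrete, here is a minimal sketch of one written as a recursive SQL CTE and run through Python's built-in `sqlite3` module. It only illustrates the kind of query meant above and is not how CodeQL evaluates QL. The `calls` table, the `reachable_calls` helper, and the function names in the sample data are all hypothetical.

```python
import sqlite3


def reachable_calls(edges, start):
    """Return every function transitively reachable from `start`.

    `edges` is a list of (caller, callee) pairs; the schema is hypothetical.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE calls (caller TEXT, callee TEXT)")
    conn.executemany("INSERT INTO calls VALUES (?, ?)", edges)
    rows = conn.execute(
        """
        WITH RECURSIVE reach(fn) AS (
            -- base case: direct callees of the starting function
            SELECT callee FROM calls WHERE caller = ?
            -- UNION (not UNION ALL) removes duplicates, so cycles terminate
            UNION
            -- recursive case: callees of anything already reached
            SELECT calls.callee FROM calls JOIN reach ON calls.caller = reach.fn
        )
        SELECT fn FROM reach ORDER BY fn
        """,
        (start,),
    ).fetchall()
    conn.close()
    return [fn for (fn,) in rows]


# Hypothetical call graph containing a cycle between parse and lex.
edges = [("main", "parse"), ("parse", "lex"), ("lex", "parse"), ("main", "report")]
print(reachable_calls(edges, "main"))
```

Queries of this shape (transitive closure over a call graph, a type hierarchy, or a data-flow relation) are the ones a general-purpose SQL engine is rarely tuned for.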
-
It appears CodeQL uses a relational database internally and compiles QL to SQL to query it:
(.QL: Object-Oriented Queries Made Easy, 2007)
Is there a reason why a relational database was chosen over a graph database? Is this for historical reasons, since around 2007 there were not many graph database systems available?
Possibly also interesting is this paper about a tool called "ProgQuery", built on the graph database Neo4j. The authors compared it with (an older version of) CodeQL and suggest that it can outperform CodeQL in certain cases.
(I hope it is fine that I shared the link to that paper here; due to how widely CodeQL has already been adopted I assume that tool is not a direct competitor.)
No worries if you don't want to share the details in case they are considered confidential / secret.