-
Notifications
You must be signed in to change notification settings - Fork 45
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
probable cves table added #789
base: master
Are you sure you want to change the base?
Conversation
Codecov Report
@@ Coverage Diff @@
## master #789 +/- ##
==========================================
+ Coverage 63.43% 63.62% +0.19%
==========================================
Files 70 70
Lines 5078 5105 +27
==========================================
+ Hits 3221 3248 +27
Misses 1857 1857
Continue to review full report at Codecov.
|
@sivaavkd Your image is available in the registry: |
Jenkins is not happy. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am puzzled to be honest. It looks like we are trying to use RDBMS here, but as a document database. I.e. we are completely ignoring relationships.
Is this something just for the POC and we know that we will likely need to do it differently later? I.e. do we plan to drop the table in post-POC phase?
package = Column(String(500), nullable=False) | ||
cause_type = Column(String(50), nullable=False) | ||
issue_date = Column(DateTime, nullable=True) | ||
issue_url = Column(JSONB, nullable=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why JSONB?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And it is used on other places as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There can be more than one issues/prs/commits - that can cause a probable CVE. And the model will send it as list/json. We will save it as JSON in the table and show as comma separated list in the UI. And based on what I read so far, JSONB is better managed internally by postgres than JSON.
__tablename__ = "probable_cves" | ||
|
||
id = Column(Integer, autoincrement=True, primary_key=True) | ||
ecosystem = Column(String(50), nullable=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this if this is just for Golang?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are not going to do it for other ecosystems anytime soon. For now, it will always be "Golang". However, we have the plan to expand it in future for other ecosystems based on how the model behaves.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it a good idea to have it there as a string, without any constraints? Wouldn't be better to have for example enum for ecosystem names? I don't have enough context here to say what's the best approach here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
We dont have a way to pull package name yet, currently it is just a placeholder. Also, most of the packages that we will get from probable cves, will be very very new and hence, may be unknown to our system as of that time. So, having a relation may not be right thing, unless we setup a way to ingest unknowns first before inserting into this table. And table schema may change a little, but not going to be dropped after poc. |
Right, this is not related to ingestion at all. So I am not saying we should have relationships with the existing ingestion tables. Things went sideways with this database long long time ago when we started putting more and more ingestion-irrelevant stuff into it. And this seems like a continuation of the trend. Since this is worker repository, what is/will be the relationship of workers to the new probable-CVE use case? |
Basically my question is: why should workers own probable-CVE database? |
Ah got your question. Sure, for long term, worker may not be the place to own this. This may become a stand alone functionality depending upon how useful this will be. For now, we are putting in this repo because the other RDS tables reside here. And regarding enum, it is a great suggestion. I will work on it. |
[test] |
No description provided.