-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
create GET /ideas?filter[highlyRated]=bottomRateLimit #56
Conversation
This is a great work. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good work @agatatalita.
There are a few minor proposed changes which we could discuss.
I'd be happy to implement those which you'd approve, if you're not motivated to do so.
controllers/ideas.js
Outdated
async function getIdeasHighlyRated(req, res, next) { | ||
try { | ||
// gather data | ||
const { page: { offset = 0, limit = 5 } = { } } = req.query; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we should definitely put these values to a config (perhaps different PR)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This I am comfortable with doing in some pull request, maybe when ideas are done (for this sprint)?
apidoc.raml
Outdated
@@ -373,6 +373,7 @@ types: | |||
- new ideas: `?sort=-created` | |||
- ideas with provided creators: `?filter[creators]=username0,username1,username2` | |||
- ideas commented by provided users: `?filter[commentedBy]=username0,username1,username2` | |||
- highly rated ideas with minimum rate parameter: `?filter[highlyRated]=bottomValueLimit` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-
What is the use case for
bottomValueLimit
limit? We sort the output by sum of votes anyways? -
Could i.e.
highlyVoted
be clearer naming thanhighlyRated
? And talk about sum of votes rather thanrate
which doesn't carry any specific meaning?
Regarding point 2, I suggest using highlyVoted
instead of highlyRated
, and voteSum
instead of rate
throughout this PR.
If @agatatalita feels in favor of these changes but not motivated to do them, i'll do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand the first question. Is it a separate question?
I don't understand the second question.
Is the naming part separate from the rest? If so, we can leave it and I will change the naming for the one you suggested and focus on the two first @mrkvon .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand the first question. Is it a separate question?
I don't understand the second question.
Edited the comment to be clearer.
I want to understand a use case for different bottomValueLimit
. I probably need an example when it would be useful to have the limit.
Why?
- We will always get the highest voted ideas on page 1. It's up to user/frontend how far into the pages they want to go.
- The limit different than 0 doesn't have a constant impact (i.e. at the beginning 5 can be high, but later (when everybody uses ditup and votes) 100 can be not enough)
Is the naming part separate from the rest?
Yes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- We would want to list 'the most popular' ideas which would be those with minimum 100 votes. I understand it would give us the similar result to 'get 20 first ideas most voted' but maybe we really want to list just 100+ ideas or ideas with positive votes or ideas which got more interest than ours.
models/idea/index.js
Outdated
// set bottom limit of rate | ||
FILTER rate >= @rateBottomLimit | ||
// find creator | ||
LET c = (DOCUMENT(id.creator)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note for self: we should DRY this idea enriching. It's repeated over and over in queries for ideas.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't necessarily agree here. Sometimes it is nice to have all the code in one place where you can look at it and not look for it in many files.
models/idea/index.js
Outdated
// get sum of each idea's votes values | ||
AGGREGATE rate = SUM(vote.value) | ||
// set bottom limit of rate | ||
FILTER rate >= @rateBottomLimit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this not redundant, if we sort by rate anyways? See the comment above (at .raml file).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We sort by rate but we may not be willing to show ideas with 0 or lower than zero rates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May we actually want to restrict people from seeing low voted ideas? Why?
Well, maybe it doesn't matter. I might be just bikeshedding.
Might be a cliche, but: we should make things as simple as possible, but not simpler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is sometimes not necessary to discuss things like this and just let them be if it is really one documented variable which if clear what is it for.
And dry for being dry or simple for being simple, not for saving time and energy for a different work.
Is there any case this variable can be harmful?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. I totally understand your point @agatatalita. I'm nitpicking here.
I don't understand the strong resistance though.
I believe if something can be simplified, it should be. I'm willing to make the changes myself, only if I get approval for them.
Unless we leak some information, the read-only can't be harmful.
The only meaningful value is 0, unless we want to add a user interface for this, which I doubt. (Or do we?) And then it limits our ability to see negatively voted ideas in this list.
If you're still not convinced, I'm leaving this discussion and we keep it as it is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I gave a few examples when it can be useful. Why would I be writing it again later if I was able to put the parameter now or that was my imagination of the use case?
Generally also the name highlyVotedIdeas can be confusing. Because we are actually listing all of the ideas, not just highly voted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the name highlyVotedIdeas can be confusing
This is true, actually. I haven't thought about this. Maybe it should be ?sort=-voteSum
or similar. Then we could also have ?sort=voteSum
to find the absolutely worst ideas. :)
Maybe also random should be ?sort=random
.
Not sure if this matters. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe all of this filtering and sorting will be rewritten at some point (because it is quite bothering to be not able to connect sort with filters and I believe people may expect this - so there will be one filtering sorting function or some function for creating queries probably?).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@agatatalita I apologize for the fuss around this. I realize it really is a minor thing. It just takes time for it to fit my rigid mindset. :)
I don't know what future holds, but indeed some nice generalizing refactor may be beneficial at some point.
serializers/ideas.js
Outdated
@@ -7,7 +7,8 @@ const config = require(path.resolve('./config')); | |||
|
|||
const ideaSerializer = new Serializer('ideas', { | |||
id: 'id', | |||
attributes: ['title', 'detail', 'created', 'creator', 'ideaTags'], | |||
// TODO where to put rate in this serialization | |||
attributes: ['title', 'detail', 'created', 'creator', 'ideaTags', 'rate'], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have an impression that rate
or whatever we want sum of votes to be called, should belong to meta
. See the related issue #59.
It can be implemented with jsonapi-serializer
something like this:
{
dataMeta {
voteSum(record, current) {
return current.rate;
}
}
}
The examples are here: https://github.com/SeyZ/jsonapi-serializer/blob/master/test/serializer.js#L254
Are you opinionated on this, @agatatalita ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I actually wanted to include it to meta.
models/idea/index.js
Outdated
FOR vote IN votes | ||
FILTER idea._id == vote._to | ||
// group by idea id | ||
COLLECT id = idea |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Naming idea id
may be confusing. How about i
or collectedIdea
or renaming idea
to i
above and using idea
here? Just id
is confusing with actual id of something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, to be changed.
models/idea/index.js
Outdated
static async findHighlyRated(rateBottomLimit, { offset, limit }) { | ||
const query = ` | ||
FOR idea IN ideas | ||
FOR vote IN votes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another way of writing this may be
FOR idea IN ideas
LET ideaVotes = (FOR v, e IN 1..1 INBOUND idea, INBOUND votes RETURN e) // graph traversal and assigning to ideaVotes variable
LET voteSum = SUM(ideaVotes[*].value) // simplifying aggregation
...
I'd really like to understand which of these would have better performance, if any difference. Now I simply don't know.
This feels a bit more efficient to me, but it's hard to judge without the internal knowledge of Arango. Perhaps we could ask on stackoverflow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So there is a need for performance tests - I will happily test it. Also may ask the question on the stack - after tests. Probably also you can use 'explain' for AQL and this may give some information.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I also believe the current query will filter out all the ideas which haven't received any votes. So at least a subset of my proposed changes should be applied.
i.e. not FOR idea IN ideas FOR vote IN votes...
but FOR idea IN ideas LET ideaVotes=(*subquery*)...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'..depends on the one hand which queries you want to perform and the amount of data you have. The general rule of thumb is, that if you do not know the depth of you search, then a graph traversals are likely to be faster than joins. And if you have a known or fixed depth joins are normally faster.' from Arango slack regarding JOINS vs. GRAPHS
And this 'LEFT JOIN' case can be actualy easier to do with graph, I will look at it.
Why should I be not comfortable with eventually merging this to master, @agatatalita ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@agatatalita if you approve my fixes, please squash and merge to production. :) I'm happy with this PR as it is.
* create GET /ideas?filter[highlyRated]=bottomRateLimit * renaming rate => voteSum, putting voteSum to meta in response * rename + fix in query (we were ignoring ideas with 0 votes before)
6ebc170
to
c83d42f
Compare
🎉 |
AQL query takes me 90% of the time but hopefully, it will get faster.
Hope it will not make much mess. I created it from commented by and was rebasing while making changes on other branches, still got some conflicts.