Question on challenge #3

Open
yaravind opened this issue May 18, 2020 · 1 comment
Labels: question (Further information is requested)

Comments

yaravind commented May 18, 2020

Thank you for the example. I learned quite a bit! I have the following questions, if you can help:

  1. Why do we have to sort the department DF? What advantage does it give?
  2. Why sort in this specific order: 'id, 'assigned_date, 'company_id, 'factory_id?
  3. What is the intent of the query in the code? Is it a count of all users who joined a department on the same day as their birth date?

@vitalyte (Member)

@yaravind you are welcome. I think the detailed description in our blog post could help with your research: https://dataengi.com/2019/02/06/spark-data-skew-problem/

vitalyte added the "question" label on May 29, 2020