Skip to content
This repository has been archived by the owner on Jun 14, 2024. It is now read-only.

Refactor Content.rec(..) function to support tail-optimisation #402

Closed
wants to merge 10 commits into from

Conversation

dmytroDragan
Copy link

What is the context for this pull request?

What changes were proposed in this pull request?

Refactor Content.rec(..) function to support tail-optimisation

Does this PR introduce any user-facing change?

No

How was this patch tested?

Existing unit tests should be enough, because it is just refactoring without changing functionality.

Tracking Issue: microsoft#400
Parent Issue: N/A
Dependencies: N/A
What changes were proposed in this pull request?
Refactor Content.rec(..) function to support tail-optimisation

Does this PR introduce any user-facing change?
No

How was this patch tested?
Existing unit tests should be enough, because it is just refactoring without changing functionality.
@ghost
Copy link

ghost commented Mar 30, 2021

CLA assistant check
All CLA requirements met.

Copy link
Contributor

@andrei-ionescu andrei-ionescu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR. A couple of things to address:

  1. Please use scalafmt for formatting - see https://github.com/microsoft/hyperspace/blob/master/docs/coding-guidelines/scala-coding-style.md.
  2. Please add some unit tests on the rec method.

@dmytroDragan
Copy link
Author

Hi @andrei-ionescu ,

  1. Sure (might make sense to add also scalafmt version for alignment)
  2. There are already tests for Content class like "Content.files api lists all files from Content object" which rely on it, but I could move rec function to Content object and add more.

- Move rec function outside Content case class.
- Rename it to 'recFilesApply'
- fixed formatting

Cover recFilesApply with tests
* @param directory Root directory
* @param func function which would apply to current prefix and file
* @tparam T
* @return
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add some comment for this? e.g.
Result sequence of the given func for each file.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

func: (FileInfo, Path) => T): Seq[T] = {
@tailrec
def recAcc[A](
dirMap: List[(Path, Seq[Directory])],
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you check if scalafmt is set properly?

Copy link
Author

@dmytroDragan dmytroDragan Apr 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, it works strangely: some scalafmt versions are ignored by idea (1.5, 2.0.0) , while others are too aggressive - look on latest commit with 2.7.5 scalafmt.
Am I missing anything?

acc: Seq[A] = Seq.empty): Seq[A] = {
dirMap match {
case Nil => acc
case (curPrefixPath, curDirs) :: xs =>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you explain why it's xs?

Copy link
Author

@dmytroDragan dmytroDragan Apr 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have started with common unapplied list x::xs and simplified the head later)
xs is plural for x.

I will rename it to otherDirs for readability.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@@ -240,6 +240,47 @@ class IndexLogEntryTest extends SparkFunSuite with SQLHelper with BeforeAndAfter
assert(actual.sourceFilesSizeInBytes == 200L)
}

test("Content.recFilesApply apply function to all files ") {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@sezruby sezruby requested a review from imback82 April 1, 2021 20:18
@dmytroDragan dmytroDragan dismissed a stale review via add47fd April 2, 2021 08:25
@sezruby
Copy link
Collaborator

sezruby commented Apr 9, 2021

@dmytroDragan Sorry but, could you revert unnecessary changes from scalafmt to reduce the diff? It can be handled with a different PR (would be appreciated if you could push a new PR!), so that we could review & merge only required change for tailrec.

@dmytroDragan
Copy link
Author

dmytroDragan commented Apr 9, 2021

Hi @sezruby!
I can revert changes from scalafmt, but it makes a usage of scalafmt useless in first place.

I`m using 2.7.5 (latest) scalamft with dev conf:

# The following configs are taken from https://github.com/apache/spark/blob/master/dev/.scalafmt.conf
align = none
align.openParenDefnSite = false
align.openParenCallSite = false
align.tokens = []
optIn = {
  configStyleArguments = false
}
danglingParentheses = false
docstrings = JavaDoc
maxColumn = 98

# The following are specific to Hyperspace.
importSelectors = singleLine

If anyone gets different result for this files (like old once), please reach out to me and we could try to figure out.

I could make a PR with only scalafmt of this two files, so my PR will cover only my changes after merge. What do you think?

@andrei-ionescu
Copy link
Contributor

@dmytroDragan, Here is my scalafmt setup in IntelliJ Idea (I have 1.5.1 version):

hyperspace_scalafmt

I usually do Command+Shift+Alt+L to reformat the file and use the following options in the dialog box:

hyperspace_reformat

The formatting is correct, the only annoying thing is that it order the functions by name too. I will be great if you could double check the diffs before pushing the changes.

@dmytroDragan
Copy link
Author

Thank you @andrei-ionescu. I have rechecked my configuration and applied result formatting.
I have created PR with just scalafmt formated code for this 2 files, so reviewing it could be easier:
#412

}
}

object Content {

val createFileInfoWithPrefix: (FileInfo, Path) => FileInfo = (f, prefix) =>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just impl. in fileInfos - there's no other use case.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. I`m just not keen of formatted:

      (
          f,
          prefix) =>
        FileInfo(new Path(prefix, f.name).toString, f.size, f.modifiedTime, f.id)).toSet

see: https://github.com/microsoft/hyperspace/pull/412/files#diff-5c50f07a41423f4cb5bfe60ac7c30b9c534c4bfc71eeb03d43040047a3994e83

Moving it to separate function makes it more readable.

@dmytroDragan
Copy link
Author

Th problem was with 1.5.1 vs 2.7.5 scalafmt.
Fixed version is done as new PR:
#417

@dmytroDragan dmytroDragan deleted the issue400 branch April 14, 2021 12:52
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants