Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

iterator bug #368

Open
capr opened this issue May 7, 2019 · 28 comments
Open

iterator bug #368

capr opened this issue May 7, 2019 · 28 comments

Comments

@capr
Copy link
Member

capr commented May 7, 2019

Looks like there's a bug with the for iterator: it works with for x in iter_struct do but behaves strangely with for x in func_returning_iter_struct() do in that it sometimes calls func_returning_iter_struct() twice. Test case:

struct iter {n: int}

iter.metamethods.__for = function(self, body)
	return quote
		[ body(`self.n) ]
		[ body(`self.n) ]
	end
end

terra this_is_called_twice(n: int)
	C.printf'this should only be called once\n'
	return iter{n}
end

terra test()
	for x in this_is_called_twice(5) do
		C.printf('%d\n', x)
	end
end
test()
@aiverson
Copy link
Contributor

I'll take a look. I bet I can fix this quickly.

@aiverson
Copy link
Contributor

aiverson commented Jun 6, 2019

It isn't really a bug with the for iterator. The issue is that you are putting self in more than one place.

struct iter {n: int}

iter.metamethods.__for = function(self, body)
	return quote
                var self_ = [self]
		[ body(`self_.n) ]
		[ body(`self_.n) ]
	end
end

terra this_is_called_twice(n: int)
	C.printf'this should only be called once\n'
	return iter{n}
end

terra test()
	for x in this_is_called_twice(5) do
		C.printf('%d\n', x)
	end
end
test()

Should not exhibit the problem.

It is using a splice in more than one spot of the same expression, which is the expression from the for loop. I can modify the code responsible to synthesize a caching variable inside the forlist handler in the compiler, and have a way to do that if you would like.

@aiverson
Copy link
Contributor

aiverson commented Jun 6, 2019

It's unexpected and undocumented behavior, so I consider it a bug, but I think that it could be fixed either by documenting it or by changing the behavior. I've been writing code to change the behavior, but I'm willing to listen if people think it should be left as is and documented.

@capr
Copy link
Member Author

capr commented Jun 7, 2019

I vote strongly for changing the behavior.

@elliottslaughter
Copy link
Member

Not that I disagree, but just to play devil's advocate, why is the double evaluation not intended behavior? With any other sort of macro, you'd expect to get the expression as an argument directly and it would be up to the user to manage whether it should be evaluated once or multiple times.

Are there any scenarios that can't be implemented if we get rid of the ability to do multiple evaluation here?

@OvermindDL1
Copy link

The double evaluation seems entirely intended in the OP's example, it is specifically embedding `self twice into the returned the quote. This matches how every macro system I've ever seen works from lisp to racket to rust to elixir to even C++ template work, and changing that to be single bound implementations means you can no longer finely control when and where it gets expanded. Like if it gets expanded into both conditionals of an if branch then it absolutely should get fully expanded into both.

@capr
Copy link
Member Author

capr commented Jun 10, 2019

nonsense. the number of users that will think that self is a function call will forever be exactly zero.

@OvermindDL1
Copy link

OvermindDL1 commented Jun 10, 2019

Never said it was a function call, it is the passed in ast itself. It's like this:

call_thrice = function(expr, body)
	return quote
		[ body(`expr) ]
		[ body(`expr) ]
		[ body(`expr) ]
	end
end

I would absolutely expect expr to be placed in the resultant AST 3 times. If I didn't want it to be than I'd put the ast in one place instead and reference whatever it ends up expanding to:

call_thrice = function(expr, body)
	return quote
        var e = [ `expr ]
		[ body(`e) ]
		[ body(`e) ]
		[ body(`e) ]
	end
end

It is not a function call, it is putting the ast in position. The self in the argument of the OP iter.metamethods.__for = function(self, body) is not referencing a self object or value, it is an AST node, and that is badly named. I personally would have named such a function's arguments as iter.metamethods.__for = function(self_expr1ast, body_expr0ast), which makes it plainly obvious what it is. (And yes I do use a pattern like this in other language macro work.)

@OvermindDL1
Copy link

OvermindDL1 commented Jun 10, 2019

To be more specific, in the OP function of:

iter.metamethods.__for = function(self, body)
	return quote
		[ body(`self.n) ]
		[ body(`self.n) ]
	end
end

This would more directly be like doing this in a terra-ish pseudo-code where the AST is built manually:

iter.metamethods.__for = function(self, body)
    AST.block([
      AST.call(body, AST.field_get(self, AST.symbol("n"))),
      AST.call(body, AST.field_get(self, AST.symbol("n")))
    ])
end

As can be seen, any and all uses of ` in a terra quote body expands to putting a literal AST node in-place in the defined AST. It is not a value relation, it is quite literally like putting a variable in a list like [n, n], you wouldn't expect it to put just one n, you'd expect it to put both n's, just as you'd expect the ast value to be put in place where it is stated to be put in place.

@capr
Copy link
Member Author

capr commented Jun 10, 2019

right, now self should change to justify some nonsense theory about why it is the way it is. man, i'm outta here.

@elliottslaughter
Copy link
Member

Folks, please calm down.

It seems at least part of the confusion here is that __for is implicitly a macro. (I believe this is also true of __cast and some other metamethods.) Double evaluation is a fact of life with macros. In fact, there are certain patterns you simply couldn't implement without it, like creating your own control flow constructs.

Having said that, I think there is an argument here for __for (and possibly some of the other metamethods) being more restricted than general-purpose macros. For example, I haven't been able to think of any cases where we'd really want double evaluation on the self parameter in __for. @OvermindDL1 if you can think of any please post them.

The main thing I can see is that forcing single-evaluation of self in __for also takes the order of evaluation out of the hands on the user. For example, suppose there was a case where the user needed to run some setup code before evaluating the iterator expression:

iter.metamethods.__for = function(self, body)
	return quote
                -- do some setup....
		[ body(`self.n) ]
		-- rest of loop
	end
end

If you force self to be single-evaluated, you just can't express this pattern.

Are there any cases where forcing self to be evaluated leads to a loss of l-val-ness? That would be the other thing I'd be worried about, but perhaps it's not an issue.

Anyway, these are the sorts of questions we have to answer if we're going to change this. I'm not saying we can't justify any changes to the behavior of the language, but I at least want to do it with a full awareness and evaluation of what we're doing and what what will or will not enable users to do.

@elliottslaughter
Copy link
Member

I should note: one possible solution that would be strictly speaking a breaking change, but would arguably be more intuitive would be to require such functions to be marked as macros:

iter.metamethods.__for = macro(function(self, body)
	return quote
                -- do some setup....
		[ body(`self.n) ]
		-- rest of loop
	end
end)

At least that would make it obvious that it's behaving like a macro. As things are now, this exists in a sort of awkward in-between space where it's not completely obvious how it should behave.

@OvermindDL1
Copy link

For example, I haven't been able to think of any cases where we'd really want double evaluation on the self parameter in __for. @OvermindDL1 if you can think of any please post them.

If __for specifically I can see one thing where multiple evaluation would be valuable and that would be where self is a coroutine that needs to be called on each iteration to update the state of it, however it would make sense (if well documented) that self is an ast value holder of the ast instead of the ast itself (no double expansion then) to act as the for that people from languages like python or lua would expect, though it would make it inconsistent with other macro's (and I'm not a fan of inconsistencies).

A more convoluted use for double evaluation would be a resettable for loop, which is entirely possible and plausible, though not necessarily likely I would think.

The main thing I can see is that forcing single-evaluation of self in __for also takes the order of evaluation out of the hands on the user. For example, suppose there was a case where the user needed to run some setup code before evaluating the iterator expression:

This on the other hand could be quite an issue, you may not want to call it at all if some test fails for example.

Are there any cases where forcing self to be evaluated leads to a loss of l-val-ness? That would be the other thing I'd be worried about, but perhaps it's not an issue.

As stated above, a coroutine version would need to be called repeatedly, though depending on implementation that could be worked around via a function pointer instead (though that's adding a needless indirection that llvm 'might' be able to optimize out in some/most cases).

Anyway, these are the sorts of questions we have to answer if we're going to change this. I'm not saying we can't justify any changes to the behavior of the language, but I at least want to do it with a full awareness and evaluation of what we're doing and what what will or will not enable users to do.

My main thing is to keep consistency, for that reason alone I'd keep macro's always acting as macro's, thus including taking in the full AST at each argument position as well.

but would arguably be more intuitive would be to require such functions to be marked as macros:

That would be quite useful, but then it should be able to work with a non-macro version as well (though if you can craft anything useful using that I'm not sure).

Plus I'd want it to be ubiquitous, all macro's, which definitely forces some needless overhead where it is not at all warranted.

At least that would make it obvious that it's behaving like a macro. As things are now, this exists in a sort of awkward in-between space where it's not completely obvious how it should behave.

I'd think the macro should be part of the name definition then, like instead of __for it should be __formacro and __for should print a deprecated message if assigned to (or just crash, as long as it is noticed early and fast).

@capr
Copy link
Member Author

capr commented Jun 11, 2019

The lengths some people will go to justify their nonsense... now apparently __for too is a problem because... wait for it: coroutines, yeah, that's the ticket! Damn, I'm out of popcorn.

Seriously though, how can we talk about a "breaking change" and "deprecating" something that isn't even documented and was discovered by users by sifting through the test files?

Also, it baffles me that the user perspective is so much ignored here because no user in no language when they see for v in f() do do they ever think that f() might be called twice. It's just common sense.

Luckily, as with most of these warts, there's an easy workaround, in this case var self = self followed by a comment with a link to this thread to explain the weird occurrence.

@OvermindDL1
Copy link

coroutines

Terra is a language for making languages, you can't rule out something just because you will not use it, especially such basic constructs in a huge variety of languages as that.

Also, it baffles me that the user perspective is so much ignored here because no user in no language when they see for v in f() do do they ever think that f() might be called twice. It's just common sense.

Then implement __for correctly for your use-case, you are the one implementing it after all.

@elliottslaughter
Copy link
Member

I'd just like to remind everyone to be respectful. Attacks on other people or belittling their ideas is not ok. You can phrase the substantive parts of your arguments without resorting to that. I do not have time or energy to moderate individual comments, but I will lock the discussion if people can't stay civil.

@elliottslaughter
Copy link
Member

Coroutines are a valid use case for Terra, though they're enough of a niche that I don't mind too much if they require more effort to implement, as long as we aren't making anything outright impossible. So my questions here are mostly around whether there's anything we're going to just make impossible if we change the semantics of __for.

@OvermindDL1 It sounds like you want self to be essentially a closure? Is there any reason that can't be captured by e.g. a function pointer and a heap allocation? I was slightly confused by the AST value holder discussion---are you asking for a code block? That sounds like something beyond what the language would support today.

It sounds to me like closures/coroutines are a case of mutable state, which is something you might want anyway. E.g. it seems completely reasonable for the user to write:

var iter = make_iterator(...)
for x in iter do
  if some_condition then break end
end
for x in iter do
  ...
end

And expect the iterator to resume rather than starting over in the second loop. This means that we can't just naively do var self = self at the top, because you'd end up with different l-values for the different selves. Instead you'd have to do something like var self_ref = &self, except that I think that's not valid in all situations (e.g. if self is an r-value, you really just want to take it and stuff it in a local variable). So there is some magic that would need to go in here, but it does seem plausible that it could be done---the compiler should have enough information at this point to do the necessary transformation.

Is it sufficient to just preserve l-val-ness of self? This enables cases where self is mutable but not cases where self is evaluated multiple times. Note that even if self is only evaluated once, it can still evaluate to a closure whose body can be called multiple times; it's really only the expression that defines the self value which cannot be called multiple times (under the hypothetical semantics).

So the proposal would be:

  • Capture the value of self into a variable (in the compiler, so the implementer of __for doesn't have to worry about it), taking special care to preserve l-val-ness so that self can be mutable.
  • If the __for implementation needs to evaluate some aspect of self multiple times they can capture that in a closure or similar and call the closure multiple times.
  • For most other use cases it's expected that people who implement __for won't need to think about it, and they'll get single-execution semantics for self.
  • Strictly speaking, any use case that relies on multiple evaluation of self will break under this semantics.

Is there anything that would break outright, or be substantially more difficult or less intuitive to implement, under this semantics?

@OvermindDL1
Copy link

OvermindDL1 commented Jun 11, 2019

@OvermindDL1 It sounds like you want self to be essentially a closure? Is there any reason that can't be captured by e.g. a function pointer and a heap allocation? I was slightly confused by the AST value holder discussion---are you asking for a code block? That sounds like something beyond what the language would support today.

Ah, sorry, terminology from other languages I've carried over. ^.^;

I guess the most 'quick' way to example this would be using a lisp'y macro style since it is super simple and the most well known, thus given:

(defmacro my-macro (a b c)
  `(if ,a
    (let (_ (format t "This is truthy result: ~D~%" ,b)
      ,b)
    (let (_ (format t "This is false result: ~D~%" ,c
      ,c)
  )
)

Which in lua would be similar to:

function my_macro(a, b, c)
  "if (" .. a .. ") then\n" ..
  "  print(\"This is truthy result: \" .. (" .. b .. ")\n" ..
  "  return (" .. b .. ")" ..
  "else\n" ..
  "  print(\"This is false result: \" .. (" .. c .. ")\n" ..
  "  return (" .. c .. ")" ..
  "end\n"
end

Or if lua actually had string interpolation:

function my_macro(a, b, c)
  [[
  if ($a) then
    print("This is truthy result: " .. ($b))
    return ($b)
  else
    print("This is false result: " .. ($c))
    return ($c)
  end
]]
end

Which in terra would be wrapped as a macro instead:

local my_macro = macro(function(a, b, c) ... end)

Upon which if something of the above was executed like a macro, perhaps like:

my_macro(true, print("hello"), print("goodbye"))

Then it would print:

hello
This is a truthy result: nil
hello

Actually, wasn't there an example in the terra docs that described exactly how/why this works like this, checking....

Ah yep! At: http://terralang.org/getting-started.html#macros

local times2 = macro(function(a)
    -- a is a Terra quote
    return `a + `a
end)

terra doit()
    var a = times2(3)
    -- a == 6
end

Unlike a normal function, which works on Terra values, the arguments to Terra macros are passed to the macro as quotes.

Since macros take quotes rather than values, they have different behavior than function calls. For instance:

var c = 0
terra up()
    c = c + 1
    return c
end

terra doit()
    return times2(up()) --returns 1 + 2 == 3
end

The example returns 3 because up() is evaluated twice

Quite literally expanding a macro is like interpolating the arguments into position like via string interpolation

So the normal macro terminology of "AST Value" in terra would be a "Terra Quote", they represent the 'ast' of the expression passed in, and they are not the AST itself, and thus interpolating those into position in a quote, just like $ in perl or many other languages will interpolate the expression into the string, like "blah $s $s" or "blah ${"hello world"}", that is precisely what the ` is doing in terra, doing:

return `a + `a

From within a macro context is quite literally executing the expression after the ` (in this case just the variable 'a', which holds an AST Value / Terra Quote) and putting it into the position of the defined AST Value / Terra Quote that is currently being built, no different then referencing it in a list, like the above return could be something like this in pseudo-terra:

Call(Symbol("return"), Call(Symbol("+"), a, a))

The a's are literally bit inline in position, they are 'interpolated' into the AST Value / Terra quote. :-)

As for the It sounds like you want self to be essentially a closure? Is there any reason that can't be captured by e.g. a function pointer and a heap allocation? part, that all depends on the implementation, internally it might very well be, but you may not want it to be as that is a needless call indirection where a short expression put inline multiple times can be significantly cheaper than wrapping it up into another function call, especially if the function pointer is not statically known (like it was passed into a function using this, where passing in an ast value at compile-time makes it essentially free at that point).

Capture the value of self into a variable (in the compiler, so the implementer of __for doesn't have to worry about it), taking special care to preserve l-val-ness so that self can be mutable.

That makes it inconsistent with other macro's in terra however.

If the __for implementation needs to evaluate some aspect of self multiple times they can capture that in a closure or similar and call the closure multiple times.

That would be a needless cost however, requiring to allocate another function and pass in a pointer to it when it could be something as trivial as a logging expression that would be far better suited inline, especially if it wants to pull in line information.

For most other use cases it's expected that people who implement __for won't need to think about it, and they'll get single-execution semantics for self.

If __for is made to not follow normal macro semantics that exist everywhere else in terra, then it needs some way to override it, perhaps a __formacro to bring back the full normal macro context.

Strictly speaking, any use case that relies on multiple evaluation of self will break under this semantics.

This is the significant inconsistency. Not only this but if the for will compile differently based on the expression passed in then this will entirely break this, such as:

- Iterator style
for x in gen_iter(thing) do
 do_thing(x)
end

-- Filter pipeline style
for x in (x = gen_iter(thing), mod(x, 2) ~= 0, x = x / 2, x) do
  do_thing(x)
end

Which each essentially could compile to this lua code conditionally (though for llvm mapping you'd do it more efficiently, but meh), based on the expression passed in:

-- Iterator style
iter = gen_iter(thing)
while iter.valid() do
  x = iter.value()
  do_thing(x)
  iter.next()
end

-- Filter pipeline style
iter = gen_iter(thing)
iter.filter(function(x) return mod(x, 2) ~= 0 end)
iter.map(function(x) return x / 2 end)
iter.each(function(x) do_thing(x) end)

And yes this is almost exactly what some languages do, like Elixir (and via a different syntax with erlang, python, etc...) with "for comprehensions", they generate specialized code based on the expression that is passed in via a for macro. By having access to the raw expression ast itself and decomposing it then you can generate exceedingly specialized and optimized code. :-)

Is there anything that would break outright, or be substantially more difficult or less intuitive to implement, under this semantics?

So yes, being able to generate specialized code based on precisely the expression that is passed in would become fairly impossible.

@elliottslaughter
Copy link
Member

One thing to watch out for with Terra macros (particularly compared to Lisp) is that the macro arguments must parse as valid Terra code and (at least in the case of __for) must pass the type checker. Furthermore, self is forced to be an expression with a valid Terra type. Otherwise there would be no way for Terra to dispatch to the appropriate definition of __for. (I'm not 100% whether these restrictions apply to normal macros, I'd have to go and play with it to be sure.)

__for and other cases like __cast are in an odd position because not only do they not get wrapped in macro() they also do not necessarily behave like macros. This is for sure the case with __cast, and it is also potentially the case with __for (but I'd have to check to be sure). This is without discussing any semantic changes to the language.

The type checking requirement on __for's self argument makes the implementation of something like your filter pipeline style more challenging (though perhaps not impossible). Among other things x is not in scope at the point where that expression is being defined. Also any nested macros (e.g. if gen_iter is a macro) would get expanded first, so __for would end up holding the generated code. (Again, speaking without any semantic changes to the language.)

I do think comprehensions could be interested, but we'd be talking about a separate language extension in that case, which at least on the face of it seems like it could play well with the semantics I proposed above.

By the way, there is a fairly straightforward way in the existing Terra semantics to have code expand without resulting in any function calls. Methods on Terra objects can be defined by macros, so if you have an iterator API you can do something like:

(This code is super ugly and for demonstration purposes only. Don't do this at home! 😜)

local gen_iter = macro(function(acc, init, update, cond)
  local struct st {}
  st.methods.init = macro(function(self)
    return quote
      acc = init
    end
  end)
  st.methods.update = macro(function(self)
    return quote
      acc = update
    end
  end)
  st.metamethods.__for = function(self, body)
    return quote
      self:init()
      while cond do
        [body(acc)]
        self:update()
      end
    end
  end
  return `st {}
end)

local c = terralib.includec("stdio.h")

terra f()
  var x : int64
  for i in gen_iter(x, 0, x + 1, x < 10) do
    c.printf("i: %lld\n", i)
  end
end
f()

I think none of the above code would break under my proposed semantics; it would still print 0 to 9 even if the generator expression is cached in a variable.

@elliottslaughter
Copy link
Member

Actually, a perfect example of how __for is not a macro is right in that code sample: body is not a Terra quote, it's a Lua function that takes the loop variable as a parameter and returns a Terra quote.

@OvermindDL1
Copy link

One thing to watch out for with Terra macros (particularly compared to Lisp) is that the macro arguments must parse as valid Terra code and (at least in the case of __for) must pass the type checker. Furthermore, self is forced to be an expression with a valid Terra type. Otherwise there would be no way for Terra to dispatch to the appropriate definition of __for. (I'm not 100% whether these restrictions apply to normal macros, I'd have to go and play with it to be sure.)

This is pretty common on other languages, like typed-racket (a typed lisp'y style language) must eventually typecheck, or a couple of OCaml macro system that must remain lexically correct in its AST on the whole path.

Among other things x is not in scope at the point where that expression is being defined.

The internal x is local to within the expression only, this is a common requirement of this style in other languages.

Also any nested macros (e.g. if gen_iter is a macro) would get expanded first

I've noticed that, it's a bit odd. All other languages with macro systems that I can think of have the AST sent to the outer-most macro first, unexpanded, and it can of course call a function to expand a given AST Element, but when returned then any macro's remaining are then expanded, so this was a bit limiting for me in creating a DSEL a couple times. It really might be good to model other languages macro systems more as they are exceptionally well tested.

I do think comprehensions could be interested, but we'd be talking about a separate language extension in that case, which at least on the face of it seems like it could play well with the semantics I proposed above.

Built in comprehensions would be amazing to have! I have my own hacky version, but it works. ^.^;

I think none of the above code would break under my proposed semantics; it would still print 0 to 9 even if the generator expression is cached in a variable.

Yeah a proxy object would be a usual way to solve that.

Actually, a perfect example of how __for is not a macro is right in that code sample: body is not a Terra quote, it's a Lua function that takes the loop variable as a parameter and returns a Terra quote.

I have found that odd yeah, lol. It seems obvious to me that the purpose of that is to inject the hygenic binding in to it. There are other languages that use similar styles though, both the all common scheme supports such macro's as does Elixir, both via a function hoist or via AST scope transformation (AST scope transformation I always opt for on these, it keeps the AST introspectable while still allowing for being able to inject hygenic bindings).

@elliottslaughter
Copy link
Member

elliottslaughter commented Jun 12, 2019

This is the point where we really need to pick @zdevito's brain, since I was not around when these decisions were being made.

At any rate, we've strayed pretty far off topic at this point. If there is further discussion about macros in general let's split that into a separate issue.

@aiverson
Copy link
Contributor

aiverson commented Jun 12, 2019

One thing that has been a pain point for me has been how some metamethods are required to be macros and others are required not to be. I'd like to work to unify that if possible, or at least to document it.

Using a coroutine as the iterator as described previously is an incorrect usecase. The function call that produces the iterator wouldn't be the coroutine, the coroutine would be inside the iterator. The expression that evaluates to self would get called once, and it would produce an iterator with a coroutine in it. The coroutine would then be called multiple times inside the loop.

This change causes no loss of functionality in the for loop, it just makes a bug not happen by accident.
If something really needs to happen multiple times inside the loop, then the function can produce an iterator encapsulating that with no difficulties.

@aiverson
Copy link
Contributor

Also, I have a working implementation of this on my fork ready to be merged so you can test it out and play with it.

@aiverson
Copy link
Contributor

It looks like this might run into problems for mutable references of things it is looping over. The fix to this bug breaks a behavior depended on by an existing test, which may break code in the wild. I think that making this change is still the right move, but it may need to be delayed to a release and bundled with other changes.

Writing a macro method allows making the desired behavior on the existing test work, but given the case of the existing test, I'm seeing that additional sophistication in the loop expansion I wrote is necessary to avoid introducing an unnecessary copy in some cases.

@elliottslaughter
Copy link
Member

I do think supporting mutable references is important enough that it's worth trying to avoid breaking. How much work do you think it'll be? It seems hacky, but it may be sufficient to do a pattern match on the kind of AST being generated (since if I'm thinking clearly only certain kinds of ASTs can lead to mutable references).

@aiverson
Copy link
Contributor

The broken test in question is forlist2.t. It creates pointers into the thing it is looping over to allow the slot in the array to be mutated. The change I made, unconditionally caching the value, breaks this because it creates a copy of the array as the loop iteration variable which makes the pointers no longer point to the original array when they are mutated.

Inserting a check to determine whether the expression produced by the loop expression is an lvalue or an rvalue to determine whether to cache or enreference and cache for mutable reference purposes is not too difficult now that I know more about what I am doing. There is existing code for checking lvalue vs rvalue already, I would just need to use it, possibly with a minor bit of tidying.

However, it would be nice to not have to remember whether something is enreferenced in the specific iteration implementation provided by the iterable type. It would be convenient to just be able to name the variable directly for a read rather than needing to possibly dereference the loop variable. That change would be larger and might involve adding something akin to a C++ transparent reference to the language from scratch. I'm pretty sure I can avoid that by taking a cue from the CL setf macro to ensure the variables in the for loop are mutable when the loop iteration expression is an lvalue.

And I think we still need to add proper mutability/immutability to the language, which is another can of worms.

@elliottslaughter
Copy link
Member

It's been a while since we've discussed this issue. In order to help break the deadlock, I'd like to suggest that we accept #377 pending two tests that need to be fixed, forlist2.t, and a new one which tests mutability of the iterator itself, forlist4.t (which I have attached in #377 (comment)).

The rationale for merging this is:

  • The standard semantics for a for loop is that the bounds are evaluated before the execution of the loop and in fact even before the loop iterator variable is in scope. Even if the iterator itself is customized/customizable, users wouldn't expect it to modify the basic properties of the loop iteration itself.
  • Despite the name, __for is really the iterator metamethod. Therefore it should handle iteration, not evaluation of the loop bounds. It would be like if the if statement had an optional metamethod __bool which converted an object of arbitrary type into a boolean. For such a metamethod to do multiple evaluation would be really bizarre.
  • __for cannot be a true macro, because body is a function, not a quote. body has to be a function, because otherwise it would introduce unnecessary copying and make mutability of the iterand (the value referred to by the loop variable i) impossible.
  • If forlist4.t is fixed (and I think there is a reasonable path to doing this), the proposed mechanism will not interfere with mutability of the iterator itself.
  • At any rate, while there are things you could clean up in the language, it's not worth delaying valuable contributions based on some ideological ideal, especially given the amount of volunteer effort we actually have available to do such things. Assuming the two issues above are fixed, PR isolating changes for the for-in variable synthesis #377 seems like a clear win to me.
  • As for some of the more creative things discussed above, if you really want to change the semantics of for it really seems like a better approach is to introduce novel syntax. We could e.g. allow the use of language extensions inside Terra, as long as such extensions return a Terra quote as their result. But that would be a separate issue from what we're discussing here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants