Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify Sequence Type Relationships #206

Open
mahmoudimus opened this issue Aug 31, 2021 · 3 comments
Open

Clarify Sequence Type Relationships #206

mahmoudimus opened this issue Aug 31, 2021 · 3 comments

Comments

@mahmoudimus
Copy link

mahmoudimus commented Aug 31, 2021

In the spec, under Sequence types, it states:

Sequence: a sequence of known length lets us know how many elements it contains without processing them. Examples: dict, list, tuple, but not string or bytes.

However, in the Java implementation, Dict does not implement the Sequence interface. I think this is a mistake in the spec and should be removed:

-Sequence: a sequence of known length lets us know how many elements it contains without processing them. Examples: dict, list, tuple, but not string or bytes.
+Sequence: a sequence of known length lets us know how many elements it contains without processing them. Examples: list, tuple, but not string or bytes.

This exposes a few other questions that the spec mentions:

  • SetIndexable currently does not exist. The Java implementation relies on EvalUtils#setIndex but is hardcoded to only Dict and List. Should there be a StarlarkSetIndexable interface or should there be a patch to the existing StarlarkIndexable, similar to Evalutils#setField? For example:
+  /**
+   * Updates an object as if by the Starlark statement {@code object[key] = value}.
+   *
+   * @throws EvalException if underlying object is immutable.
+   */
+  void setIndex(StarlarkSemantics semantics, Object key, Object value) throws EvalException;
package net.starlark.java.eval;

import java.util.Map;

public interface StarlarkMapping<K, V> extends Map<K, V>, StarlarkSetIndexable, StarlarkIterable<K> {

}

This would also require a change in Starlark#len().

  • Last, but not least, in the Java implementation Sequence extends the java.util.List interface. Are all sequences a sub-type of List? Did we mean Collection ?

Interesting to hear your thoughts on this.

Thanks.

@ndmitchell
Copy link
Contributor

I think these remarks are all very specific to the Java Starlark, and more than that, the internal implementation of Java Starlark., Whereas this repo is only about the things that can be observed by the source Starlark code. For example there isn't a Sequence interface in Starlark, it's just a bunch of types that can be iterated over. Are you saying that in Java Starlark [x for x in {1:1, 2:2}] doesn't work? Similarly for all the other types, are you saying that the Java Starlark implementation doesn't correctly at the Starlark level implement the spec? Or are you saying that it does implement the spec but the spec should change? If it's merely that the Java implementation should change without changing the Starlark visible behaviour, that's something for a different repo.

@mahmoudimus
Copy link
Author

mahmoudimus commented Sep 8, 2021

Hi @ndmitchell thanks for the response.

I think these remarks are all very specific to the Java Starlark, and more than that, the internal implementation of Java Starlark.,

That's a fair point, I'll clarify this further below.

Whereas this repo is only about the things that can be observed by the source Starlark code. For example there isn't a Sequence interface in Starlark, it's just a bunch of types that can be iterated over. Are you saying that in Java Starlark [x for x in {1:1, 2:2}] doesn't work?

While there isn't a Sequence interface, there are a few different Sequence types, also referred to in the spec as "contracts." I think I may be just equating "contract" to mean "interface," but perfectly happy to use "contracts" over interface.

In your example of [x for x in {1:1, 2:2}], that would be of an iterable contract and that's exactly the kind of ambiguity I'm trying to resolve. For example, if the difference between an Iterable and a Sequence is that a sequence is of known length and an Iterable is of unknown length, that makes sense.

Similarly for all the other types, are you saying that the Java Starlark implementation doesn't correctly at the Starlark level implement the spec? Or are you saying that it does implement the spec but the spec should change?

So this is where it is a bit ambiguous and I'm asking if there needs to be clarification. Take for example the semantics behind a slice expression. What contracts must a particular type satisfy for a slice expression to work? The spec says that they work on an "indexable sequence". Does that apply to all types that satisfy an "indexable" and a "sequence" contract?

Python's collections.abc has a nice table that helps highlight all the operations available to a particular contract. I'm trying to understand if it makes sense to propose (and that's really the purpose of this Github issue) that the Starlark spec follows similar naming conventions and map the appropriate operations to the corresponding Starlark types.

So, something like:

  • List: Satisfies Iterable, Indexable, SetIndexable, Sequence
  • etc. etc.

If it's merely that the Java implementation should change without changing the Starlark visible behaviour, that's something for a different repo.

Of course, I think I was just trying to start a conversation to centralize the discussion before suggesting anything on the bazel repo itself. If it still is too Java implementation specific, I'm happy to close this issue and move it to that repo instead.

@ndmitchell
Copy link
Contributor

Starlark only has three compound types - lists, tuples and dictionaries, with strings being a kinda-compound type. I think it's probably easier to remove all ambiguity from the spec and just say "slices work for lists, tuples and strings", "indexing works for lists, tuples, strings and dictionaries", "iteration works for lists, tuples and dictionaries". Given there are only 4 types, it's not too much to type and is 100% explicit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants