Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework of parsing the header of a replay #143

Merged
merged 26 commits into from
May 18, 2024
Merged

Conversation

Garanas
Copy link
Member

@Garanas Garanas commented May 7, 2024

In this pull request we finalize the new setup to process replays. As described in #140 , this is the final step towards separating the various phases of the parsing. With these changes we separate the following phases for the body of the replay:

  • (1) Token phase
  • (2) Parsing phase
  • (3) Semantics (interpretation) phase

New pull requests can now create additional semantics without being troubled by the token and parsing phases.

@Garanas Garanas changed the title Rework parsing the header of a replay Rework of parsing the header of a replay May 7, 2024
@Garanas
Copy link
Member Author

Garanas commented May 7, 2024

@Sheikah45 How would I verify various properties such as the token length being equal to the event length when those properties are not exposed?

@Sheikah45
Copy link
Member

Sheikah45 commented May 7, 2024

I would ask if the bytes aren't exposed why do they need to be verified by a test at all?

Wouldn't it be equivalent to just check the events directly?

And in a test you know the tokens should be equal to the events so you just need to assert the number of events is as expected for a replay file.

Also as a side note, I don't think the records that wrap a single class are necessary, unless you have plans to extend them later on but from what I can see that doesn't seem to be the case.

@Garanas
Copy link
Member Author

Garanas commented May 7, 2024

Also as a side note, I don't think the records that wrap a single class are necessary, unless you have plans to extend them later on but from what I can see that doesn't seem to be the case.

Are you referring to a record such as GameOption? What would you replace them with?

@Sheikah45
Copy link
Member

I meant ReplayBody sorry for not clarifying

@Sheikah45
Copy link
Member

Also as a side note we don't generally use annotations such as @ Contract since the upkeep of those isn't necessarily worth the benefit from the ide analysis for our repos

@Garanas
Copy link
Member Author

Garanas commented May 8, 2024

I meant ReplayBody sorry for not clarifying

I understand, it's primarily to make it conceptually clear. I'll remove it again

@Garanas
Copy link
Member Author

Garanas commented May 8, 2024

I would ask if the bytes aren't exposed why do they need to be verified by a test at all?

To verify everything is correct from the bottom up. Instead, I've introduced conditions that throw an exception when things are off during parsing. That also ironically forced me to fix two bugs, so that's nice 😃

@Garanas
Copy link
Member Author

Garanas commented May 8, 2024

I'd like to add some more replays (5 from co-op, 5 from faf, 5 from fafdevelop, etc) to more solidify the current setup. I'll also test other properties such as the number of events of a given type, to catch regression if more changes are made in the future.

Would that test the feature sufficient?

@Sheikah45
Copy link
Member

Why do we need 5 of each replay?

That seems like a bit much given they don't contain very different structures

@Sheikah45
Copy link
Member

I would normally think even just one test for parsing is sufficient given it is not a stochastic process and they don't contain unknown events

Comment on lines 181 to 182
int sizeGameOptionsInBytes = dataStream.readInt();
this.gameOptions = ((Map<String, Object>) parseLua(dataStream)).entrySet().stream().filter(entry -> "Options".equals(entry.getKey())).flatMap(entry -> ((Map<String, Object>) entry.getValue()).entrySet().stream()).map(entry -> new GameOption(entry.getKey(), entry.getValue())).collect(Collectors.toList());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leave this wrapped as it is more readable

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the formatter!! 😛

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Processed in:

Comment on lines 21 to 73
public enum AutoTeams {
none("None"),
manual("Manual"),
tvsb("Top versus bottom"),
lvsr("Left versus right"),
pvsi("Even versus uneven");

private final String string;

AutoTeams(String string) {
this.string = string;
}
}

public enum TeamLock {
locked("Locked"), unlocked("Unlocked");

private final String string;

TeamLock(String string) {
this.string = string;
}
}

public enum TeamSpawn {
fixed("Fixed"),
random("Random"),
balanced("Balanced"),
balanced_flex("Flexible balanced"),
random_reveal("Random and revealed"),
balanced_reveal("Balanced and revealed"),
balanced_reveal_mirrored("Mirror balanced and revealed"),
balanced_flex_reveal("Flexible balanced and revealed");

private final String string;

TeamSpawn(String string) {
this.string = string;
}
}

public enum Victory {
demoralization("Assasination"),
domination("Supremacy"),
eradication("Annihilation"),
sandbox("Sandbox");

private final String string;

Victory(String string) {
this.string = string;
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Java enums should be SCREAMING_SNAKE_CASE and AutoTeams can be the full name for the enum

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also the strings don't have getters so aren't accessible outside the enum

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would I then match the game options with the enums? At the moment they are demoralization, domination, etc which I think I can immediate map one-to-one to an enum.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You would add a findByValue method that takes in a string and returns the enum that matches the string you define as a member of the enum.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noting that the enum names should be normalized

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Processed in cce2ee3

@Garanas
Copy link
Member Author

Garanas commented May 9, 2024

I would normally think even just one test for parsing is sufficient given it is not a stochastic process and they don't contain unknown events

Not all replays have all information in them however. I think I've covered almost all of the cases at this moment

@Garanas
Copy link
Member Author

Garanas commented May 9, 2024

With d0e9566 we introduce the first semantics: applying the tick and the client to an event! It also casually shrinks the number of events of a large replay from ~130k to ~15k, which makes it much more feasible to process too 😃

Comment on lines 20 to 21
final int[] tick = {-1};
final int[] clientId = {-1};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would just use an AtomicInteger here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Processed in 0f0aca2

Comment on lines 52 to 56
case Event.LuaSimCallback(
String func, LuaTable parametersLua, Event.CommandUnits commandUnits
) when func.equals("GiveResourcesToPlayer") -> {
yield switch (parametersLua) {
case LuaTable.Table callbackTable -> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need the internal switch. If you type parametersLua as LuaTable.Table the case will only match when that is the case

Suggested change
case Event.LuaSimCallback(
String func, LuaTable parametersLua, Event.CommandUnits commandUnits
) when func.equals("GiveResourcesToPlayer") -> {
yield switch (parametersLua) {
case LuaTable.Table callbackTable -> {
case Event.LuaSimCallback(
String func, LuaTable.Table callbackTable, Event.CommandUnits commandUnits
) when func.equals("GiveResourcesToPlayer") -> {

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And since this is only one case label we can just use an if (!( instanceof))

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the internal switch in 7d4ef66

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know how to rewrite the instance check and also keep the when statement, or are those two separate if statements?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You would just change it from when to a normal && condition

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather keep the current format. It makes it consistent with the other semantic-related functions. If you want me to change it anyway then I'll do that; but I think consistency is nice.

@Garanas Garanas marked this pull request as ready for review May 11, 2024 07:53
@Garanas Garanas requested review from Sheikah45 and Brutus5000 May 11, 2024 07:56
@Garanas
Copy link
Member Author

Garanas commented May 11, 2024

This pull request is ready for feedback - functionally it is where I'd want it to be and further semantics can be introduced as separate pull requests.

I'll only focus on extending the test suite and processing the feedback.

Comment on lines 266 to 268
if(stream.available() > 0) {
throw new EOFException();
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This a little confusing use of exception. typically EOFException means we hit the end of the file when we are expecting more data. Here we throw an exception if there is data left over.

I think probably just an illegal state exception with appropriate message is best

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree it is semantically better, I was searching for a proper exception when writing the code. Processed it in:

Comment on lines 21 to 73
public enum AutoTeams {
none("None"),
manual("Manual"),
tvsb("Top versus bottom"),
lvsr("Left versus right"),
pvsi("Even versus uneven");

private final String string;

AutoTeams(String string) {
this.string = string;
}
}

public enum TeamLock {
locked("Locked"), unlocked("Unlocked");

private final String string;

TeamLock(String string) {
this.string = string;
}
}

public enum TeamSpawn {
fixed("Fixed"),
random("Random"),
balanced("Balanced"),
balanced_flex("Flexible balanced"),
random_reveal("Random and revealed"),
balanced_reveal("Balanced and revealed"),
balanced_reveal_mirrored("Mirror balanced and revealed"),
balanced_flex_reveal("Flexible balanced and revealed");

private final String string;

TeamSpawn(String string) {
this.string = string;
}
}

public enum Victory {
demoralization("Assasination"),
domination("Supremacy"),
eradication("Annihilation"),
sandbox("Sandbox");

private final String string;

Victory(String string) {
this.string = string;
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noting that the enum names should be normalized

Comment on lines 48 to 53





return null;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why all the whitespace? And why return null instead of an empty list?

Copy link
Member Author

@Garanas Garanas May 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not implemented yet. I intend to implement it in a future pull request. I did remove the whitespace and document it in comments:

Comment on lines 61 to 64




Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same whitespace question

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines 73 to 76




Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same whitespace question

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines 5 to 10
public record ReplayHeaderToken(String gameVersion, String replayVersion, String pathToScenario, boolean cheatsEnabled, int seed,
List<Source> sources,
byte[] mods,
byte[] gameOptions,
List<byte[]> playerOptions) {
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you decide to have the header be represented by this more raw value and a more complete value in ReplayHeader? I don't know if there is the same value to having both of these that there is for the body tokens.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was a little bit at a loss here. The header is not as nicely structured as the body of a replay. I did find a few cases where the header has a leading 'size' integer and I interpret those as 'I'll process whatever is in there during parsing'

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is probably better to just parse all the fields directly rather than having two objects since there isn't exactly the same issue with new tokens and the size notation.

Copy link
Member Author

@Garanas Garanas May 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would forfeit the separation of phases; should I then just process/parse everything in the ReplayHeaderParser class?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I would think so. Since it seems like the header is really more like a single set of bytes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Processed in 76615a8

* @param name
* @param sourceId
*/
public record Source(int sourceId, int PlayerId, String name) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

playerId with proper casing

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Processed in:

import java.util.HashMap;
import java.util.Map;

public class Utils {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would probably rename to LoadUtils or ParseUtils

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Processed in 020955d

Comment on lines 26 to 41
final AtomicInteger tick = new AtomicInteger(0);
final AtomicInteger clientId = new AtomicInteger(-1);

return events.stream().map((event) -> switch (event) {
case Event.Advance e -> {
tick.addAndGet(e.ticksToAdvance());
yield null;
}

case Event.SetCommandSource e -> {
clientId.set(e.playerIndex());
yield null;
}

default -> new RegisteredEvent(tick.intValue(), sources.get(clientId.intValue()), event);
}).filter(Objects::nonNull).toList();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at this now I might just change this to a normal enhanced for loop that modifies a list since the lambda body modifies the external state so much. Then just a normal int can be used. But I don't know if it is totally required.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is nice to be consistent with other solutions that we use to map over the events.

@Sheikah45
Copy link
Member

@Brutus5000 don't want to leave you out this time XD

Comment on lines 30 to 38
case Event.Advance e -> {
tick.addAndGet(e.ticksToAdvance());
yield null;
}

case Event.SetCommandSource e -> {
clientId.set(e.playerIndex());
yield null;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use the full pattern matching form here so Event.Advance(int ticksToAdvance)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Processed in 706ad77

Comment on lines 73 to 80
fixed("fixed", "Fixed"),
random("random", "Random"),
balanced("balanced", "Balanced"),
balanced_flex("balanced_flex", "Flexible balanced"),
random_reveal("random_reveal", "Random and revealed"),
balanced_reveal("balanced_reveal", "Balanced and revealed"),
balanced_reveal_mirrored("balanced_reveal_mirrored", "Mirror balanced and revealed"),
balanced_flex_reveal("balanced_flex_reveal", "Flexible balanced and revealed");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These enums should be properly cased as well

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Processed in: bdee610

@Garanas Garanas requested a review from Sheikah45 May 17, 2024 11:20
Copy link
Member

@Sheikah45 Sheikah45 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One comment that could or could not be done really

Comment on lines 61 to 94
if (!(callbackTable.value().get("From") instanceof LuaData.Number from)) {
yield null;
}

// focus army starts is 1-based instead of 0-based, to align it we subtract 1
if (from.value() - 1 <= -2) {
yield null;
}

// TODO: this field has no meaning and can be manipulated, instead use the authorised command source.
// Requires refactoring in the game!
if (!(callbackTable.value().get("Sender") instanceof LuaData.String sender)) {
yield null;
}

// TODO: apparently all players create a sim callback that contains the chat message. This hack is how we skip it,
// Requires refactoring in the game!
if (!Objects.equals(sender.value(), registeredEvent.source().name())) {
yield null;
}

if (!(callbackTable.value().get("Msg") instanceof LuaData.Table msgTable)) {
yield null;
}

// TODO: this is 1 out of the 2 legitimate fields
if (!(msgTable.value().get("to") instanceof LuaData.String msgTo)) {
yield null;
}

// TODO: this is 2 out of the 2 legitimate fields
if (!(msgTable.value().get("text") instanceof LuaData.String msgText)) {
yield null;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically for these you could use the full record pattern. So LuaData.Number(float value)

Just to remove all the calls of .value

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Processed in 1b9492f

@Sheikah45 Sheikah45 merged commit 6e9ba4d into develop May 18, 2024
1 check passed
@Sheikah45 Sheikah45 deleted the rework/replay-header branch May 18, 2024 14:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants