This repository has been archived by the owner on Oct 23, 2024. It is now read-only.

v1.5.0

@timcharper timcharper released this 12 Sep 15:03
· 1207 commits to master since this release

Changes from 1.4.x to 1.5.0

Recommended Mesos version is 1.3.0

Breaking Changes

Packaging standardized

We now publish more normalized packages that attempt to follow Linux Standard Base Guidelines and use sbt-native-packager to achieve this.
As a result of this and the many historic ways of passing options into Marathon, we will only read /etc/default/marathon when starting up.
This file, like /etc/sysconfig/marathon, has all Marathon command line options as "MARATHON_XXX=YYY", which will translate to --xxx=yyy.
We no longer support /etc/marathon/conf which was a set of files that would get translated into command line arguments. In addition,
we no longer assume that if there is no zk/master argument passed in, then both are running on localhost.
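The naming rule described above can be sketched in a few lines of Python. This is only an illustration of the documented mapping (strip the MARATHON_ prefix, lowercase the remainder); it is not the startup script shipped in the packages, and the function name env_to_flags is hypothetical.

```python
def env_to_flags(env):
    """Translate MARATHON_* variables into --flag=value arguments.

    Sketch only: assumes the flag name is the variable name without the
    MARATHON_ prefix, lowercased, as described in the text above.
    """
    prefix = "MARATHON_"
    flags = []
    for name, value in sorted(env.items()):
        # Skip anything that is not a MARATHON_* option.
        if name.startswith(prefix) and len(name) > len(prefix):
            flags.append("--{}={}".format(name[len(prefix):].lower(), value))
    return flags


if __name__ == "__main__":
    example = {
        "MARATHON_MASTER": "zk://zk1:2181/mesos",      # example value only
        "MARATHON_MESOS_BRIDGE_NAME": "mesos-bridge",  # example value only
        "PATH": "/usr/bin",                            # ignored: no MARATHON_ prefix
    }
    print(env_to_flags(example))
```

Sorting the variables keeps the generated argument list stable between runs, which makes it easier to compare against what is actually passed to Marathon.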

If support for any of the above is important to you, please file a JIRA and/or create a PR/Patch.

App JSON Fields Changed or Moved.

Marathon will continue to accept the app JSON as it did in 1.4;
however, applications that use deprecated fields will be normalized into a canonical representation.
The app JSON generated by the /v2 REST API has changed: only canonical fields are generated.
The App RAML specification is the source of truth with respect to deprecated fields.
The following deprecated fields will no longer be generated for app JSON:

  • ipAddress
  • container.docker.portMappings
  • container.docker.network
  • ports
  • uris

Marathon clients that consume these deprecated fields will require changes.
In addition, new networking API fields have been introduced:

  • networks
  • container.portMappings

The networks field replaces the ipAddress.networkName and container.docker.network fields, and supports joining an app to multiple container networks.
The legacy IP/CT API did not require a resolvable network name in order to use a container network;
it allowed both an app definition to leave ipAddress.networkName unspecified and the operator to leave --default_network_name unspecified.
Starting with Marathon v1.5 such apps will be rejected: apps may leave networks[x].name unspecified for container networks only if --default_network_name has been specified by the operator.
Marathon injects the value of --default_network_name into unnamed container networks upon app create/update.
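The rule above can be illustrated with a small Python sketch. It is not Marathon's implementation; it assumes each entry in networks is an object with a "mode" key and an optional "name" key, and that container networks use the mode "container". The function name is hypothetical.

```python
import copy


def inject_default_network_name(app, default_network_name=None):
    """Return a copy of `app` with unnamed container networks named.

    Sketch of the rule described above: an unnamed container network
    receives the operator's default name, and the app is rejected when
    no default has been configured.
    """
    result = copy.deepcopy(app)
    for network in result.get("networks", []):
        # Only unnamed container networks need a name injected.
        if network.get("mode") != "container" or network.get("name"):
            continue
        if not default_network_name:
            raise ValueError(
                "container network has no name and no default network name is set"
            )
        network["name"] = default_network_name
    return result


if __name__ == "__main__":
    app = {"id": "/example", "networks": [{"mode": "container"}]}
    print(inject_default_network_name(app, "my-default-net"))
```

Working on a copy keeps the submitted definition untouched, so the original and the normalized form can be compared when debugging a rejected app.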

Upgrading from Marathon 1.4.x to Marathon 1.5.x will automatically migrate existing applications to the new networking API.
Migration of legacy Mesos IP/CT apps may fail if those apps did not specify ipAddress.networkName and there is no default network name specified.
See the networking documentation (docs/docs/networking.md) for details concerning app migration and network API changes.

The old app networking docs have been relocated.
See the networking documentation for details concerning the new API.

Metric Names Changed or Moved.

We moved to a different Metrics library and the metrics are not always compatible or the same as existing metrics;
however, the metrics are also now more accurate, use less memory, and are expected to get better throughout the release.
Where it was possible, we maintained the original metric names/groupings/etc, but some are in new locations or have
slightly different semantics. Any monitoring dashboards should be updated.

Before the 1.5.0 release, we will publish a migration guide describing the new metric formats and where the
replacement metrics can be found.

Artifact store has been removed

The artifact store was deprecated in Marathon 1.4 and is removed in this version.
The command line flag --artifact_store will throw an error if specified.
The REST API endpoint /v2/artifacts has been removed completely.

Logging endpoint

Marathon has the ability to view and change log level configuration during runtime via the /logging endpoint.
This version switches from a form-based API to a JSON-based API, while maintaining the functionality.
We also secured this endpoint, so you can restrict who is allowed to view or update this configuration.
Please see our API documentation for details.

Event Subscribers have been removed.

The events subscribers endpoint (/v2/eventSubscribers) was deprecated in Marathon 1.4 and is removed in this version.
Please move to the /v2/events endpoint instead.

Removed command line parameters

  • The command line flag max_tasks_per_offer has been deprecated since 1.4 and is removed now. Please use max_instances_per_offer.

Deprecated command line parameters

  • The command line flag save_tasks_to_launch_timeout is deprecated and no longer has any effect.

Overview

Networking Improvements Involving Multiple Container Networks

The field networkNames has been added to an app container's ContainerPortMapping and a pod's Endpoint. Using this field, an app or pod participating in multiple container networks can now forward ports by specifying a single item in networkNames. For more information, see the networking documentation.

Additionally, container port discovery has been improved: a pod or app can now specify which container network(s) a port name/protocol/etc. is associated with. Discovery labels are now generated for container networks associated with ports.
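As an illustration of the networkNames field, the Python sketch below builds an app definition that joins two container networks and forwards a port on one of them. The app id, command, network names, and port numbers are made up, and the field layout (networks[].mode, container.portMappings[].networkNames) is an assumption based on the networking documentation rather than something spelled out in these notes. The helper is a client-side sanity check only, not Marathon's validation.

```python
import json

# Hypothetical app definition; all names and values are examples only.
app = {
    "id": "/multi-net-example",
    "cmd": "sleep 3600",
    "networks": [
        {"mode": "container", "name": "net-a"},
        {"mode": "container", "name": "net-b"},
    ],
    "container": {
        "type": "MESOS",
        "portMappings": [
            {
                "containerPort": 8080,
                "hostPort": 0,
                "name": "http",
                "networkNames": ["net-a"],  # single item, as described above
            }
        ],
    },
}


def undeclared_network_names(app_def):
    """List networkNames used by port mappings but not declared in networks."""
    declared = {net.get("name") for net in app_def.get("networks", [])}
    mappings = app_def.get("container", {}).get("portMappings", [])
    return sorted(
        name
        for mapping in mappings
        for name in mapping.get("networkNames", [])
        if name not in declared
    )


if __name__ == "__main__":
    print(json.dumps(app, indent=2))
    print(undeclared_network_names(app))
```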

Mesos Bridge Network Name Configurable

The CNI network used for Mesos containers when bridge networking is in use is now configurable via the command-line argument --mesos_bridge_name. As with other command-line arguments, it can also be specified via an environment variable, MARATHON_MESOS_BRIDGE_NAME.

Backup and Restore Operations

You can now back up and restore Marathon's internal state via the DELETE /v2/leader API endpoint.

See MARATHON-7041

TTY support

You can now specify that a TTY should be allocated for app or pod containers. See the TTY definition. An example can be found in v2/examples/app.json.

See MARATHON-7062

Improved Validation Error Messages

All validation specified in the RAML is now programmatically enforced, leading to more consistent, descriptive, and legible error messages.

Security improvements

Marathon is in better compliance with various security best practices. For example, Marathon no longer responds to directory listing requests.

File-based secrets

Marathon has a pluggable interface for secret store providers.
Previous versions of Marathon allowed secrets to be passed as environment variables.
With this version it is also possible to provide secrets as volumes, mounted under a specified path.
See the file-based secrets documentation.

Changes around unreachableStrategy

Recent changes in Apache Mesos introduced the ability to handle intermittent connectivity to an agent that may be running a Marathon task. This change introduced the TASK_UNREACHABLE task state, which allows a node to disconnect from and reconnect to the cluster without having its tasks replaced. As a result, with default configurations there is a delay of 75 seconds before Mesos notifies Marathon to replace the task. Previously, Marathon usually replaced a lost task in under a second.

It is now possible to configure unreachableStrategy for apps and pods to instantly replace unreachable apps or pods. To enable this behavior, you need to configure your app or pod as shown below:

{
  ...
  "unreachableStrategy": {
    "inactiveAfterSeconds": 0,
    "expungeAfterSeconds": 0
  },
  ...
}

Note: "Instantly" means as soon as Marathon becomes aware of the unreachable task. By default, Marathon is notified after 75 seconds by Mesos
that an agent is disconnected. You can change this duration in Mesos by configuring agent_ping_timeout and max_agent_ping_timeouts.
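The 75-second figure can be reproduced with simple arithmetic. The sketch below assumes the Mesos master defaults are agent_ping_timeout = 15 seconds and max_agent_ping_timeouts = 5; check these against the Mesos version you run. The function name is hypothetical, and the result is an approximation of when Mesos reports the agent as unreachable.

```python
def unreachable_notification_delay(agent_ping_timeout_secs=15,
                                   max_agent_ping_timeouts=5):
    """Approximate seconds before Mesos reports an agent as unreachable.

    Assumption: the delay is the ping timeout multiplied by the number of
    consecutive missed pings that Mesos tolerates.
    """
    return agent_ping_timeout_secs * max_agent_ping_timeouts


if __name__ == "__main__":
    print(unreachable_notification_delay())      # assumed defaults
    print(unreachable_notification_delay(5, 3))  # a more aggressive example setting
```

Lowering either value shortens the delay, at the cost of treating brief network interruptions as agent failures sooner.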

Migrating unreachableStrategy

If you want all of your apps and pods to adopt an UnreachableStrategy that retains the previous behavior, in which instances were immediately replaced, you can opt in to a migration step so that you do not have to update every single app definition.

To change the unreachableStrategy of all apps and pods, set the environment variable MIGRATION_1_4_6_UNREACHABLE_STRATEGY to true, which leads to the following behavior during migration:

When opting in to the unreachable migration step

  1. all app and pod definitions that had a config of UnreachableStrategy(300 seconds, 600 seconds) (previous default) are migrated to have UnreachableStrategy(0 seconds, 0 seconds)
  2. all app and pod definitions that had a config of UnreachableStrategy(1 second, x seconds) are migrated to have UnreachableStrategy(0 seconds, x seconds)
  3. all app and pod definitions that had a config of UnreachableStrategy(1 second, 2 seconds) are migrated to have UnreachableStrategy(0 seconds, 0 seconds)
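The three rules above can be expressed as a small Python function, shown here as a sketch rather than the actual migration code. Two points are assumptions: rule 3 is checked before rule 2 because (1 second, 2 seconds) also matches rule 2, and any configuration not covered by the rules is returned unchanged. The function name is hypothetical.

```python
def migrate_unreachable_strategy(inactive_after, expunge_after):
    """Apply the opt-in migration rules to one (inactive, expunge) pair in seconds."""
    if (inactive_after, expunge_after) == (300, 600):
        return (0, 0)  # rule 1: previous default
    if (inactive_after, expunge_after) == (1, 2):
        return (0, 0)  # rule 3: checked first, since it overlaps with rule 2
    if inactive_after == 1:
        return (0, expunge_after)  # rule 2
    # Assumption: configurations not listed above are left as they are.
    return (inactive_after, expunge_after)


if __name__ == "__main__":
    for strategy in [(300, 600), (1, 30), (1, 2), (10, 20)]:
        print(strategy, "->", migrate_unreachable_strategy(*strategy))
```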

Note: If you set this variable after upgrading to 1.4.6, it will have no effect. Also, the UnreachableStrategy default has not been changed, so in order for apps and pods created in the future to have the replace-instantly behavior, unreachableStrategy's inactiveAfterSeconds and expungeAfterSeconds must be set to 0 as seen in the JSON above.

Fixed issues