Skip to content

Behaviour of Url::join #333

Closed
Closed
@theduke

Description

@theduke

Url::join has a very confusing name.
I would expect it to append a path segment.
Instead, it replaces the whole base path, which is quite awkward.

If you want to actually join, you have to:

let segments = url.path_segments_mut().unwrap()pop_if_empty().push("new-segment");

Personally I think join should append a new segment, but changing this might be too late...
If so, the time is now, before a 1.0.

Otherwise, I would suggest adding a new method, something like add_segment.

Activity

dtolnay

dtolnay commented on May 12, 2017

@dtolnay
Contributor

From the example code in the documentation, it looks like it appends a path segment as you expected.

extern crate url;

use url::Url;

fn main() {
    let url = Url::parse("https://example.net/foo/").unwrap();
    let url = url.join("bar").unwrap();
    assert_eq!(url.as_str(), "https://example.net/foo/bar");
}

Could you share the input you were using and the output you were expecting?

theduke

theduke commented on May 12, 2017

@theduke
Author

f the joined path does not start with a slash, it replaces the last path segment which works fine if the url ends with a slash, but works unexpectedly (at least for me) when it does not.

If the joined path does start with a slash, it replaces the entire path, regardless of whether the url ends with one.

let u = url::Url::parse("http://domain.com/a/b").unwrap();
  let u2 = u.join("c").unwrap();
  println!("{}", u2);
  
  let u3 = u.join("d/e").unwrap();
  println!("{}", u3);
  
  let u3 = u.join("/f").unwrap();
  println!("{}", u3);

Output:

http://domain.com/a/c
http://domain.com/a/d/e
http://domain.com/f
theduke

theduke commented on May 12, 2017

@theduke
Author

As a comparison, this is the behaviour when joining with PahtBuf:

Joining a path starting with a slash also replaces the whole path, but otherwise it appends a path segment and does not replace the last one.

  let p = PathBuf::from("/a/b");
  let p2 = p.join("c");
  println!("{}", p2.to_str().unwrap());
  
  let p3 = p.join("d/e");
  println!("{}", p3.to_str().unwrap());
  
  let p4 = p.join("/f");
  println!("{}", p4.to_str().unwrap());
/a/b/c
/a/b/d/e
/f
dtolnay

dtolnay commented on May 12, 2017

@dtolnay
Contributor

Makes sense. I agree that is confusing.

theduke

theduke commented on May 12, 2017

@theduke
Author

I'd also expect .join("/b") to append /b, not to replace the whole path.

Path in std behaves this way as well, though.

jongiddy

jongiddy commented on May 12, 2017

@jongiddy

Evidently, for both PathBuf and Url, join means "return parameter 2 as a non-contextual (absolute) path, given the context named by parameter 1". It might have better been called move_to. For a path name, that is the chdir behavior. For a URL, it is the HTML follow-link behavior.

Given this similar behavior of join for both paths and URL's, I'd say keep it as-is and choose a new name for a method that actually appends at all times. Since the word "append" appears several times in the descriptions of the expected behavior, that seems the right name.

changed the title [-]Url::join: Confusing name[/-] [+]Behaviour of Url::join[/+] on May 12, 2017
SimonSapin

SimonSapin commented on May 12, 2017

@SimonSapin
Member

For what it’s worth some_base_url.join(some_string) is the same as Url::options().base_url(&some_base_url).parse(some_string), which in HTML is the same as resolving the link in <a href="{some_string}">…</a><base href="{some_base_url.as_str()}">.

For example, foo.html with base http://example.net/bar/baz.html resolves to http://example.net/bar/foo.html, not http://example.net/bar/baz.html/foo.html.

Given backward-compatibility constraints, either or both of these are acceptable:

  • Add a new method with a different name and a different behavior.
  • Add a new method with a different name and the same behavior as join, and add #[deprecated] to join.

Changing the behavior of join would be a breaking change. Even in a future 2.0 version of url it doesn’t seem like a good idea because existing callers expect the current behavior.

theduke

theduke commented on May 12, 2017

@theduke
Author

Yeah I agree, hence my but changing this might be too late... comment.

What would be a good name for a new method?

  • append
  • path_append
  • append_path
SimonSapin

SimonSapin commented on May 12, 2017

@SimonSapin
Member

Note that join is not just about the path. In pseudo-code, "https://a/b".join("//c") is https://c/, for example. "http://d/e?f#g".join("?h") is http://d/e?h. How do you want to handle these cases in this new method?

If you only want to manipulate the path without touching other components, path_segments_mut is already there.

nox

nox commented on Jul 18, 2019

@nox
Contributor

Nobody suggested a better alternative name, so I'm closing this.

theduke

theduke commented on Jul 18, 2019

@theduke
Author

My suggestion would be renaming join to merge, which expresses the behaviour a bit better.

But ,due to the popularity of url, that is probably unwise at this point.

23 remaining items

Loading
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

      Development

      No branches or pull requests

        Participants

        @nox@ajbonner@theduke@SimonSapin@DanielJoyce

        Issue actions

          Behaviour of Url::join · Issue #333 · servo/rust-url