Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

escaping of reserved characters #18

Open
gdm85 opened this issue Dec 15, 2015 · 12 comments
Open

escaping of reserved characters #18

gdm85 opened this issue Dec 15, 2015 · 12 comments

Comments

@gdm85
Copy link

gdm85 commented Dec 15, 2015

If you use '%' in any string field, the character will not be correctly escaped when passed via the json GET querystring parameter.

This is the diff I used to narrow down the bug:

diff --git a/myservice-letmegrpc/myservice.letmegrpc.go b/myservice-letmegrpc/myservice.letmegrpc.go
index 389156e..9f8b747 100644
--- a/myservice-letmegrpc/myservice.letmegrpc.go
+++ b/myservice-letmegrpc/myservice.letmegrpc.go
@@ -35,7 +35,10 @@ It has these top-level messages:
 */
 package myservice

-import net_http "net/http"
+import (
+       net_http "net/http"
+       "net/url"
+)
 import encoding_json "encoding/json"
 import io "io"
 import golang_org_x_net_context "golang.org/x/net/context"
@@ -2437,7 +2440,17 @@ s += '<div class="children" type="Something_something">' + buildSomething_som

 func (this *htmlmyservice) CreateSomething(w net_http.ResponseWriter, req *net_http.Request) {
        w.Write([]byte(Header(`myservice`, `CreateSomething`)))
-       jsonString := req.FormValue("json")
+
+       rawQuery := req.URL.RawQuery[len("json="):]
+       fmt.Println("raw query =", rawQuery)
+
+       //      jsonString := req.FormValue("json")
+       jsonString, err := url.QueryUnescape(rawQuery)
+       if err != nil {
+               w.Write([]byte("<div class=\"alert alert-danger\" role=\"alert\">" + err.Error() + "</div>"))
+               return
+       }
+
        someValue := false
        msg := &CreateSomethingRequest{}
        if len(jsonString) > 0 {
@@ -2450,6 +2463,8 @@ func (this *htmlmyservice) CreateSomething(w net_http.ResponseWriter, req
                        w.Write([]byte("<div class=\"alert alert-danger\" role=\"alert\">" + err.Error() + "</div>"))
                }
                someValue = true
+       } else {
+               w.Write([]byte("<div class=\"alert alert-info\" role=\"alert\">" + `No JSON sent` + "</div>"))
        }
        w.Write([]byte(Formmyservice_CreateSomething))
        if someValue {
@awalterschulze
Copy link
Member

There is definitely a problem here.
I am investigating further about how to fix this.

@awalterschulze
Copy link
Member

This should fix the problem
d9e70ab

@gdm85
Copy link
Author

gdm85 commented Dec 17, 2015

@awalterschulze yes I can see that will fix the issue, but wouldn't be best to simply perform URL encoding when the json= querystring is composed?

@awalterschulze
Copy link
Member

Sorry I am really a noob web programmer.

Wouldn't it make the rest of the url less readable/hackable?

@awalterschulze
Copy link
Member

I don't want this

{"bla":1}

To reslut in

%7B%22bla%22%3A1%7D

@gdm85
Copy link
Author

gdm85 commented Dec 18, 2015

@awalterschulze well, if you are missing other sequences that need escaping, it will simply break.

I suggest to use the correct JavaScript function encodeURIComponent and then if you want to have some characters encoded by the browser (that's what happens when you type them non-encoded, and when it doesn't happen it just breaks server-side as per OP) you can replace after the call to the proper encoding function. The list of reserved characters is here for example: https://en.wikipedia.org/wiki/Percent-encoding

@awalterschulze
Copy link
Member

I am going on holiday, but I'll be back in about a week, then I'll look at this again.
Thank you for the help.

@awalterschulze awalterschulze changed the title Incorrect escaping of '%' characters Incorrect escaping of reserved characters Dec 19, 2015
@awalterschulze awalterschulze changed the title Incorrect escaping of reserved characters escaping of reserved characters Dec 19, 2015
@awalterschulze
Copy link
Member

Yes if there are other sequences that require escaping it will break :(

I am struggling to understand the second paragraph.

The wikipedia page says that encodeURIComponent is a non-standard function, or am I reading it wrong?
Maybe there is a better js function?

Reserved characters that don't cause problems

! $ ( ) * + , / : ? @ [ ]

Reserved characters that cause problems

# ; =

Reserved characters that cause problems that have been fixed

& '

Percent is also a problem, but has now been fixed

%

But I haven't taken into account character sequences.
Do you have any examples?

@gdm85
Copy link
Author

gdm85 commented Feb 22, 2016

@awalterschulze sorry, this issue went out of my radar.

The wikipedia page says

Non-standard implementations
There exists a non-standard encoding for Unicode characters: %uxxxx, where xxxx is a UTF-16 code unit represented as four hexadecimal digits. This behavior is not specified by any RFC and has been rejected by the W3C. The third edition of ECMA-262 still includes an escape function that uses this syntax, along with encodeURI and encodeURIComponent functions, which apply UTF-8 encoding to a string, then percent-escape the resulting bytes.

I think you're reading it wrong :) It's referring to unicode not specifically to the encodeURIComponent function which is standard since ECMA3, a summary is here: https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Global_Objects/encodeURIComponent

What is non-standard is not encoding some characters, because browsers might vary in the way of handling it. But let's say that letmegrpc is only supposed to be used with a few browsers that do support our hack with some characters, then what I proposed before in that 2nd paragraph was:

  1. use encodeURIComponent on all input from user
  2. do a string replace for some of the characters you want instead the browser to take care of e.g. %24%20%2B%20%2C%20%2F%20%3A%20%3F%20%40%20%5B%20%5D%20%7B%20%7D (which is the encoding for $ + , / : ? @ [ ] { })

What I mean in (2) is that you have some decoding with a simple string replacement (it's safe to do on an encoded string). Example snippet:

function encodePartially(userInput) {
    var encoded = encodeURIComponent(userInput);

    return encoded.replace("%24", "$").replace("%7D", "}"); // add more here
}

I would still suggest to limit the replacement, leaving out of this 2nd-pass replacement ? for example (a known query-string delimiter)

@awalterschulze
Copy link
Member

Ok so you want to use encodeURIComponent and then "decode" some of the encoded characters?
I just want to make sure I understand you clearly.
Instead of only encoding some of the characters.
This way their will be a better catch all, but we can still try and make the URL readable and editable by "decoding" most of the characters.
Is that what you meant?

@gdm85
Copy link
Author

gdm85 commented Feb 24, 2016

@awalterschulze yes that! :)

@awalterschulze
Copy link
Member

Ok so characters that would be decoded include:

, } { <all ascii characters> "

Does that sound about right?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants