Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow numeric header names #153

Closed

Conversation

weierophinney
Copy link
Member

Q A
Documentation no
Bugfix yes
BC Break yes
New Feature no
RFC no
QA no

Description

RFC 7230 allows for numeric header names, both integers and floats, though the expectation is that they are provided as string values. Since PHP casts integer strings into integers for purposes of array keys, this can lead to a scenario where HeaderSecurity was then flagging the value as invalid - despite the fact that the regex used on strings clearly allows the value. This patch modifies HeaderSecurity::assertValidName() to allow for numeric names, and casts them to a string when performing validations.

Marking as a BC break as there is a subtle behavior change.

Fixes #11

@weierophinney weierophinney added this to the 3.0.0 milestone May 2, 2023
@weierophinney weierophinney added BC Break Bug Something isn't working labels May 2, 2023
@@ -148,16 +148,16 @@ public static function assertValid(mixed $value): void
*/
public static function assertValidName(mixed $name): void
{
if (! is_string($name)) {
if (! is_string($name) && ! is_numeric($name)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect them to be string anyway: casting a float to string is not our business, and may lead to trouble down the road.

Changing the regex: sure.

Casting float to a string is IMO a problem.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only integer like values are affected. Negative and positive. Integers that start with 0 are not converted, except singular 0.

https://3v4l.org/ei8MB

array(6) {
  [1]=>
  string(0) ""
  [-1]=>
  string(0) ""
  [0]=>
  string(0) ""
  ["0.1"]=>
  string(0) ""
  ["0123"]=>
  string(0) ""
  ["00"]=>
  string(0) ""
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is with casts that can also lead to a scientific notation

Thinking https://stackoverflow.com/q/1471674/347063

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem here is as follows:

  • ServerRequestFactory calls on Lamins\Diactoros\marshalHeadersFromSapi().
  • Laminas\Diactoros\marshalHeadersFromSapi() loops through the $_SERVER array, looking for keys that match one of:
    • REDIRECT_
    • CONTENT_
    • HTTP_
  • Keys that are of interest are then added to an array of discovered headers and returned to the factory.
  • The factory uses the discovered headers to construct a ServerRequest instance
  • ServerRequest takes that list of headers and passes them to HeaderSecurity to validate the names and values.

And this is where the issue comes in, because PHP converts keys that are strings containing only integers to integers. This happens at the point that marshalHeadersFromSapi() returns the header list, so once the ServerRequest starts to inject them into itself, they then fail validations.

I'm fine with not casting float to string, and can make that change, but the integer values are the ones that this patch is really concerned about. If I make the change to remove support for floats, will this resolve your reservations, @Ocramius ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PHP only casts pure integer strings to int. Different kinds of notations like octal and such are not. So only -?[0-9]+ is of concern. Anything else is not allowed.

@Xerkus
Copy link
Member

Xerkus commented May 2, 2023

I propose directly opposite: ban numeric headers as they are changing types and they will cause type errors somewhere unless name is always cast to string at every single place header name is obtained from array key.

Psalm will complain about casts, users will never do it. I know of this issue and I never do it.
The error is not of the kind that can be captured in developemnt or in QA because noone sane will ever try to inject numeric header in every single place headers are used in the codebase.

@Xerkus
Copy link
Member

Xerkus commented May 2, 2023

I would like to call attention again to https://3v4l.org/XUj6h showcasing the issue

@weierophinney weierophinney force-pushed the feature/numeric-header-names branch 2 times, most recently from d012ac4 to df5279a Compare May 3, 2023 13:53
@Xerkus
Copy link
Member

Xerkus commented May 3, 2023

Type error places with numeric headers present (needs string cast):

$filtered = str_replace('-', ' ', $header);

$headerNames[strtolower($header)] = $header;

if (strtolower($header) === 'host') {

@Xerkus
Copy link
Member

Xerkus commented May 3, 2023

Reflect potential int

-     * @psalm-return array<non-empty-string, list<string>>
+     * @psalm-return array<non-empty-string|int, list<string>>

* @psalm-return array<non-empty-string, list<string>>
*/
public function getHeaders(): array
{
return $this->headers;
}

@weierophinney
Copy link
Member Author

I've now updated to ensure we only consider integer values, and I've also put in some protections to ensure that non-assoc arrays cannot be used to populate the header list (e.g., via the ServerRequest factory).

}

$i = -1;
foreach ($array as $k => $v) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will fail for empty array

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes - for the same reason array_as_list() returns true for an empty array as well. Added a gate to this method.

RFC 7230 allows for numeric header names, both integers and floats, though the expectation is that they are provided as string values.
Since PHP casts integer strings into integers for purposes of array keys, this can lead to a scenario where `HeaderSecurity` was then flagging the value as invalid - despite the fact that the regex used on strings clearly allows the value.
This patch modifies `HeaderSecurity::assertValidName()` to allow for numeric names, and casts them to a string when performing validations.

Fixes laminas#11

Signed-off-by: Matthew Weier O'Phinney <[email protected]>
Both `RequestTest` and `RequestTest` had tests for invalid headers that included a case for `indexed-array`, which is no longer considered invalid.

Signed-off-by: Matthew Weier O'Phinney <[email protected]>
String float values are not cast to floats when used as array keys, unlike integers, so we do not need to treat them differently.
What that means, though, is that we _should_ manage detection of list arrays versus associative arrays, but otherwise allow integer array keys.
This patch does the following:

- Adds detection of lists passed for the headers array into `MessageTrait::setHeaders()`.
  - This is accomplished with a polyfill for `array_is_list`, that also considers an empty array as valid.
- Removes the `is_numeric()` check for the header name, replacing it with `is_int()`.
- Only cast the header name to a string before appying the regex if it is an int.

Signed-off-by: Matthew Weier O'Phinney <[email protected]>
Check to see if we have a string header name before performing string operations.
They can either be bypassed, or performed on a string cast of the header name.

Signed-off-by: Matthew Weier O'Phinney <[email protected]>
Return early on detection of empty array, as PHP 8 was considering it a list.

Signed-off-by: Matthew Weier O'Phinney <[email protected]>
Because PHP 8.0 does not have `array_is_list()`, Psalm cannot infer what the return value from such a function would be.

Signed-off-by: Matthew Weier O'Phinney <[email protected]>
@weierophinney
Copy link
Member Author

weierophinney commented May 3, 2023

For reviewers: I can totally see why this might be considered a potentially bad idea, due to how PHP munges integer and string-integer keys, and the potential for accidentally setting the headers from a list versus an associative array. However, please note that MessageTrait::setHeaders() now explicitly looks for a non-empty list, and considers that invalid.

Second, there is a very real concern here. The following is a perfectly valid HTTP request:

GET /some/path HTTP/1.1
Host: example.org
1: some value
Accept: application/json

However, currently, this would lead to a 500 with Diactoros because of that integer key... which RFC 7230 clearly allows in its header name ABNF.

With the protections against accepting lists, I think the changes here make sense, and definitely solve the issue reported in #11.

@Ocramius
Copy link
Member

Ocramius commented May 3, 2023

I'm OK with restricting this to integers, for now

Copy link
Member

@boesing boesing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not 100% sure if the array_is_list check makes sense.

This throws no exception:

[-1 => 'foo', 0 => 'bar', 1 => 'baz']

This throws exception:

[0 => 'bar', 1 => 'baz']

But maybe I missed some detail.


Is there any plan to raise this via FIG?
Might be even worth again to ping this to PHP via RFC, I have had the same problem with my typed arrays library and thus, I'd love to see an RFC which actually enforces to keep key-type. Would perfectly fit in PHP 9 to enforce.

*/
protected static function filterHeader($header): string
{
$filtered = str_replace('-', ' ', $header);
$filtered = is_int($header) ? (string) $header : $header;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would just cast always to string, since we have to check is_int, directly casting int and string might be even "faster".

return false;
}

if (function_exists('array_is_list')) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't we just drop PHP <8.1?
Did that in laminas-servicemanager for v4 as well, already seen other components dropping 8.0 so I guess that would fit?

@@ -327,13 +330,18 @@ private function getStream($stream, string $modeIfNotInstance): StreamInterface
*/
private function setHeaders(array $originalHeaders): void
{
if ($this->arrayIsList($originalHeaders)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this check?

Is 0 disallowed in the RFC mentioned in the comments or why are numeric-only headers not allowed?
and if 0 is disallowed, why is -1, 0, 1 allowed (which is not a list but still containers only numeric keys)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is based on a polyfill for array_is_list() that mimics the behavior of that function... which only returns true if the keys are integers, are in sequence, and the sequence starts with 0. As such, keys starting at -1 will cause that function to return false.

All that said... I've noted on the Slack my reservations about the approach in this patch, and have asked for some guidance on an alternate approach.

@weierophinney
Copy link
Member Author

I'm going to withdraw this patch, and propose an alternate solution based on discussions in the Laminas Slack.

@weierophinney weierophinney removed this from the 3.0.0 milestone May 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
BC Break Bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants