Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make PaStream type-safe #917

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Conversation

dechamps
Copy link
Contributor

This PR contains two commits:

  • The first commit fixes PaStream being void is type-unsafe and error-prone #915 by changing PaStream to be a type-safe opaque type, instead of void (pointers to which can be implictly converted from anything).
  • The second commit goes a step further and maps it to PaUtilStreamRepresentation, simplifying the implementation by removing some casts.
    • That one is more of a cosmetic change; I'm fine dropping it is people disagree.

This changes PaStream* from a pointer to void to a pointer to an
undefined, but unique type. This ensures functions that take a PaStream*
reject incompatible types, for improved type safety.

While this is technically an API change, it is unlikely to break anyone
as it's unlikely user code depends on this type being void specifically
(unless they suffer from the bugs that this change is designed to
prevent!).

Fixes PortAudio#915
@RossBencina
Copy link
Collaborator

RossBencina commented May 31, 2024

Phil and I are happy to improve the type-safety, and approve of the first version because it retains PaStream as an opaque type with no definition.

typedef struct PaStream PaStream;

We reject the second version because it exposes a non-public detail of the implementation (the name of an internal implementation type).

Phil has confirmed that the first version (use of incomplete type in the interface, then casting to a differently named struct in the implementation) is a widely used idiom in the Android NDK. However I am suspicious that it is not valid C, since to me it entails a strict aliasing violation: I don't believe you can cast directly between unrelated differently named types even if they are structurally compatible. I'm happy to be wrong about this, but have yet to find a source that says that it is valid ISO standard C. Although I might be confusing C and C++ aliasing rules.

I think my concern can be addressed by declaring PaStream as an incomplete type in portaudio.h:

typedef struct PaStream PaStream;

And then in the implementation (pa_stream.h) define a union:

struct PaStream {}; // now declared as an empty complete type

typedef union PaStreamStorage {
    PaStream stream;
    PaUtilStreamRepresentation rep;
} PaStreamStorage ;

Allocations should then always be performed via PaStreamStorage, and casts performed via the union:

PaStream* openstream()
{
    PaStreamStorage *storage = (PaStreamStorage*)malloc(sizeof(PaStreamStorage));
    PaUtilStreamRepresentation *rep = &storage->rep;
    // init rep here
    return &storage->stream;
}

void foo (PaStream *s)
{
    PaUtilStreamRepresentation *prep = &(((PaStreamStorage*)s)->rep); // wrap in macro
    // use rep here
}

Not sure whether this is overkill but I believe that it is valid ISO standard C, while your first version (which we prefer in all other respects) may not be.

@RossBencina
Copy link
Collaborator

RossBencina commented May 31, 2024

Accepted answer on stack overflow https://stackoverflow.com/a/25672839/2013747

To re-iterate, type-punning through unions is perfectly fine in C (but not in C++). In contrast, using pointer casts to do so violates C99 strict aliasing and is problematic because different types may have different alignment requirements and you could raise a SIGBUS if you do it wrong. With unions, this is never a problem.

My emphasis. So we should do the union thing I think.

P.S. a choice quote from the aforementioned link:

"an underspecified semantic quagmire"

@dechamps
Copy link
Contributor Author

dechamps commented Jun 1, 2024

We reject the second version because it exposes a non-public detail of the implementation (the name of an internal implementation type).

I don't see why this is a problem because there is nothing a user can do with that name. But fair enough, that's not a hill I plan to die on. Removed the second commit from the PR.

However I am suspicious that it is not valid C, since to me it entails a strict aliasing violation: I don't believe you can cast directly between unrelated differently named types even if they are structurally compatible.

This is tricky indeed. The best explanation I could find on this is this one. My understanding is:

  • Casting in itself cannot trigger undefined behavior; it's the type used to access the value that counts.
    • So, for example, casting an A to a B, then to A again, and then accessing the result as an A is perfectly fine, even if B is not compatible with A.
    • This means we shouldn't focus on the casts. We should focus on which type is used to actually read/write the object.
  • This PR cannot introduce new strict aliasing violations because it does not change the effective type of a PaStream object (which is decided on first access after allocation, typically by host API code), nor does it change the type of any of the expressions that access the PaStream (e.g. PaUtilStreamRepresentation, PaWasapiStream, etc.). It may adjust a few casts but not in ways that have any impact on the type used to ultimately access the object.
    • This means there is no need to define PaStream as an empty struct, or an union - none of this matters, because PaStream is not used as the type of any expression that accesses the object.
      • By the way, empty structs are not allowed in C, and if you really wanted an union, you would have to consider Host API specific stream types e.g. PaWasapiStream etc which will get very messy quick.

(Honestly I would be more worried about strict aliasing violations in the current PortAudio implementation code. For example, in OpenStream() Host APIs allocate streams as e.g. PaWasapiStream*, then cast that to PaStream*, then pa_front converts that to PaUtilStreamRepresentation*. Does that violate the strict aliasing rule? I'm honestly not sure. PaUtilStreamRepresentation is not "an aggregate type or union type type that includes PaWasapiStream among its members" so it looks like the answer may be yes, but the C standard also says in 6.7.2.1.15 "A pointer to a structure object, suitably converted, points to its initial member and vice versa", and PaUtilStreamRepresentation is the first member of PaWasapiStream so… which is it?? This may hinge on the precise meaning of "suitably converted", which sadly doesn't seem to be defined anywhere. Ugh.)

@dechamps
Copy link
Contributor Author

dechamps commented Jun 3, 2024

This may hinge on the precise meaning of "suitably converted", which sadly doesn't seem to be defined anywhere. Ugh.

Curiosity got the better of me so I asked about it on Stack Overflow. It looks like I'm not the only one finding the standard unclear on this. It's probably fair to say that compilers have to keep that kind of pattern working, as the alternative would be to watch the world explode.

Anyway, to be clear, this self-inflicted disgression has nothing to do with the PR itself, which I still believe is safe - or at the very least it doesn't make the code any more problematic than it already is.

@RossBencina
Copy link
Collaborator

Casting in itself cannot trigger undefined behavior

But casting between unrelated types (or pointers to unrelated types) is simply illegal. There is no requirement for pointers to unrelated types to be interconvertible at all. I thought that was clear from the discussion in the link I provided. To be "related", I believe that the types have to either (1) inhabit the same union, or (2) A has to be the first member of B (hence share the same address). That is why the union is needed.

By the way, empty structs are not allowed in C

Ok, so if we do struct PaStream { int dummy; }; is that going to work?

@RossBencina
Copy link
Collaborator

RossBencina commented Jun 3, 2024

Not sure what I think about this, but here is another possibility for discussion: instead of a union, update pa_stream.h as follows:

struct PaStream {
    unsigned long magic;    /**< set to PA_STREAM_MAGIC */
};

typedef struct PaUtilStreamRepresentation {
    PaStream stream; // was: unsigned long magic;
    struct PaUtilStreamRepresentation *nextOpenStream; /**< field used by multi-api code */
    PaUtilStreamInterface *streamInterface;
    PaStreamCallback *streamCallback;
    PaStreamFinishedCallback *streamFinishedCallback;
    void *userData;
    PaStreamInfo streamInfo;
} PaUtilStreamRepresentation;

Obviously code that initializes magic will have to be tweaked.

@dechamps
Copy link
Contributor Author

dechamps commented Jun 3, 2024

But casting between unrelated types (or pointers to unrelated types) is simply illegal. There is no requirement for pointers to unrelated types to be interconvertible at all.

This is not what the standard says. C17 6.3.2.3 p.7:

A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined. Otherwise, when converted back again, the result shall compare equal to the original pointer.

Given an undefined struct has no alignment requirements (AFAIK), converting a PaUtilStreamRepresentation* (or a PaWasapiStream*, or anything else really) to a struct PaStream* is absolutely fine, and converting it back again to its original type is perfectly fine as well.

Again: aside from alignment considerations (which are irrelevant here), the casts don't matter. It is the pointer type used when actually dereferencing the pointer that matters (C17 6.5 p.7).

I thought that was clear from the discussion in the link I provided.

The discussion you linked is focused on unions, not pointer conversions. All the citations from the standard in the accepted answer are about unions; there is no discussion of pointer conversions aside from one sentence that is not backed by any citations at all.

(Also, I suspect that discussion you linked uses a more narrow definition of "type punning", in which an object is deliberately accessed through different incompatible types. With that definition, it is indeed correct to say that "using pointer casts to do so violates strict aliasing", and it is indeed preferable to use an union. That is not relevant to what this PR is doing, though.)

here is another possibility for discussion: instead of a union, update pa_stream.h as follows:

At this point Ross, how would you feel about renaming PaUtilStreamRepresentation to just PaStream?

typedef struct PaStream {
    unsigned long magic;
    struct PaStream *nextOpenStream; /**< field used by multi-api code */
    PaUtilStreamInterface *streamInterface;
    PaStreamCallback *streamCallback;
    PaStreamFinishedCallback *streamFinishedCallback;
    void *userData;
    PaStreamInfo streamInfo;
} PaStream;

This would reduce the need for casts in the pa_front implementation, just like in the first version of the PR, but this time the name "exposed" in the header would be PaStream, not some internal-looking PaUtilStreamRepresentation. Seems like this would be the best of all worlds?

@RossBencina
Copy link
Collaborator

RossBencina commented Jun 7, 2024

But casting between unrelated types (or pointers to unrelated types) is simply illegal. There is no requirement for pointers to unrelated types to be interconvertible at all.

This is not what the standard says. C17 6.3.2.3 p.7:

A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined. Otherwise, when converted back again, the result shall compare equal to the original pointer.

Thanks for correcting me on this. Similar wording appears in the C89 standard (6.3.4 Cast operators):

A pointer to an object or incomplete type may be converted to a pointer to a different
object type or a different incomplete type. The resulting pointer might not be valid if it is
improperly aligned for the type pointed to. It is guaranteed, however, that a pointer to an
object of a given alignment may be converted to a pointer to an object of the same alignment or a less strict alignment and back again: the result shall compare equal to the original pointer (An object that has character type has the least strict alignment.)

You continue:

Given an undefined struct has no alignment requirements (AFAIK), converting a PaUtilStreamRepresentation* (or a PaWasapiStream*, or anything else really) to a struct PaStream* is absolutely fine, and converting it back again to its original type is perfectly fine as well.

Let's be more precise here, since C compilers compile compilation units separately, the matter of whether PaStream has a definition somewhere is not relevant, only the visibility of a definition is relevant, therefore we are not dealing with a pointer to an "undefined struct" but rather a pointer to an "incomplete type" (struct PaStream*). I would say that in this case PaStream has an unknown alignment requirement not "no" alignment requirement (the accepted SO answer here seems to agree).

Now, I don't think the quotes from the standard that you and I have provide above speak to the question of interconvertability of pointers with unknown alignment requirement, and I am not convinced that your reasoning is sound, for if it were, all pointers to structs would need to have the same representation as void* which clearly they are not required to do. The accepted answer linked above would tend to agree:

any conversion of a pointer to an incomplete type to a type other than char* or void* must be regarded as unportable.

Again: aside from alignment considerations (which are irrelevant here), the casts don't matter. It is the pointer type used when actually dereferencing the pointer that matters (C17 6.5 p.7).

You have persuaded me that the strict aliasing business is only relevant when accessing struct storage (dereferencing pointers). But as should be clear above, the casts appear to me to be the sticking point.

I thought that was clear from the discussion in the link I provided.

The discussion you linked is focused on unions, not pointer conversions. All the citations from the standard in the accepted answer are about unions; there is no discussion of pointer conversions aside from one sentence that is not backed by any citations at all.

Unions are the only way to violate strict aliasing without introducing undefined behavior. But I agree with your argument that if we are not going to define PaStream then there is no chance of dereferencing it and violating strict aliasing.

(Also, I suspect that discussion you linked uses a more narrow definition of "type punning", in which an object is deliberately accessed through different incompatible types. With that definition, it is indeed correct to say that "using pointer casts to do so violates strict aliasing", and it is indeed preferable to use an union. That is not relevant to what this PR is doing, though.)

It is now potentially relevant only insofar as we need to implement something that is portable and I don't believe that what we have now is portable for the reasons I explained at the beginning of this comment.

I'm sorry if this sounds like I'm moving the goal posts (I am). It is only because you have corrected an aspect that I was incorrect about in my earlier comments.

here is another possibility for discussion: instead of a union, update pa_stream.h as follows:

At this point Ross, how would you feel about renaming PaUtilStreamRepresentation to just PaStream?

typedef struct PaStream {
    unsigned long magic;
    struct PaStream *nextOpenStream; /**< field used by multi-api code */
    PaUtilStreamInterface *streamInterface;
    PaStreamCallback *streamCallback;
    PaStreamFinishedCallback *streamFinishedCallback;
    void *userData;
    PaStreamInfo streamInfo;
} PaStream;

This would reduce the need for casts in the pa_front implementation, just like in the first version of the PR, but this time the name "exposed" in the header would be PaStream, not some internal-looking PaUtilStreamRepresentation. Seems like this would be the best of all worlds?

Phil and I will discuss this suggestion further. Based on last week's discussion I don't think it is ideal as we would prefer PaStream to remain opaque. But given everything we've discussed, that may be unachievable.

@RossBencina
Copy link
Collaborator

RossBencina commented Jun 7, 2024

How about we change the definition of PaStream to:

typedef struct PaStream { /* opaque structure definition with minimal alignment requirement */
    unsigned char reserved;
} PaStream;

This should allow PaStream* to be interconvertible with any other struct ptr as per the C standard excerpts cited earlier. Not sure about any down sides.

In fact, there are two related options that I think would be legal: (1) put the above in portaudio.h or (2) put a forward declaration in portaudio.h and put the definition in pa_stream.h.

@philburk
Copy link
Collaborator

philburk commented Jun 7, 2024

how would you feel about renaming PaUtilStreamRepresentation to just PaStream?

I think that would work. But we would have to do a lot of editing internally. Not a big deal.

I also like Ross' last suggestion of using the "char reserved".

I can go with either one.

@RossBencina
Copy link
Collaborator

RossBencina commented Jun 7, 2024

Phil wrote:

I can go with either one.

We have two existing constraints for the public API:

  1. Portable code that is correct C as per the relevant ISO C standards.
  2. Data hiding: make it clear to the client that PaStream is an opaque type, that the representation is not part of the portaudio public API contract.

And this PR attempts to add a third constraint:

  1. Make the definition of PaStream type safe to the extent that it is more difficult for the user to make mistakes.

Constraint (1) (correctness) is not negotiable.

The current implementation (typedef PaStream void) is the best possible realisation of (2) (data hiding) -- it makes the minimum specification of what a PaStream is, fully opaque. This makes it extremely difficult for inexperienced users to confuse what is public and private, which I think is at least as important as type safety. (I just know that we're going to see people poking around in the internal stream representation if we expose even the name of the struct thus making it easy to find its definition.)

Any proposal that somehow equates PaStream with the representation data structure reduces data hiding (this includes renaming PaUtilStreamRepresentation to PaStream and the earlier ideas about unions.)

Any proposal that changes PaStream from void to struct reduces opacity.

From the options so far proposed, I prefer:

portaudio.h:

typedef struct PaStream PaStream;

pa_stream.h:

struct PaStream{
    unsigned char reserved;
};

This achieves maximum opacity while introducing type safety and ensuring correctness. (Correctness proof: PaStream has alignment requirement 1, which is guaranteed by the standard to be interconvertible with any other pointer, so we can use PaStream* as an opaque pointer to the stream representation.)

If we decide to go forward with this approach we will need to include some explanatory doc comments.

One final question/wrinkle: does changing the definition of PaStream from void to struct PaStream introduce an ABI breakage? Does the C standard guarantee that sizeof(void*) and sizeof(PaStream*) are the same?

@dechamps
Copy link
Contributor Author

dechamps commented Jun 9, 2024

Let's be more precise here, since C compilers compile compilation units separately, the matter of whether PaStream has a definition somewhere is not relevant, only the visibility of a definition is relevant, therefore we are not dealing with a pointer to an "undefined struct" but rather a pointer to an "incomplete type" (struct PaStream*). I would say that in this case PaStream has an unknown alignment requirement not "no" alignment requirement (the accepted SO answer here seems to agree).

Now, I don't think the quotes from the standard that you and I have provide above speak to the question of interconvertability of pointers with unknown alignment requirement, and I am not convinced that your reasoning is sound, for if it were, all pointers to structs would need to have the same representation as void* which clearly they are not required to do.

That's an interesting point - I'll admit I kinda hand-waved the alignment requirements of a pointer to an incomplete struct because I intuitively expected that to always work, but it looks like I should have dug deeper into that particular rabbit hole.

That said… the answer you linked to does mention a specific saving provision in the standard which I believe means the approach I suggested is still valid regardless:

All pointers to structure types shall have the same representation and alignment requirements as each other. (C17 §6.2.5/27)

I believe this citation ends this discussion about alignment requirements once and for all, doesn't it? Indeed, PortAudio always converts PaStream from/to a pointer to another struct (e.g. PaUtilStreamRepresentation, PaWasapiStream). According to that citation, that can never fail alignment requirements, ever, regardless of how the struct is defined, and indeed regardless of whether the struct is even defined in the first place.

If PortAudio was converting between PaStream and a pointer to some non-struct type (such as a union, or a non-composite type like int), then we may indeed have a problem, but it never does, so… I believe this is all perfectly fine in the end.

(What an interesting discussion! I did not expect to learn so much about the intricacies of aliasing and pointer conversions in standard C. The fact that the alignment requirements of a struct are not dictated by the alignment requirements of its members is definitely news to me. In retrospect I can see why it's designed that way - it's probably to make sure one can change the definition of a struct at will without that causing surprising ripple effects to the representation/alignment of pointers to that struct.)

typedef struct PaStream {
    unsigned long magic;
    struct PaStream *nextOpenStream; /**< field used by multi-api code */
    …
} PaStream;

Based on last week's discussion I don't think it is ideal as we would prefer PaStream to remain opaque.

I don't understand your objection? In my proposal PaStream would remain opaque to PortAudio users. To be clear, this struct definition I proposed would not appear in the public PortAudio header files - it would remain confined to PortAudio internal code. PortAudio users would not have access to the definition of the struct, which will remain incomplete as far as user code is concerned.

In my opinion this is the best approach because it keeps everything opaque to PortAudio users, the header file stays "clean" (i.e. no mention of  PaUtilStreamRepresentation) and as a nice bonus we get to remove a number of casts in PortAudio internals.

How about we change the definition of PaStream to:

typedef struct PaStream { /* opaque structure definition with minimal alignment requirement */
    unsigned char reserved;
} PaStream;

As I explain above, I don't think this is necessary, as there is no need to work around any alignment issues. Given this, this approach would seem more confusing than useful.

(I just know that we're going to see people poking around in the internal stream representation if we expose even the name of the struct thus making it easy to find its definition.)

I don't buy this argument, Ross. If users really want to poke around with the internal representation, they can already look at PortAudio code and from there it's already fairly obvious what the internal representation is, given PortAudio converts the PaStream to a PaUtilStreamRepresentation* immediately on entry. Getting rid of the conversion only makes it very slightly easier for users to do that. I think it's very clear to reasonable users that they are not supposed to peek into a PaStream, given the definition of the struct is not exposed in any PortAudio header file, and it should be quite obvious to everyone that it's meant to be an opaque handle. If a user is determined enough to ignore all that and rely on PortAudio implementation details anyway, there's only so much we can do to stop them, and I don't think it makes sense to compromise the quality of PortAudio internal code to do just that.

From the options so far proposed, I prefer:

portaudio.h:

typedef struct PaStream PaStream;

pa_stream.h:

struct PaStream{
    unsigned char reserved;
};

This achieves maximum opacity while introducing type safety and ensuring correctness. (Correctness proof: PaStream has alignment requirement 1, which is guaranteed by the standard to be interconvertible with any other pointer, so we can use PaStream* as an opaque pointer to the stream representation.)

And my counter-proposal, as explained above, is:

portaudio.h:

typedef struct PaStream PaStream;

pa_stream.h:

struct PaStream {
   unsigned long magic;
   struct PaStream *nextOpenStream; /**< field used by multi-api code */
   PaUtilStreamInterface *streamInterface;
   PaStreamCallback *streamCallback;
   PaStreamFinishedCallback *streamFinishedCallback;
   void *userData;
   PaStreamInfo streamInfo;
};

Which I believe has the same properties as your proposal, but with the additional bonus of simplifying and clarifying PortAudio internal code.

One final question/wrinkle: does changing the definition of PaStream from void to struct PaStream introduce an ABI breakage? Does the C standard guarantee that sizeof(void*) and sizeof(PaStream*) are the same?

Good question. C17 §6.2.5/28 says:

A pointer to void shall have the same representation and alignment requirements as a pointer to a character type.

I don't think pointer to character types are required to have the same representation as pointers to structs. In fact, the SO answer you linked to discusses hypothetical architectures where pointers to structs could be made smaller than pointers to chars, due to the fact that an architecture can define the minimum alignment of a struct to be stricter than the minimum alignment of a char, and therefore the last bits can be assumed to be zero without having to store them.

Given this, one could argue the proposed change is technically an ABI break, but I don't think that's true in practice for any of the architectures anyone could realistically run PortAudio on. I don't think any architecture used in the real world today would have sizeof(struct X*) != sizeof(void*), as this would be highly esoteric behavior which would likely "break the world" in practice. That seems highly unlikely, and it seems even less likely that such an architecture would be supported by any of the OSes that PortAudio currently targets. Therefore ABI breakage concerns seem purely academic to me and will not materialize in reality. (This is in contrast to discussions about undefined behavior, which are not just academic because they affect optimizations compilers are allowed to deploy.)

@RossBencina
Copy link
Collaborator

All pointers to structure types shall have the same representation and alignment requirements as each other. (C17 §6.2.5/27)

I believe this citation ends this discussion about alignment requirements once and for all, doesn't it?

I'm afraid not. It is interesting that the pointers all have the same representation (whatever that means), but the alignment requirement mentioned in your quote is the alignment requirement of the pointer-to-the-struct, whereas the interconvertability of pointers hinges on the aligment requirements of the pointed-to-objects (in our case PaStream and PaUtilStreamRepresentation). Needless to say all structures do not have the same alignment requirements.

@dmitrykos
Copy link
Collaborator

dmitrykos commented Jun 10, 2024

Let me add some cents into the discussion. First of all, - there is no any type safety in C. PA is C library, so there is no type safety in PA and can't be achieved, no one shall assume that type safety exists. Even though modern compiler may show warning about incompatible type - it is done at sole discretion of the compiler to make life of C developers easier of course. So, one compiler may show warning but another may not show.

In C there are no type casts like in C++ - C has static and weakly typed type system. If you know C++ than C-style cast is just like reintepret_cast<T> in C++ - nothing is taken into account, like alignment of the structure and its members, and etc. By converting pointer of one type to another you get pointer pointing to another type with the same address in the memory. Compiler will always succeed compiling code which is doing cast of incompatible types, even though it may show warning, such warning is still optional.

This PR shall not mention type safety as it misguides reader, but instead it shall mention - to allow optional C compiler warnings related to incompatible type conversion.

Opaque pointer technique's main purpose is to hide implementation from the user - data abstraction and encapsulation problem. It can not add any type safety to C code as mentioned above, but it may additionally cause side effect of compiler warning related to incompatible type conversion.

Implementation:

This opaque type declaration is not correct:

typedef struct PaStream PaStream;

Opaque pointer technique declares incomplete structure struct PaStreamImp as type PaStream in public header:

typedef struct PaStreamImp PaStream;

then inside source file we implement it:

typedef struct _PaStreamImp
{
    // my implementation goes here
} PaStreamImp;

In case of PA library it would be:

typedef struct PaUtilStreamRepresentation PaStream;

PaUtilStreamRepresentation is the top-level structure, so we can assume it is the very basic structure of PaStream type implementation.

Normally, PaStream is converted to the hostapi's implementation which is not PaUtilStreamRepresentation, like in WASAPI it is PaWasapiStream, so compiler will complain anyway on attempt to assign to PaWasapiStream pointer the PaStream pointer. So I do not see any usefulness from this PR for the developer of host API and how it can improve code safety, nor it is the same useful for user-side code.

Just recently I completed my MSc Dissertation in Software Engineering, University of Oxford, accompanying STK project, which was covering exactly these topics, so you may find it interesting to read:

  1. About type safety C vs C++: page 59, [5.2 C++ type system]
  2. About opaque pointer technique: page 66, [5.3.2 Data abstraction and encapsulation]
  3. In general whole chapter [5 C++ vs C] starting from page 56 can be interesting for this discussion

@dechamps
Copy link
Contributor Author

dechamps commented Jun 10, 2024

the alignment requirement mentioned in your quote is the alignment requirement of the pointer-to-the-struct, whereas the interconvertability of pointers hinges on the aligment requirements of the pointed-to-objects (in our case PaStream and PaUtilStreamRepresentation). Needless to say all structures do not have the same alignment requirements.

Ah, good point. This makes me sad :( Never mind then.

Let me then go back to your proposal:

From the options so far proposed, I prefer:

pa_stream.h:

struct PaStream{
    unsigned char reserved;
};

This achieves maximum opacity while introducing type safety and ensuring correctness. (Correctness proof: PaStream has alignment requirement 1

…I don't think that works, strictly speaking. As the SO answer you linked explains, the standard does not actually define any alignment requirement on structs. In other words, there is no actual guarantee that struct X { char x; } has the same alignment requirement as char (i.e. 1). According to that answer, a C implementation is allowed to enforce a minimum alignment requirement on a given struct that is stricter than its individual members would require, and there is even a somewhat plausible explanation as to why an implementation may want to do that (i.e. in architectures that use different pointer sizes, to be able to use the "coarse" size for all structs).

(Now to be clear, we are getting very deep into the weeds here and it seems highly implausible that any of this would actually matter in practice. But I agree with the general idea that we should strictly conform to standard C if at all possible.)

I'm wondering if, at this point, we have no choice but to actually define the struct as per my latest proposal (i.e. typedef struct PaStream PaStream with struct PaStream defined internally), or drop the entire thing altogether and keep it "type-unsafe".

Let me add some cents into the discussion. First of all, - there is no any type safety in C. PA is C library, so there is no type safety in PA and can't be achieved, no one shall assume that type safety exists. Even though modern compiler may show warning about incompatible type - it is done at sole discretion of the compiler to make life of C developers easier of course. So, one compiler may show warning but another may not show.

Sure, but I think everyone here is assuming a compiler configured to warn on incompatible pointer conversions. Anyone writing C without that warning enabled is just asking for trouble, and are not the intended audience for this PR.

This opaque type declaration is not correct:

typedef struct PaStream PaStream;

Why is it incorrect? (Assuming struct PaStream is defined internally)

In case of PA library it would be:

typedef struct PaUtilStreamRepresentation PaStream;

This is basically my original proposal, which I later altered to rename PaUtilStreamRepresentation to PaStream to mitigate concerns about mentioning an internal-sounding name in a public header (which to be clear is a purely cosmetic concern).

Normally, PaStream is converted to the hostapi's implementation which is not PaUtilStreamRepresentation, like in WASAPI it is PaWasapiStream, so compiler will complain anyway on attempt to assign to PaWasapiStream pointer the PaStream pointer. So I do not see any usefulness from this PR for the developer of host API and how it can improve code safety, nor it is the same useful for user-side code.

It is true that this change does not benefit Host API code, which will still have to convert between Host API specific stream structs and PaStream. The goal of the change is to improve the safety of user code, not PortAudio internal code. (Some of the proposals would simplify some of the pa_front internal code as a side effect.)

@dmitrykos
Copy link
Collaborator

dmitrykos commented Jun 10, 2024

In case of PA library it would be:

typedef struct PaUtilStreamRepresentation PaStream;

This is basically my original proposal, which I later altered to rename PaUtilStreamRepresentation to PaStream to mitigate concerns about mentioning an internal-sounding name in a public header (which to be clear is a purely cosmetic concern).

I think this is the way to go, i.e. without renaming, just use struct PaUtilStreamRepresentation directly that implies no any change inside the PA's implementation code.

I looked up your usage example from which this PR is originating and it seems here we have C++ to C bridge issue (imaginary). I.e. even though this change does not affect C code much (PA's or client's), in case of C++ situation becomes different and then C++ client's code will benefit from opaque PaStream type as obviously C++ compiler will enforce stricter type checking during compilation. Like in your example, you would not be able to compile the code.

Structure alignment of PaUtilStreamRepresentation and its members in this case is not a problem at all, because declared type is opaque and does not affect implementation (on PA side). PA returns pointer to PaUtilStreamRepresentation incapsulated inside the host's structure (which is always the first), so also not a problem and proposed change will work fine.

Another benefit of using opaque pointer like this during the debugging is that some debuggers, like MSVC, will match/resolve opaque type to implementation type and will show useful info about field values of implementation type. So can be useful in this respect either.

Just revert back to typedef struct PaUtilStreamRepresentation PaStream;. It will be a breaking change for C++ client's code in case PaStream type was misused but it is good anyway (we all have to progress with time).

@dechamps
Copy link
Contributor Author

I think this is the way to go, i.e. without renaming, just use struct PaUtilStreamRepresentation directly that implies no any change inside the PA's implementation code.

This was in the first version of this PR, but it was rejected: #917 (comment)

Instead I'm trying to sell a variant of that where PaUtilStreamRepresentation is renamed to PaStream everywhere, which seems like the cleanest approach to me.

this change does not affect C code much (PA's or client's)

It does provide benefits to C users, because I would expect reasonable C users to have incompatible pointer warnings enabled.

@dmitrykos
Copy link
Collaborator

It does provide benefits to C users, because I would expect reasonable C users to have incompatible pointer warnings enabled.

Only if warnings are not silenced! ;)

I do not quite understand what implications are behind "We reject the second version because it exposes a non-public detail of the implementation (the name of an internal implementation type).". Maybe @RossBencina could elaborate on that. PA is open source project, so the name of PaUtilStreamRepresentation is known and can't be hidden, moreover what is the threat in using PaUtilStreamRepresentation name in opaque pointer definition?

@RossBencina
Copy link
Collaborator

At this point I have two concerns:

  1. Contrary to the view of our learned colleague (congratulations on your MSc Dmitry!), in my view the following does not define an opaque type:

    typedef struct PaUtilStreamRepresentation PaStream;

    It explicitly links the implementation with the interface. My position is that this is undesirable because in an open source code base it provides a direct path for the user to see into the implementation. This may be useful for PortAudio developers debugging but it is going to cause headaches with uninitiated users thinking that they can access the representation. What we have now (typedef void PaStream) is significantly more opaque. I discussed this with Phil last week and I believe that he agrees with this view.

    That said, I do agree that this is fairly obviously valid, clear, comprehensible C code that would work and I much prefer it to renaming the PaUtilStreamRepresentation struct to PaStream.

  2. Even if we decide that some variation on typedef struct X PaStream is the right way to go, I remain concerned that the change from typedef void PaStream to typedef struct X PaStream is an ABI breaking change, and if so should not be merged at this time (or any subsequent 19.x release). But I am quite open to being persuaded that this is a non-issue.

@dmitrykos
Copy link
Collaborator

  1. It explicitly links the implementation with the interface. My position is that this is undesirable because in an open source code base it provides a direct path for the user to see into the implementation.

Hi Ross! Opaque pointer technique does exactly like this actually and as long as user code does not include implementation code where PaUtilStreamRepresentation is declared then opaque pointer of PaStream does not disclose any members of PaUtilStreamRepresentation and data hiding principle is achieved perfectly.

What we have now (typedef void PaStream) is significantly more opaque.

Not really. In both cases, if user wants to access internal implementation, then he can include declaration of PaUtilStreamRepresentation and then cast void pointer or opaque pointer to that type.

If there is confusion about PaUtilStreamRepresentation name in general then it could be renamed to PaStreamBase for example, as PaUtilStreamRepresentation serves really as very basement for all host-side stream implementations.

@RossBencina
Copy link
Collaborator

data hiding principle is achieved perfectly

I understand that is your opinion, but it is not mine.

On the other hand, among the available options, I think typedef struct PaUtilStreamRepresentation PaStream; is the only available "type safe option"

The problem is that we would be changing the API from void* to struct PaUtilStreamRepresentation*` and I don't (yet) believe that this is a valid ABI-safe transformation of our external interface.

@philburk
Copy link
Collaborator

This has been a terrific conversation. Thanks for all the great input.

We have decided that there is a benefit to doing the new typedef. It will help programmers avoid mistakes involving passing the wrong pointers.

But it does add a small and unmeasurable risk of breaking binary compatibility on some platforms. That sort of break is very hard to debug.

So we think it would be better to defer this change to V20 when we will be making other API/ABI incompatible changes.

@philburk philburk added this to the V20 milestone Jun 21, 2024
@RossBencina
Copy link
Collaborator

As Phil says, we're going to defer merging this for now.

I'd just like to note a few things while they're fresh in my mind...

  1. The "risk of breaking binary compatibility" that Phil speaks of is the change of typedef void PaStream; to typedef struct X PaStream; in portaudio.h, irrespective of what X is. In effect this changes the declarations of most PortAudio API stream functions from accepting a void* to accepting a struct X* argument. The binary incompatibility potentially arises if an application is compiled against the new header file but links against a PortAudio library binary (e.g. shared object, DLL) that was compiled with the old header file -- since the C standard says nothing about transparent binary compatibility of these pointer types when compiled independently. And in general you wouldn't change a key definition in a header file and expect existing library binaries to just work without recompilation.

  2. It seems like the most portable and least controversial approach is (call it "version B"):

- typedef void PaStream;
+ typedef struct PaUtilStreamRepresentation PaStream;

Where struct PaUtilStreamRepresentation is defined in a private header.

The current PR (call it "version A") does:

- typedef void PaStream;
+ typedef struct PaStream PaStream;

Where struct PaStream is never defined. This is in some sense more "opaque" than version B but potentially maybe theoretically slightly less portable since the implementation would contain casts between struct PaStream* and struct PaRepresentation* where struct PaStream is never defined. I don't think we've reached consensus on version A vs. version B.

  1. A couple of interesting links on diverse pointer representations in C:

https://c-faq.com/ptrs/generic.html

https://c-faq.com/null/machexamp.html

Thank you.

@dechamps
Copy link
Contributor Author

The binary incompatibility potentially arises if an application is compiled against the new header file but links against a PortAudio library binary (e.g. shared object, DLL) that was compiled with the old header file -- since the C standard says nothing about transparent binary compatibility of these pointer types when compiled independently.

I agree in theory, but in practice I'd be extremely surprised if any platform in practical use today had an ABI calling convention where void* is returned differently from struct X*. The second link you cited seems to confirm that: when asked, the only examples people seem to be able to cite are extremely niche, long gone architectures of purely historical interest. It seems vanishingly unlikely that anyone would even attempt to run PortAudio on such systems.

@RossBencina RossBencina added the P4 Priority: Low label Oct 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P4 Priority: Low
Projects
None yet
Development

Successfully merging this pull request may close these issues.

PaStream being void is type-unsafe and error-prone
4 participants