-
Notifications
You must be signed in to change notification settings - Fork 312
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make PaStream type-safe #917
base: master
Are you sure you want to change the base?
Conversation
This changes PaStream* from a pointer to void to a pointer to an undefined, but unique type. This ensures functions that take a PaStream* reject incompatible types, for improved type safety. While this is technically an API change, it is unlikely to break anyone as it's unlikely user code depends on this type being void specifically (unless they suffer from the bugs that this change is designed to prevent!). Fixes PortAudio#915
Phil and I are happy to improve the type-safety, and approve of the first version because it retains
We reject the second version because it exposes a non-public detail of the implementation (the name of an internal implementation type). Phil has confirmed that the first version (use of incomplete type in the interface, then casting to a differently named struct in the implementation) is a widely used idiom in the Android NDK. However I am suspicious that it is not valid C, since to me it entails a strict aliasing violation: I don't believe you can cast directly between unrelated differently named types even if they are structurally compatible. I'm happy to be wrong about this, but have yet to find a source that says that it is valid ISO standard C. Although I might be confusing C and C++ aliasing rules. I think my concern can be addressed by declaring
And then in the implementation (
Allocations should then always be performed via
Not sure whether this is overkill but I believe that it is valid ISO standard C, while your first version (which we prefer in all other respects) may not be. |
Accepted answer on stack overflow https://stackoverflow.com/a/25672839/2013747
My emphasis. So we should do the union thing I think. P.S. a choice quote from the aforementioned link:
|
I don't see why this is a problem because there is nothing a user can do with that name. But fair enough, that's not a hill I plan to die on. Removed the second commit from the PR.
This is tricky indeed. The best explanation I could find on this is this one. My understanding is:
(Honestly I would be more worried about strict aliasing violations in the current PortAudio implementation code. For example, in |
Curiosity got the better of me so I asked about it on Stack Overflow. It looks like I'm not the only one finding the standard unclear on this. It's probably fair to say that compilers have to keep that kind of pattern working, as the alternative would be to watch the world explode. Anyway, to be clear, this self-inflicted disgression has nothing to do with the PR itself, which I still believe is safe - or at the very least it doesn't make the code any more problematic than it already is. |
But casting between unrelated types (or pointers to unrelated types) is simply illegal. There is no requirement for pointers to unrelated types to be interconvertible at all. I thought that was clear from the discussion in the link I provided. To be "related", I believe that the types have to either (1) inhabit the same union, or (2) A has to be the first member of B (hence share the same address). That is why the union is needed.
Ok, so if we do |
Not sure what I think about this, but here is another possibility for discussion: instead of a union, update
Obviously code that initializes |
This is not what the standard says. C17 6.3.2.3 p.7:
Given an undefined struct has no alignment requirements (AFAIK), converting a Again: aside from alignment considerations (which are irrelevant here), the casts don't matter. It is the pointer type used when actually dereferencing the pointer that matters (C17 6.5 p.7).
The discussion you linked is focused on unions, not pointer conversions. All the citations from the standard in the accepted answer are about unions; there is no discussion of pointer conversions aside from one sentence that is not backed by any citations at all. (Also, I suspect that discussion you linked uses a more narrow definition of "type punning", in which an object is deliberately accessed through different incompatible types. With that definition, it is indeed correct to say that "using pointer casts to do so violates strict aliasing", and it is indeed preferable to use an union. That is not relevant to what this PR is doing, though.)
At this point Ross, how would you feel about renaming typedef struct PaStream {
unsigned long magic;
struct PaStream *nextOpenStream; /**< field used by multi-api code */
PaUtilStreamInterface *streamInterface;
PaStreamCallback *streamCallback;
PaStreamFinishedCallback *streamFinishedCallback;
void *userData;
PaStreamInfo streamInfo;
} PaStream; This would reduce the need for casts in the |
Thanks for correcting me on this. Similar wording appears in the C89 standard (6.3.4 Cast operators):
You continue:
Let's be more precise here, since C compilers compile compilation units separately, the matter of whether Now, I don't think the quotes from the standard that you and I have provide above speak to the question of interconvertability of pointers with unknown alignment requirement, and I am not convinced that your reasoning is sound, for if it were, all pointers to structs would need to have the same representation as
You have persuaded me that the strict aliasing business is only relevant when accessing struct storage (dereferencing pointers). But as should be clear above, the casts appear to me to be the sticking point.
Unions are the only way to violate strict aliasing without introducing undefined behavior. But I agree with your argument that if we are not going to define
It is now potentially relevant only insofar as we need to implement something that is portable and I don't believe that what we have now is portable for the reasons I explained at the beginning of this comment. I'm sorry if this sounds like I'm moving the goal posts (I am). It is only because you have corrected an aspect that I was incorrect about in my earlier comments.
Phil and I will discuss this suggestion further. Based on last week's discussion I don't think it is ideal as we would prefer |
How about we change the definition of
This should allow In fact, there are two related options that I think would be legal: (1) put the above in |
I think that would work. But we would have to do a lot of editing internally. Not a big deal. I also like Ross' last suggestion of using the "char reserved". I can go with either one. |
Phil wrote:
We have two existing constraints for the public API:
And this PR attempts to add a third constraint:
Constraint (1) (correctness) is not negotiable. The current implementation ( Any proposal that somehow equates Any proposal that changes From the options so far proposed, I prefer:
This achieves maximum opacity while introducing type safety and ensuring correctness. (Correctness proof: If we decide to go forward with this approach we will need to include some explanatory doc comments. One final question/wrinkle: does changing the definition of |
That's an interesting point - I'll admit I kinda hand-waved the alignment requirements of a pointer to an incomplete struct because I intuitively expected that to always work, but it looks like I should have dug deeper into that particular rabbit hole. That said… the answer you linked to does mention a specific saving provision in the standard which I believe means the approach I suggested is still valid regardless:
I believe this citation ends this discussion about alignment requirements once and for all, doesn't it? Indeed, PortAudio always converts If PortAudio was converting between (What an interesting discussion! I did not expect to learn so much about the intricacies of aliasing and pointer conversions in standard C. The fact that the alignment requirements of a struct are not dictated by the alignment requirements of its members is definitely news to me. In retrospect I can see why it's designed that way - it's probably to make sure one can change the definition of a struct at will without that causing surprising ripple effects to the representation/alignment of pointers to that struct.)
I don't understand your objection? In my proposal In my opinion this is the best approach because it keeps everything opaque to PortAudio users, the header file stays "clean" (i.e. no mention of
As I explain above, I don't think this is necessary, as there is no need to work around any alignment issues. Given this, this approach would seem more confusing than useful.
I don't buy this argument, Ross. If users really want to poke around with the internal representation, they can already look at PortAudio code and from there it's already fairly obvious what the internal representation is, given PortAudio converts the
And my counter-proposal, as explained above, is:
Which I believe has the same properties as your proposal, but with the additional bonus of simplifying and clarifying PortAudio internal code.
Good question. C17 §6.2.5/28 says:
I don't think pointer to character types are required to have the same representation as pointers to structs. In fact, the SO answer you linked to discusses hypothetical architectures where pointers to structs could be made smaller than pointers to chars, due to the fact that an architecture can define the minimum alignment of a struct to be stricter than the minimum alignment of a char, and therefore the last bits can be assumed to be zero without having to store them. Given this, one could argue the proposed change is technically an ABI break, but I don't think that's true in practice for any of the architectures anyone could realistically run PortAudio on. I don't think any architecture used in the real world today would have |
I'm afraid not. It is interesting that the pointers all have the same representation (whatever that means), but the alignment requirement mentioned in your quote is the alignment requirement of the pointer-to-the-struct, whereas the interconvertability of pointers hinges on the aligment requirements of the pointed-to-objects (in our case |
Let me add some cents into the discussion. First of all, - there is no any type safety in C. PA is C library, so there is no type safety in PA and can't be achieved, no one shall assume that type safety exists. Even though modern compiler may show warning about incompatible type - it is done at sole discretion of the compiler to make life of C developers easier of course. So, one compiler may show warning but another may not show. In C there are no type casts like in C++ - C has static and weakly typed type system. If you know C++ than C-style cast is just like This PR shall not mention type safety as it misguides reader, but instead it shall mention - to allow optional C compiler warnings related to incompatible type conversion. Opaque pointer technique's main purpose is to hide implementation from the user - data abstraction and encapsulation problem. It can not add any type safety to C code as mentioned above, but it may additionally cause side effect of compiler warning related to incompatible type conversion. Implementation: This opaque type declaration is not correct: typedef struct PaStream PaStream; Opaque pointer technique declares incomplete structure typedef struct PaStreamImp PaStream; then inside source file we implement it: typedef struct _PaStreamImp
{
// my implementation goes here
} PaStreamImp; In case of PA library it would be: typedef struct PaUtilStreamRepresentation PaStream;
Normally, Just recently I completed my MSc Dissertation in Software Engineering, University of Oxford, accompanying STK project, which was covering exactly these topics, so you may find it interesting to read:
|
Ah, good point. This makes me sad :( Never mind then. Let me then go back to your proposal:
…I don't think that works, strictly speaking. As the SO answer you linked explains, the standard does not actually define any alignment requirement on structs. In other words, there is no actual guarantee that (Now to be clear, we are getting very deep into the weeds here and it seems highly implausible that any of this would actually matter in practice. But I agree with the general idea that we should strictly conform to standard C if at all possible.) I'm wondering if, at this point, we have no choice but to actually define the struct as per my latest proposal (i.e.
Sure, but I think everyone here is assuming a compiler configured to warn on incompatible pointer conversions. Anyone writing C without that warning enabled is just asking for trouble, and are not the intended audience for this PR.
Why is it incorrect? (Assuming
This is basically my original proposal, which I later altered to rename
It is true that this change does not benefit Host API code, which will still have to convert between Host API specific stream structs and |
I think this is the way to go, i.e. without renaming, just use I looked up your usage example from which this PR is originating and it seems here we have C++ to C bridge issue (imaginary). I.e. even though this change does not affect C code much (PA's or client's), in case of C++ situation becomes different and then C++ client's code will benefit from opaque Structure alignment of Another benefit of using opaque pointer like this during the debugging is that some debuggers, like MSVC, will match/resolve opaque type to implementation type and will show useful info about field values of implementation type. So can be useful in this respect either. Just revert back to |
This was in the first version of this PR, but it was rejected: #917 (comment) Instead I'm trying to sell a variant of that where
It does provide benefits to C users, because I would expect reasonable C users to have incompatible pointer warnings enabled. |
Only if warnings are not silenced! ;) I do not quite understand what implications are behind "We reject the second version because it exposes a non-public detail of the implementation (the name of an internal implementation type).". Maybe @RossBencina could elaborate on that. PA is open source project, so the name of |
At this point I have two concerns:
|
Hi Ross! Opaque pointer technique does exactly like this actually and as long as user code does not include implementation code where
Not really. In both cases, if user wants to access internal implementation, then he can include declaration of If there is confusion about |
I understand that is your opinion, but it is not mine. On the other hand, among the available options, I think The problem is that we would be changing the API from |
This has been a terrific conversation. Thanks for all the great input. We have decided that there is a benefit to doing the new typedef. It will help programmers avoid mistakes involving passing the wrong pointers. But it does add a small and unmeasurable risk of breaking binary compatibility on some platforms. That sort of break is very hard to debug. So we think it would be better to defer this change to V20 when we will be making other API/ABI incompatible changes. |
As Phil says, we're going to defer merging this for now. I'd just like to note a few things while they're fresh in my mind...
Where The current PR (call it "version A") does:
Where
https://c-faq.com/ptrs/generic.html https://c-faq.com/null/machexamp.html Thank you. |
I agree in theory, but in practice I'd be extremely surprised if any platform in practical use today had an ABI calling convention where |
This PR contains two commits:
PaStream
beingvoid
is type-unsafe and error-prone #915 by changingPaStream
to be a type-safe opaque type, instead ofvoid
(pointers to which can be implictly converted from anything).PaUtilStreamRepresentation
, simplifying the implementation by removing some casts.