Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add sv_vstring_get API around PERL_MAGIC_vstring #23075

Merged
merged 1 commit into from
Mar 18, 2025

Conversation

leonerd
Copy link
Contributor

@leonerd leonerd commented Mar 5, 2025

This PR adds a new API function, sv_vstring_get for interacting with what is currently implemented as PERL_MAGIC_vstring. By using this API callers no longer need to know about details of how that magic is implemented (or that it even is magic in the first place).

This is for similar reasons to the API added by #22971, as it gives future flexibility on how this behaviour is actually implemented.

I believe that the Storable module is supposed to be dual-life, so I added back-compat wrappers in Storable.xs to make it continue to work on older Perls before this API was added.

sv.c Outdated
{
PERL_ARGS_ASSERT_SV_VSTRING_SET;

sv_magic(sv, NULL, PERL_MAGIC_vstring, pv, len);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is meant to happen if sv already has a vstring set?

sv_magic() will silently refuse to add the same magic again.

For normal assignment sv_force_normal_flags() removes the vstring magic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm yes that's a fun question. The existing use-cases where this API was called in core or Storable.xs would know that the SV didn't already have the magic, but as a more general API it might not know that. Perhaps set should update any existing magic to change the stored values? This is going to mean a Safefree() and savepvn() call. Hrm.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hrm, thinking further: I can add some code for that case, but there's currently no code that would exercise it to test it. I guess I can add some in XS-APItest or somesuch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now with behaviour updated and some tests added.

Comment on lines 299 to 315
#ifndef sv_vstring_get
#define sv_vstring_get(sv,pvp,lenp) S_sv_vstring_get(aTHX_ sv,pvp,lenp)
static bool S_sv_vstring_get(pTHX_ SV *sv, const char **pvp, STRLEN *lenp)
{
MAGIC *mg;
if(!SvMAGICAL(sv) || !(mg = mg_find(sv, PERL_MAGIC_vstring)))
return FALSE;

*pvp = mg->mg_ptr;
*lenp = mg->mg_len;
return TRUE;
}
#endif

#ifndef sv_vstring_set
#define sv_vstring_set(sv,pv,len) \
STMT_START { \
sv_magic((sv), NULL, PERL_MAGIC_vstring, (pv), (len)); \
SvRMAGICAL_on(sv); \
} STMT_END
#endif

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This belongs in Devel::PPPort

Copy link
Contributor

@Leont Leont Mar 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds like an excellent idea that I fully support someone else doing.

Really - I have no idea how to edit or maintain PPPort. The entire process seems utterly opaque to me. If someone else wants to copypaste it over there then I can update this PR to match.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a more general problem that almost no-one knows how to do that. I suspect that the real solution is that several of us learn to do so because otherwise this keeps being a recurring problem.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can help getting this into D:P

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it's fine with me to merge this, and we will release a new D:P later

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It sounds like overall the process for adding this to D::P is nontrivial enough, that I'd suggest it might be easier to first merge this PR as-is, and then as a separate change, move these support helpers out to there. That way it doesn't hold up this particular change and the work depending on it.

sv.c Outdated
*/

void
Perl_sv_vstring_set(pTHX_ SV * const sv, const char *pv, STRLEN const len)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why isn't Perl_sv_vstring_set() sanitizing the user's input args const char *pv, STRLEN len? Why or why not, are both good answers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't follow the question at all.

@bulk88
Copy link
Contributor

bulk88 commented Mar 6, 2025

What happens if arg SV* dest_sv passed by caller into sv_vstring_set() is already IOK_on/NOK_on/POK_on/ROK_on/SvIsCOW_on?

What if PP-level SvIVX() vs SvNVX() vs SvPVX() vs sv_vstring_set(sv,pv,len) are 4 different random [i.e. uninit mem] PP $scalar values? What should sv_vstring_set() do? something? nothing? any answer is a good answer

sv.c Outdated

If the given SV has vstring magic, stores the string pointer and length into
the variables addressed by C<pvp> and C<lenp>, and returns true. If not,
returns false.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest a sentence saying,

stores the string pointer and length into
the variables addressed by C<pvp> and C<lenp>. The lifespan
of string pointer is until the next FREETMPS;.

or

The lifespan of string pointer is until next LEAVE;.

or

The lifespan of string pointer is of undefined storage and from
undefined allocator.   Memcpy() it ASAP if necessary.  It is not documented
 if string pointer is backed by a Newx()/Safesysmalloc() block,
somewhere into the middle of a Newx() block, or is a SVPV COW,
or is a shared string table HEK or is a temp ptr owned by MORTAL
or SAVESTACK or is a process global ptr to a C const "" string literal.

Version # 3 documents lifetime as TBD/opaque/UB.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None of those apply; but I've now added some words to point out the string buffer is owned by the SV itself (or at least, the magic annotation on it)

@@ -2597,18 +2620,16 @@ static int store_scalar(pTHX_ stcxt_t *cxt, SV *sv)
string:

#ifdef SvVOK
if (SvMAGICAL(sv) && (mg = mg_find(sv, 'V'))) {
if (sv_vstring_get(sv, &vstr_pv, &vstr_len)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I were a serializing XS mod, and performance is important, sv_vstring_get needs to be #define SvVSTRING(sv, pv, n) with a fast shortcut on SvMAGICAL(sv) && before executing "heavy" extern exported PLT/GOT C symbol Perl_sv_vstring_get().


bool
Perl_sv_vstring_get(pTHX_ SV * const sv, const char **pvp, STRLEN *lenp)
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is retval of Perl_sv_vstring_get of type bool and not of type STRLEN?

What does if(SvVOK(sv) && SvVCUR(sv) == 0) { NOOP; } mean on a PP/XS/C level?

why not move arg 3 STRLEN *lenp to retval?

Code like if(!len) { die("bad input"); } has universal meaning.

if(1) {
        sv_setpvs(TARG, "0 but true");
        XPUSHTARG;
        RETURN;
    }

is very rare in Perl/all SW dev.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is retval of Perl_sv_vstring_get of type bool and not of type STRLEN?

It matches the similar API shape I added with sv_regex_global_pos_get().

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of more use of retvals where no ambiguity is possible. Should sv_regex_global_pos_get() also return its STRLEN?

@leonerd leonerd force-pushed the sv_vstring-API branch 2 times, most recently from 26f92cf to 49d4e7b Compare March 7, 2025 20:14
sv.c Outdated
Comment on lines 17909 to 17915
=for apidoc sv_vstring_set

Applies vstring magic to the given SV, storing the string given by C<pv> for
C<len> bytes into it. If the SV already had vstring magic, the previous
stored string is freed and replaced.

=cut
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably point at upg_version() and new_version(), which is what you want if you're not using a pre-parsed version.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about this...

Storable uses this mechanism to restore the magic to match an SV it was originally provided, eg. from a vX.Y:

$ perl -MDevel::Peek -e 'Dump(v1.1.2)'
SV = PVMG(0x556625c6bd00) at 0x556625c29fe0
  REFCNT = 1
  FLAGS = (RMG,POK,IsCOW,READONLY,PROTECT,pPOK)
  IV = 0
  NV = 0
  PV = 0x556625c325e0 "\x01\x01\x02"\0
  CUR = 3
  LEN = 10
  COW_REFCNT = 0
  MAGIC = 0x556625c531d0
    MG_VIRTUAL = 0
    MG_TYPE = PERL_MAGIC_vstring(V)
    MG_LEN = 6
    MG_PTR = 0x556625c1bb60 "v1.1.2"

Note that the magic matches the PV.

I do think that the docs should indicate this is only useful when the PV and the magic are kept in sync (or someone is going to see unexpected results).

@leonerd
Copy link
Contributor Author

leonerd commented Mar 10, 2025

Having talked with @khwilliamson both on IRC and email, we are progressing with looking at how to add this to Devel::PPPort, but it seems the process is somewhat more involved than anticipated. It would be nice if that delay didn't hold up this PR, because it would ba shame for all API additions to be held hostage by the bus-factor of PPPort.

@Leont
Copy link
Contributor

Leont commented Mar 11, 2025

Having thought more about it, I'm not sure sv_vstring_set should be public API (though it may well be fine as internal API). Nor am I sure it has the right shape in the first place. In its current form, you can easily set a vstring value that doesn't match with its string value, which seems wrong.

@leonerd
Copy link
Contributor Author

leonerd commented Mar 12, 2025

Having thought more about it, I'm not sure sv_vstring_set should be public API (though it may well be fine as internal API). Nor am I sure it has the right shape in the first place. In its current form, you can easily set a vstring value that doesn't match with its string value, which seems wrong.

I'd be perfectly happy saying it shouldn't be public API. Unfortunately, Storable.xs needs to call it. So that's a bit out of the bag.

How about, instead of two separate steps of "make a plain SVt_PV", "add vstring magic to it", there was a single API function on "make a vstring, with these two stringy parts"? Effectively a dualstring var. ;)

Perhaps that's a more robust solution here?

@tonycoz
Copy link
Contributor

tonycoz commented Mar 16, 2025

How about, instead of two separate steps of "make a plain SVt_PV", "add vstring magic to it", there was a single API function on "make a vstring, with these two stringy parts"? Effectively a dualstring var. ;)

Are you looking at some particular use-case here?

If you have a x.y.z version then you can use new_version() to make a version SV out of it.

The only use I can see for this is for the deep serialisation that Storable does, Sereal, which I believe tries to be reasonably deep doesn't bother restoring version magic.

@Leont
Copy link
Contributor

Leont commented Mar 17, 2025

The only use I can see for this is for the deep serialisation that Storable does, Sereal, which I believe tries to be reasonably deep doesn't bother restoring version magic.

Exactly.

@leonerd
Copy link
Contributor Author

leonerd commented Mar 17, 2025

Honestly, the only "use case" is to get rid of as many lines of code that refer to PERL_MAGIC_vstring as possible. I'm looking for any way to achieve that. Storable.xs has an unfortunate rare cornercase in needing to create these, as well as inspect existing ones, so that makes it a bit difficult to just get rid of, without providing some kind of externally-visible API to let it do what it needs.

@leonerd leonerd changed the title Create abstraction API around PERL_MAGIC_vstring Add sv_vstring_get API around PERL_MAGIC_vstring Mar 17, 2025
@leonerd
Copy link
Contributor Author

leonerd commented Mar 17, 2025

As the get part of the API seemed relatively uncontentious, I have removed the set part for now to just this. Hopefully we can at least put this part in, and come back to look at the create or set side another time.

@leonerd leonerd added the squash-before-merge Author must squash the commits down before merging to blead label Mar 17, 2025
@leonerd
Copy link
Contributor Author

leonerd commented Mar 18, 2025

Actually now I look at this API, I wonder if it should be shaped to be similar to SvPV, as in:

STRLEN vstr_len;
const char *vstr_pv = SvVSTRING(sv, vstr_len);

@Leont
Copy link
Contributor

Leont commented Mar 18, 2025

Actually now I look at this API, I wonder if it should be shaped to be similar to SvPV, as in:

That is also what I would have done.

@leonerd
Copy link
Contributor Author

leonerd commented Mar 18, 2025

Reviewers: PTAL. I've now added this as a SvVSTRING macro, and removed the set function for now. Hopefully this can go in in time for 5.41.10.

This new function and wrapper macro mean that caller code does not have
to directly rely on (or be aware of) the `PERL_MAGIC_vstring` type. The
intent of the API now works independently of the current implemention as
magic.
@leonerd leonerd merged commit 80f24f4 into Perl:blead Mar 18, 2025
33 checks passed
@leonerd leonerd deleted the sv_vstring-API branch March 18, 2025 15:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
squash-before-merge Author must squash the commits down before merging to blead
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants