Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add real-life examples of Python RPM #3177

Merged
merged 1 commit into from
Aug 26, 2024

Conversation

FrostyX
Copy link
Member

@FrostyX FrostyX commented Jun 21, 2024

filename = os.path.basename(path)
hdr = ts.hdrFromFdno(fp.fileno())
ts.addInstall(hdr, filename, "i")
ts.run(lambda *args: print(args), {})
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have never actually installed anything through the Python API in real life, so I am not sure what the ts.run arguments should be. Same for the erasing example above.

Also, this script fails with

# FIXME this snippet fails with
TypeError: 'NoneType' object cannot be interpreted as an integer
FATAL ERROR: python callback ??? failed, aborting!

Can you please help me finish this example?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'll need to supply a callback that handles at least file open and close (rpm.RPMCALLBACK_INST_OPEN_FILE and rpm.RPMCALLBACK_INST_CLOSE_FILE), based on the key you supply. It's all ... rather arcane.

See "Callbacks" and the following slides in https://web.archive.org/web/20050320013335/http://www.ukuug.org/events/linux2004/programme/paper-PNasrat-1/rpm-python-slides/frames.html

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That said, the callback example actually sets a rather bad example: you don't want to use a header in the key, because it wastes enormous amounts of memory. If you want output a NEVRA in the callback, pass that string instead.

f"Group : {hdr[rpm.RPMTAG_GROUP]}\n"
f"Size : {hdr[rpm.RPMTAG_SIZE]}\n"
f"License : {hdr[rpm.RPMTAG_LICENSE]}\n"
f"Signature : TODO\n"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the correct tag here, please?

Copy link
Member

@pmatilai pmatilai Jun 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The answer is not a single tag, it's either RPMTAG_RSAHEADER or RPMTAG_DSAHEADER (will cover both original DSA and ECDSA)

FWIW, most people prefer the shorter hdr['license'] syntax, but of course using rpm.RPMTAG_* symbols, errors are more up front.

@FrostyX
Copy link
Member Author

FrostyX commented Jun 21, 2024

All of the proposed examples show how to implement various rpm commands. I don't know whether it would be worth it having them in a separate (sub)directory or not.

ts = rpm.TransactionSet()
mi = ts.dbMatch("name", name)
hdr = next(mi, None)
if hdr:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be in a loop just like in your querying-all-installed version - there can always be more than one package with a given name, and callers better handle that correctly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looping over the return value (for hdr in mi:) also has the advantage that you can skip the next() call and the if hdr check; if an empty list is returned it will simply cause the loop the be skipped.

Copy link
Contributor

@ferdnyc ferdnyc Jul 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(You can also check for an empty list after the loop, e.g. this will print "No results" only when the list is empty):

mi = ts.dbMatch("name", name)
for hdr in mi:
    ...
if not mi:
    print("No results")

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I take that back. Because mi is an iterator, not a list, it will become False when exhausted, so that code will always print "No results" at the end of the loop. My mistake.

ts = rpm.TransactionSet()
mi = ts.dbMatch("name", name)
hdr = next(mi, None)
if hdr:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be a loop to handle multiple identically named packages.


ts = rpm.TransactionSet()
mi = ts.dbMatch()
mi.pattern("name", rpm.RPMMIRE_GLOB, "kernel*")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would make the package name in these examples something you pass on the command line to make them actually usable on real systems which are unlikely to have "hello" package installed.


ts = rpm.TransactionSet()
mi = ts.dbMatch()
for hdr in mi:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't wrong in such a small example of course, but assigning the match iterator to a variable can leave the database cursor open for much longer than intended. With read-cursors it's not a big deal with in the post BerkeleyDB world, but when you don't actually need the match iterator handle outside the loop (ie the usual case), I'd recommend this style instead:

for hdr in ts.dbMatch(...):
    # do stuff

Copy link
Member

@pmatilai pmatilai Jun 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar issues exist with the 'ts' handle of course - doesn't matter at all in such small examples but people commonly don't realize that the 'ts' handle will also keep the database open for much longer than intended. But, in typical usage the same ts handle is used in multiple operations and in that case it does make sense to assign it to a variable, whereas the match iterator is mostly useless once looped through once.

@pmatilai
Copy link
Member

pmatilai commented Jun 24, 2024

All of the proposed examples show how to implement various rpm commands. I don't know whether it would be worth it having them in a separate (sub)directory or not.

Anything you do with the bindings is going to resemble something rpm does, I don't see these being any different from the pre-existing examples in that, and I doubt we'll have too many examples to reasonably fit into a single directory anytime soon.

One observation here is that these names are awfully verbose in rpm, the land of the terse, and look a bit odd because of that. So maybe some shorter names would do 😄 Something like 'rpm-qa.py' seems like equally telling to me (but maybe not to someone less familiar with rpm, so dunno)

@ffesti ffesti self-assigned this Jul 3, 2024
@ferdnyc
Copy link
Contributor

ferdnyc commented Jul 29, 2024

If "real-world" examples are the goal, mightn't it be better to mine actual real-world code that uses the Python API for examples? DNF comes to mind, as a blindingly-obvious source of existing, working code based on the RPM Python bindings. 😁

(The examples can be simplified/sanitized for clarity, of course, but they'll be far more "real-world" if they're based on... well... real code.)

@pmatilai
Copy link
Member

Dnf is not the best of sources for rpm API usage. It does most of its work on the libsolv level rather than rpm, and the little rpm API usage there is suffers from the yum heritage.

@FrostyX
Copy link
Member Author

FrostyX commented Jul 30, 2024

Thank you all for so many comments. I am sorry for being inactive on this PR for a while. I was partially on vacation and now trying to catch up. I will update the PR soon.

@FrostyX
Copy link
Member Author

FrostyX commented Aug 3, 2024

Once again, sorry for the delay.
@pmatilai I think I addressed all your comments. Can you please take a look again?

If "real-world" examples are the goal, mightn't it be better to mine actual real-world code that uses the Python API for examples?

I am not aware of any reasonably well-known higher-level tools that use RPM Python API to query, install, and erase packages. I know a few minor tools (some of them mine) that sprinkle some RPM Python API usage here and there but nothing that I'd safely recommend as an example.

It may be a good idea to link such tools in the documentation (Let me know your stance on this, I will be submitting a documentation PR once this one gets merged).

@pmatilai already addressed DNF and its libsolv usage but I'd maybe disagree in general.

I went for example Python implementations of commonly used rpm commands because they are quite easy to understand, it is IMHO a reasonable expectation that Python RPM API users know rpm commands, and also devs will likely compare values they get from the API with whatever rpm prints. Or see some rpm output and say to themself "how the heck do I get this value via API?". At least I do that often.

In the documentation, I can use a different terminology than "real-world" if it is deceiving :-)

ts = rpm.TransactionSet()
for name in sys.argv[1:]:
ts.addErase(name)
ts.run(lambda *args: ..., "")
Copy link
Member

@pmatilai pmatilai Aug 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A transaction with nothing but erase elements probably does happen to work without a callback. But one should always supply a callback to the transaction, especially these being "official" examples.

Maybe demonstrate some other callback events for this? Or combine with the install example, because we don't want people to think one can only do one kind of operation in a transaction even if the rpm cli limits you that way.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But one should always supply a callback to the transaction, especially these being "official" examples.

I agree. And even though the lambda works there, it's quite WTF-looking.
Updated, PTAL

Maybe demonstrate some other callback events for this? Or combine with the install example, because we don't want people to think one can only do one kind of operation in a transaction even if the rpm cli limits you that way.

This is what I don't want to do. My plan was to explain this in the documentation page, that it is possible to combine different operations within one transaction. If you want to, we can add some python example showing it.

But I would like to keep this example as simple as possible, showing just how to do rpm -e hello

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just think it gives you a rather strange impression of the API when it's demonstrated as two separate cases with two entirely different callbacks. But okay, that can be resolved with two simple comments, something like below, respectively for erase and install examples:

# erasure-only transactions don't strictly require anything from the callback
# most operations require at least rpm.RPMCALLBACK_INST_OPEN_FILE  and rpm.RPMCALLBACK_INST_CLOSE_FILE to be handled

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I put the comment into the rpm-e.py callback

for path in sys.argv[1:]:
with open(path, "r") as fp:
hdr = ts.hdrFromFdno(fp.fileno())
ts.addInstall(hdr, (hdr.nvr, path), "i")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'll want to use "u" in there (making this the equivalent of 'rpm -U' instead). -i is rather special case operation that is best left to people who know to look for it. Sorry for missing this earlier.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, thank you. Updated.

@pmatilai
Copy link
Member

Okay fair enough, I think we've reviewed and bikeshedded this enough by now 😄
Thanks for the examples!

@pmatilai pmatilai merged commit e927be7 into rpm-software-management:master Aug 26, 2024
1 check passed
@FrostyX
Copy link
Member Author

FrostyX commented Aug 26, 2024

Thank you very much for the thorough review, I learned a lot of new things through it.
I am planning to follow this PR up with changes to the RPM Language Bindings on https://rpm.org/documentation.html

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

None yet

4 participants