Listing document titles #79

AstridDBJ · 2024-01-09T15:46:21Z

AstridDBJ
Jan 9, 2024

Hi!

I'm quite new to Python (so this might be an easy fix), but I found LitStudy really interesting to look into.

I have tried to load documents from different files (from different databases) and used litstudy.types.DocumentSet.union to get a DocumentSet without duplicates. However, I would like to know which papers are then in this new collection/dataset. Is it possible to get LitStudy to list (e.g. in a Pandas DataFrame?) the titles of the documents in a specific dataset? Or provide a list/table of the titles just at any stage in the process?

stijnh · 2024-01-09T16:17:12Z

stijnh
Jan 9, 2024
Maintainer

Hi Astrid! Thanks for using litstudy and thanks for reporting this issue!

Unfortunately, at the moment there is no functionality to see which papers were removed when taking the union of multiple document sets.

Issue #68 discussed a similar problem, where there is no way to find the papers removed by unique(). An idea there was to add a duplicates() method that returns the papers removed by unique() (such that len(docset) == len(docset.unique()) + len(docset.duplicates())). Something similar could be implemented for union().

We are open to contributions and will accept relevant pull requests that add this functionality.


AstridDBJ · 2024-01-10T09:12:49Z

AstridDBJ
Jan 10, 2024
Author

Good to know, thanks! However, I'm actually more interested in the documents that are kept after the union (so not the removed duplicates); e.g. to know which documents I should look into for my review, and thus also the titles of the documents that the different kinds of histograms are based on. Is that possible to do with LitStudy?


stijnh · 2024-01-18T21:25:58Z

stijnh
Jan 18, 2024
Maintainer

You can always print the documents like this:

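# Union of the two document sets, with duplicates removed
docs_csv = docs_ieee | docs_springer

# Print the title of every document kept in the union
for doc in docs_csv:
    print(doc.title)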

Would that work? Each document has many attributes that you can access (such as the title, authors, publisher, etc.). See here: https://nlesc.github.io/litstudy/api/types.html#litstudy.types.Document
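For a table rather than printed lines, the same loop can collect the attributes into rows. The following is only a minimal sketch: `title_rows` is a hypothetical helper, not part of litstudy, and the attribute names `title` and `publication_year` are assumed from the Document API page linked above. Stand-in objects are used so the snippet runs without litstudy installed.

```python
from types import SimpleNamespace


def title_rows(docs, fields=("title", "publication_year")):
    """Collect selected attributes of each document into a list of dicts.

    `docs` can be any iterable of objects exposing the named attributes,
    for example the DocumentSet produced by the union.
    """
    rows = []
    for doc in docs:
        # getattr with a default avoids an error if an attribute is absent
        rows.append({field: getattr(doc, field, None) for field in fields})
    return rows


# Stand-in documents (hypothetical data) used in place of a real DocumentSet
example_docs = [
    SimpleNamespace(title="Paper A", publication_year=2020),
    SimpleNamespace(title="Paper B", publication_year=2021),
]

rows = title_rows(example_docs)
for row in rows:
    print(row["title"], row["publication_year"])
```

If pandas is installed, `pandas.DataFrame(rows)` should turn the list of dicts into a DataFrame with one column per field, which can then be sorted or exported like any other table.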

