Commit 77da3b5

Merge pull request #70 from lisc-tools/vers

[MNT] - Updates for new version

TomDonoghue authored Jun 3, 2021
2 parents 05b553c + d8497d4 commit 77da3b5

Showing 12 changed files with 121 additions and 26 deletions.
15 changes: 11 additions & 4 deletions README.rst
Original file line number Diff line number Diff line change
@@ -31,7 +31,7 @@ Overview
--------

LISC acts as a wrapper and connector between available APIs, allowing users to collect data from and
-about scientific articles, and to do analyses on this data, such as performing automated meta-analyses.
+about scientific articles, and perform analyses on this data, such as performing automated meta-analyses.

A curated list of some projects enabled by LISC is available on the `projects <https://github.com/lisc-tools/Projects>`_ page.

@@ -41,8 +41,8 @@ Supported APIs & Collection Approaches
Supported APIs and data collection approaches include:

- The `EUtils <https://www.ncbi.nlm.nih.gov/books/NBK25497/>`_ API, which provides access to literature data,
-including the `Pubmed <https://pubmed.ncbi.nlm.nih.gov/about/>`_ database, from which counts and co-occurences
-of terms and/or text and meta-data from identified articles can be collected.
+including the `Pubmed <https://pubmed.ncbi.nlm.nih.gov/about/>`_ database, from which text and meta-data from
+identified articles can be collected, as well as analyses such as counts and co-occurrences of terms.
- The `OpenCitations <https://opencitations.net>`_ API, which provides access to citation data, from which
citation and reference information can be collected.
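The counts & co-occurrence idea described above can be illustrated with a toy sketch. This is not lisc's implementation (real collections query the EUtils API); the function name and the sample abstracts here are hypothetical, purely for illustration:

```python
from itertools import combinations

def count_cooccurrences(abstracts, terms):
    """Count how many texts mention each term, and each pair of terms together."""
    counts = {term: 0 for term in terms}
    co_counts = {pair: 0 for pair in combinations(terms, 2)}
    for text in abstracts:
        lowered = text.lower()
        # Terms present in this text, kept in the order of the input term list,
        # so pair keys line up with the combinations(terms, 2) keys above
        present = [term for term in terms if term in lowered]
        for term in present:
            counts[term] += 1
        for pair in combinations(present, 2):
            co_counts[pair] += 1
    return counts, co_counts

abstracts = [
    "Frontal cortex activity relates to attention.",
    "Attention and memory interact in the cortex.",
    "Memory consolidation during sleep.",
]
counts, co_counts = count_cooccurrences(abstracts, ["attention", "memory", "cortex"])
print(counts)                                  # {'attention': 2, 'memory': 2, 'cortex': 2}
print(co_counts[("attention", "memory")])      # 1
```

lisc's Counts object wraps this same idea, but gets the counts from Pubmed search hits rather than local text matching.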

@@ -52,7 +52,7 @@ Analysis & Other Functionality
In addition to connecting to external APIs, LISC also provides:

- A database structure, and save and load utilities for storing collected data
-- Custom data objects for managing collected data
+- Custom data objects for managing and preprocessing collected data
- Functions and utilities to analyze collected data
- Data visualization functions for plotting collected data and analysis outputs
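The save and load utilities in the list above can be pictured with a minimal pickle-based analogue. This is a sketch only: lisc's `save_object` / `load_object` work together with its database structure, and the `.p` file suffix used here is an assumption:

```python
import pickle
import tempfile
from pathlib import Path

def save_object(obj, name, directory):
    """Save an object to a pickle file in the given directory."""
    with open(Path(directory) / (name + '.p'), 'wb') as f:
        pickle.dump(obj, f)

def load_object(name, directory):
    """Load a previously saved object from the given directory."""
    with open(Path(directory) / (name + '.p'), 'rb') as f:
        return pickle.load(f)

# Round-trip a small collected-data stand-in through a temporary directory
with tempfile.TemporaryDirectory() as db_dir:
    save_object({'terms': ['attention', 'memory']}, 'collected', db_dir)
    data = load_object('collected', db_dir)
    print(data['terms'])   # ['attention', 'memory']
```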

@@ -97,6 +97,13 @@ Optional dependencies, used for plotting, analyses & testing:
Install
-------

Stable releases of LISC are available on the GitHub
`release page <https://github.com/lisc-tools/lisc/releases>`_, and on
`PyPI <https://pypi.org/project/lisc/>`_.

Descriptions of updates and changes across versions are available in the
`changelog <https://lisc-tools.github.io/lisc/changelog.html>`_.

**Stable Release Version**

To install the latest stable release, you can install from pip:
42 changes: 34 additions & 8 deletions doc/api.rst
@@ -55,7 +55,7 @@ Base Object
Data Objects
------------

-Custom objects for storing extracted data.
+Custom objects and related functions for storing and managing extracted data.

Term Object
~~~~~~~~~~~
@@ -68,6 +68,16 @@ Term Object

Term

Metadata Object
~~~~~~~~~~~~~~~

.. currentmodule:: lisc.data

.. autosummary::
   :toctree: generated/

   MetaData

Articles Objects
~~~~~~~~~~~~~~~~

@@ -79,29 +89,41 @@ Articles Objects
Articles
ArticlesAll

-Metadata Object
-~~~~~~~~~~~~~~~
+Articles Processing
+~~~~~~~~~~~~~~~~~~~

-.. currentmodule:: lisc.data
+.. currentmodule:: lisc.data.process

.. autosummary::
   :toctree: generated/

-   MetaData
+   process_articles

Data Collection Functions
-------------------------

Functions for collecting data from supported APIs.

EUtils
~~~~~~

.. currentmodule:: lisc

.. autosummary::
   :toctree: generated/

   collect_info
-   collect_counts
   collect_words
+   collect_counts

OpenCitations
~~~~~~~~~~~~~

.. currentmodule:: lisc

.. autosummary::
   :toctree: generated/

   collect_citations

URLs & Requests Objects
@@ -200,22 +222,26 @@ Utilities and file management.
File IO
~~~~~~~

-.. currentmodule:: lisc.utils
+.. currentmodule:: lisc.utils.io

.. autosummary::
   :toctree: generated/

   save_object
   load_object
+   load_txt_file
   load_api_key

Database Management
~~~~~~~~~~~~~~~~~~~

-.. currentmodule:: lisc.utils
+.. currentmodule:: lisc.utils.db

.. autosummary::
   :toctree: generated/

   SCDB
   create_file_structure
+   check_file_structure
+   get_structure_info
+   check_directory
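The database-structure utilities listed above can be pictured with a toy analogue that builds a nested folder layout. The folder names and the helper name `make_file_structure` below are hypothetical, not SCDB's actual layout:

```python
import os
import tempfile

# Hypothetical folder layout, for illustration only
STRUCTURE = {'terms': [], 'data': ['counts', 'words'], 'figures': []}

def make_file_structure(base):
    """Create a nested folder structure under a base directory."""
    for folder, subfolders in STRUCTURE.items():
        os.makedirs(os.path.join(base, folder), exist_ok=True)
        for sub in subfolders:
            os.makedirs(os.path.join(base, folder, sub), exist_ok=True)

with tempfile.TemporaryDirectory() as base:
    make_file_structure(base)
    print(sorted(os.listdir(base)))                        # ['data', 'figures', 'terms']
    print(sorted(os.listdir(os.path.join(base, 'data'))))  # ['counts', 'words']
```

lisc's `create_file_structure` plays this role for its own layout, with `check_file_structure` and `get_structure_info` inspecting what exists.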
30 changes: 30 additions & 0 deletions doc/changelog.rst
@@ -0,0 +1,30 @@
Code Changelog
==============

This page contains the changelog for the `lisc` module and any notes on updating between versions.

Notes on the specific updates related to each release are also available on the
`release page <https://github.com/lisc-tools/lisc/releases>`_.

Note that between release versions, the general code API should stay consistent, so code from previous releases should generally be compatible with this release. However, internal objects and functions may change, such that saving / loading objects and processing already collected data may differ slightly between versions. It is therefore recommended to collect and process data within the same version of the module. If you need to load or process data from a different release version, check that the processing still works, and update your code where needed.

0.2.X
-----

The 0.2.X series is the current release series of the module.

This series is a non-breaking update on the prior release.

The main updates in this release include:

- Internal updates to the LISC objects, and processing (including PRs #36, #39, #50, #60, #67, #68)
- Internal updates to the collection procedures (including PRs #49, #53, #61)
- Updates to available plotting utilities and saving (including PRs #41, #54, #66)
- Extended Pubmed collection to use additional settings, including setting date ranges (including PR #44)
- Add OpenCitations option to collect DOIs of citing papers (including PR #27)
- Miscellaneous bug fixes (including PRs #62, #69)
- General documentation updates (including PRs #30, #31, #38, #43, #45, #46, #64)

0.1.X
-----

The 0.1.X series was the initial release series of the module.
1 change: 1 addition & 0 deletions doc/contents.rst
Expand Up @@ -6,5 +6,6 @@ Table of Contents

api.rst
reference.rst
changelog.rst
auto_tutorials/index.rst
auto_examples/index.rst
5 changes: 3 additions & 2 deletions lisc/objects/base.py
@@ -129,7 +129,8 @@ def get_term(self, label):
return term


-    def add_terms(self, terms, term_type=None, directory=None, append=False, check_consistency=True):
+    def add_terms(self, terms, term_type=None, directory=None,
+                  append=False, check_consistency=True):
"""Add terms to the object.
Parameters
@@ -313,7 +314,7 @@ def unload_terms(self, term_type='terms', reset=True, verbose=True):
self.unload_labels(verbose=verbose)

else:
-if verbose:
+if verbose and flatten(getattr(self, term_type)):
print('Unloading {}.'.format(term_type))
setattr(self, term_type, list())

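The guard added in `unload_terms` above prints only when there are actually terms to unload, since flattening an empty nested list is falsy. A minimal sketch of such a helper (lisc's actual `flatten` may differ, e.g. in handling deeper nesting):

```python
def flatten(lst):
    """Flatten a list of lists into a single list."""
    return [item for sub in lst for item in sub]

print(flatten([['brain', 'cortex'], ['memory']]))  # ['brain', 'cortex', 'memory']
print(bool(flatten([[], []])))                     # False -> nothing to unload, stay quiet
```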
2 changes: 1 addition & 1 deletion lisc/utils/__init__.py
@@ -1,4 +1,4 @@
"""Utilities."""

from .db import SCDB, create_file_structure
-from .io import save_object, load_object, load_api_key
+from .io import save_object, load_object, load_api_key, load_txt_file
2 changes: 1 addition & 1 deletion lisc/version.py
@@ -1 +1 @@
-__version__ = '0.2.0-dev'
+__version__ = '0.2.0'
6 changes: 6 additions & 0 deletions requirements-docs.txt
@@ -4,3 +4,9 @@ sphinx_gallery
sphinx_bootstrap_theme
sphinx-copybutton
numpydoc

# Optional dependencies that are required for building documentation
matplotlib
seaborn
scipy
wordcloud
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
"""
-Tutorial 03: Words Collection
+Tutorial 01: Words Collection
=============================
Collecting literature data, including text and metadata for specified search terms.
@@ -9,8 +9,7 @@
# Words Analysis
# --------------
#
-# Another way to analyze the literature is to collect text and meta-data from
-# all articles found for requested search terms.
+# The 'Words' approach collects text and meta-data from articles found for requested search terms.
#

###################################################################################################
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
"""
-Tutorial 04: Words Analysis
+Tutorial 02: Words Analysis
===========================
Analyzing collected text data and metadata.
@@ -85,16 +85,41 @@
#
# The `results` attribute contains a list of :class:`~.Articles` objects, one for each term.
#
-# If you run the :meth:`~.Words.process_combined_results` method, then the
-# `combined_results` attribute will contain the corresponding list of
-# :class:`~.ArticlesAll` objects, also one for each term.
#

###################################################################################################

# Reload the words object, specifying to also reload the article data
words = load_object('tutorial_words', directory=SCDB('lisc_db'), reload_results=True)

###################################################################################################
#
# Note that the reloaded data is the raw data from the data collection.
#
# The :meth:`~.Words.process_articles` method can be used to do some preprocessing on the
# collected data.
#
# By default, the :func:`~.process_articles` function is used to process articles, which
# preprocesses journal and author names, and tokenizes the text data. You can also pass in
# a custom function to apply custom processing to the collected articles data.
#
# Note that some processing steps, like converting to the ArticlesAll representation,
# will automatically apply article preprocessing.
#

###################################################################################################

# Preprocess article data
words.process_articles()

###################################################################################################
#
# We can also aggregate data across articles, just as we did before, directly in the Words object.
#
+# If you run the :meth:`~.Words.process_combined_results` method, then the
+# `combined_results` attribute will contain the corresponding list of
+# :class:`~.ArticlesAll` objects, also one for each term.
#

###################################################################################################

# Process collected data into aggregated data objects
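The preprocessing described in the tutorial above (normalizing journal and author names, tokenizing text) can be sketched in plain Python. This illustrates the kind of steps involved, not lisc's `process_articles` implementation; the 'Last, F' normalization convention shown is hypothetical:

```python
import re

def tokenize(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def normalize_author(name):
    """Normalize an author name to a 'Last, F' form (hypothetical convention)."""
    last, _, first = name.partition(', ')
    return '{}, {}'.format(last, first[:1]) if first else last

print(tokenize("Collected text Data!"))        # ['collected', 'text', 'data']
print(normalize_author("Donoghue, Thomas"))    # Donoghue, T
```

In lisc, passing a custom function to article processing allows swapping in steps like these for the defaults.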
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
"""
-Tutorial 01: Counts Collection
+Tutorial 03: Counts Collection
==============================
Collecting term co-occurrence data from the scientific literature.
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
"""
-Tutorial 02: Counts Analysis
+Tutorial 04: Counts Analysis
============================
Analyzing collected co-occurrence data.
