Commit 9cad744

Merge pull request #12 from jmenglund/develop
Develop
2 parents cfdd46e + 9243ce8 commit 9cad744

9 files changed: +188 -74 lines changed

.travis.yml

Lines changed: 12 additions & 3 deletions
@@ -10,10 +10,19 @@ branches:
   only:
     - master
 
+install:
+  - pip install .
+  - pip install pycodestyle
+  - pip install pytest
+  - pip install coverage
+  - pip install codecov
+  - pip install dendropy
+  - pip install biopython
+
 script:
-  - py.test pandascharm.py --pep8
-  - coverage run -m py.test
-  - coverage report --include pandascharm.py -m
+  - pycodestyle pandascharm.py test_pandascharm.py setup.py
+  - coverage run -m pytest test_pandascharm.py
+  - coverage report -m pandascharm.py
 
 after_success:
   - codecov

CHANGELOG.rst

Lines changed: 23 additions & 9 deletions
@@ -1,6 +1,19 @@
 Changelog
 =========
 
+0.2.0
+-----
+
+* Added functions ``from_dict`` and ``to_dict`` for casting from and to dictionaries.
+* Updated commands for Travis-CI.
+* Updates to ``README.rst``.
+* Updates to ``release-checklist.rst``.
+
+Release date: 2019-02-23
+
+`View commits <https://github.com/jmenglund/pandas-charm/compare/v0.1.3...v0.2.0>`_
+
+
 0.1.3
 -----
 
@@ -11,24 +24,25 @@ Changelog
 * Added Travis-CI testing for Python 3.6.
 * List of included categories are now ignored in matrix conversions involving
   categorical data.
-* ``pytest-cov`` removed from *requirements.txt*.
-* Updates to *setup.py*.
-* Updates to *README.rst*.
-* Updates to *release-checklist.rst*.
+* ``pytest-cov`` removed from ``requirements.txt``.
+* Updates to ``setup.py``.
+* Updates to ``README.rst``.
+* Updates to ``release-checklist.rst``.
 
 Release date: 2017-08-25
 
 `View commits <https://github.com/jmenglund/pandas-charm/compare/v0.1.2...v0.1.3>`_
 
+
 0.1.2
 -----
 
 * Added Python versions for Travis-CI (3.3, 3.5)
-* Added ``pep8`` check to Travis-CI
-* Updates to *README.rst*
+* Added PEP8 check to Travis-CI
+* Updates to ``README.rst``
   - Fixed issue with one example not working (``pc.to_charmatrix()``)
   - Updated text in various places
-* Updates to *release-checklist.rst*
+* Updates to ``release-checklist.rst``
 
 Release date: 2016-08-08
 
@@ -39,8 +53,8 @@ Release date: 2016-08-08
 -----
 
 * Simplified builds with Travis-CI.
-* DOI badge added to the top of *README.rst*.
-* Information on how to cite ``pandas-charm`` added to *README.rst*.
+* DOI badge added to the top of ``README.rst``.
+* Information on how to cite pandas-charm added to ``README.rst``.
 
 Release date: 2016-07-05
 

MANIFEST.in

Lines changed: 1 addition & 0 deletions
@@ -1,2 +1,3 @@
 include *.rst
+include CHANGELOG.md
 include LICENSE.txt

README.rst

Lines changed: 91 additions & 39 deletions
@@ -3,17 +3,16 @@ pandas-charm
 
 |Build-Status| |Coverage-Status| |PyPI-Status| |License| |DOI-URI|
 
-``pandas-charm`` is a small Python package for getting character
+pandas-charm is a small Python package for getting character
 matrices (alignments) into and out of `pandas <http://pandas.pydata.org>`_.
-Its purpose is to make pandas interoperable with other scientific
-packages that can be used for dealing with character matrices, like for example
-`BioPython <http://biopython.org>`_ and `Dendropy <http://dendropy.org>`_.
+Use this library to make pandas interoperable with
+`BioPython <http://biopython.org>`_ and `DendroPy <http://dendropy.org>`_.
 
-With ``pandas-charm``, it is currently possible to convert between the
-following objects:
+Convert between the following objects:
 
 * BioPython MultipleSeqAlignment <-> pandas DataFrame
 * DendroPy CharacterMatrix <-> pandas DataFrame
+* Python dictionary <-> pandas DataFrame
 
 The code has been tested with Python 2.7, 3.5 and 3.6.
 
@@ -29,14 +28,14 @@ Source repository: `<https://github.com/jmenglund/pandas-charm>`_
 Installation
 ------------
 
-For most users, the easiest way is probably to install the latest version
-hosted on `PyPI <https://pypi.python.org/>`_:
+For most users, the easiest way is probably to install the latest version
+hosted on `PyPI <https://pypi.org/>`_:
 
 .. code-block::
 
     $ pip install pandas-charm
 
-The project is hosted at https://github.com/jmenglund/pandas-charm and
+The project is hosted at https://github.com/jmenglund/pandas-charm and
 can also be installed using git:
 
 .. code-block::
@@ -46,33 +45,45 @@ can also be installed using git:
     $ python setup.py install
 
 
-You may consider installing ``pandas-charm`` and its required Python packages
-within a virtual environment in order to avoid cluttering your system's
-Python path. See for example the environment management system
-`conda <http://conda.pydata.org>`_ or the package
+You may consider installing pandas-charm and its required Python packages
+within a virtual environment in order to avoid cluttering your system's
+Python path. See for example the environment management system
+`conda <http://conda.pydata.org>`_ or the package
 `virtualenv <https://virtualenv.pypa.io/en/latest/>`_.
 
 
-Running tests
--------------
+Running the tests
+-----------------
 
-Testing is carried out with `pytest <http://pytest.org>`_. The following
-example shows how you can run the test suite and generate a coverage report:
+Testing is carried out with `pytest <https://docs.pytest.org/>`_:
 
 .. code-block::
 
-    $ pip install pytest pytest-pep8 dendropy biopython
-    $ py.test -v --pep8
-    $ coverage run -m py.test
-    $ coverage report --include pandascharm.py
+    $ pytest -v test_pandascharm.py
+
+Test coverage can be calculated with `Coverage.py
+<https://coverage.readthedocs.io/>`_ using the following commands:
+
+.. code-block::
+
+    $ coverage run -m pytest
+    $ coverage report -m pandascharm.py
+
+The code follow style conventions in `PEP8
+<https://www.python.org/dev/peps/pep-0008/>`_, which can be checked
+with `pycodestyle <http://pycodestyle.pycqa.org>`_:
+
+.. code-block::
+
+    $ pycodestyle pandascharm.py test_pandascharm.py setup.py
 
 
 Usage
 -----
 
-Below are a few examples on how to use pandas-charm. The examples are
-written with Python 3 code, but ``pandas-charm`` should work also with
-Python 2.7. You need to install BioPython and/or DendroPy manually
+The following examples show how to use pandas-charm. The examples are
+written with Python 3 code, but pandas-charm should work also with
+Python 2.7+. You need to install BioPython and/or DendroPy manually
 before you start:
 
 .. code-block::
@@ -95,9 +106,9 @@ DendroPy CharacterMatrix to pandas DataFrame
     t1 TCCAA
     t2 TGCAA
     t3 TG-AA
-
-    >>> matrix = dendropy.DnaCharacterMatrix.get_from_string(
-    ...     dna_string, schema='phylip')
+
+    >>> matrix = dendropy.DnaCharacterMatrix.get(
+    ...     data=dna_string, schema='phylip')
     >>> df = pc.from_charmatrix(matrix)
     >>> df
     t1 t2 t3
@@ -107,8 +118,8 @@ DendroPy CharacterMatrix to pandas DataFrame
     3 A A A
     4 A A A
 
-By default, characters are stored as rows and sequences as columns
-in the DataFrame. If you want rows to hold sequences, just transpose
+By default, characters are stored as rows and sequences as columns
+in the DataFrame. If you want rows to hold sequences, just transpose
 the matrix in pandas:
 
 .. code-block:: pycon
@@ -139,7 +150,7 @@ pandas DataFrame to Dendropy CharacterMatrix
     2 C C -
     3 A A A
     4 A A A
-
+
     >>> matrix = pc.to_charmatrix(df, data_type='dna')
     >>> print(matrix.as_string('phylip'))
     3 5
@@ -194,35 +205,70 @@ pandas DataFrame to BioPython MultipleSeqAlignment
     2 C C -
     3 A A A
     4 A A A
-
+
     >>> alignment = pc.to_bioalignment(df, alphabet='generic_dna')
     >>> print(alignment)
     SingleLetterAlphabet() alignment with 3 rows and 5 columns
     TCCAA t1
     TGCAA t2
     TG-AA t3
-
+
+
+Python dictionary to pandas DataFrame
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. code-block:: pycon
+
+    >>> import pandas as pd
+    >>> import pandascharm as pc
+    >>> d = {
+    ...     't1': 'TCCAA',
+    ...     't2': 'TGCAA',
+    ...     't3': 'TG-AA'
+    ... }
+    >>> df = pc.from_dict(d)
+    >>> df
+    t1 t2 t3
+    0 T T T
+    1 C G G
+    2 C C -
+    3 A A A
+    4 A A A
+
+
+pandas DataFrame to Python dictionary
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. code-block:: pycon
+
+    >>> import pandas as pd
+    >>> import pandascharm as pc
+    >>> df = pd.DataFrame({
+    ...     't1': ['T', 'C', 'C', 'A', 'A'],
+    ...     't2': ['T', 'G', 'C', 'A', 'A'],
+    ...     't3': ['T', 'G', '-', 'A', 'A']})
+    >>> pc.to_dict(df)
+    {'t1': 'TCCAA', 't2': 'TGCAA', 't3': 'TG-AA'}
 
 
 The name
 --------
 
-``pandas-charm`` got its name from the pandas library plus an acronym for
+pandas-charm got its name from the pandas library plus an acronym for
 CHARacter Matrix.
 
 
 License
 -------
 
-``pandas-charm`` is distributed under the
-`MIT license <https://opensource.org/licenses/MIT>`_.
+pandas-charm is distributed under the `MIT license <https://opensource.org/licenses/MIT>`_.
 
 
 Citing
 ------
 
-If you use results produced with this package in a scientific
-publication, please just mention the package name in the text and
+If you use results produced with this package in a scientific
+publication, please just mention the package name in the text and
 cite the Zenodo DOI of this project:
 
 |DOI-URI|
@@ -236,13 +282,19 @@ Author
 
 Markus Englund, `orcid.org/0000-0003-1688-7112 <http://orcid.org/0000-0003-1688-7112>`_
 
+
 .. |Build-Status| image:: https://travis-ci.org/jmenglund/pandas-charm.svg?branch=master
    :target: https://travis-ci.org/jmenglund/pandas-charm
+   :alt: Build status
 .. |Coverage-Status| image:: https://codecov.io/gh/jmenglund/pandas-charm/branch/master/graph/badge.svg
    :target: https://codecov.io/gh/jmenglund/pandas-charm
+   :alt: Coverage status
 .. |PyPI-Status| image:: https://img.shields.io/pypi/v/pandas-charm.svg
    :target: https://pypi.python.org/pypi/pandas-charm
+   :alt: PyPI status
 .. |License| image:: https://img.shields.io/pypi/l/pandas-charm.svg
    :target: https://raw.githubusercontent.com/jmenglund/pandas-charm/master/LICENSE.txt
-.. |DOI-URI| image:: https://zenodo.org/badge/DOI/10.5281/zenodo.848750.svg
-   :target: https://doi.org/10.5281/zenodo.848750
+   :alt: License
+.. |DOI-URI| image:: https://zenodo.org/badge/62513333.svg
+   :target: https://zenodo.org/badge/latestdoi/62513333
+   :alt: DOI

pandascharm.py

Lines changed: 14 additions & 1 deletion
@@ -6,7 +6,7 @@
 
 __author__ = 'Markus Englund'
 __license__ = 'MIT'
-__version__ = '0.1.3'
+__version__ = '0.2.0'
 
 
 def frame_as_categorical(frame, include_categories=None):
@@ -83,6 +83,15 @@ def from_charmatrix(charmatrix, categorical=True):
     return new_frame
 
 
+def from_dict(d, categorical=True):
+    d_seq_list = {k: list(v) for (k, v) in d.items()}
+    frame = pandas.DataFrame(d_seq_list)
+    if categorical:
+        return frame_as_categorical(frame)
+    else:
+        return frame
+
+
 def to_bioalignment(frame, alphabet='generic_alphabet'):
     """
     Convert a pandas DataFrame to a BioPython MultipleSeqAlignment.
@@ -150,3 +159,7 @@ def to_charmatrix(frame, data_type):
     taxon_names = list(frame.columns)
     charmatrix.taxon_namespace.sort(key=lambda x: taxon_names.index(x.label))
     return charmatrix
+
+
+def to_dict(frame, into=dict):
+    return frame.apply(lambda x: ''.join(x)).to_dict(into=into)