TrendyPy


TrendyPy is a small Python package for sequence clustering. It was initially developed to cluster time series by computing trend similarity distances with Dynamic Time Warping (DTW).
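For readers new to Dynamic Time Warping, the following is a minimal sketch of the classic quadratic-time algorithm in plain Python. It is an illustration only, not TrendyPy's implementation (the package relies on fastdtw), and the function name `dtw_distance` is hypothetical.

```python
def dtw_distance(x, y):
    """Classic dynamic time warping distance between two numeric sequences.

    Illustrative sketch only; not the implementation TrendyPy uses.
    """
    n, m = len(x), len(y)
    inf = float("inf")
    # cost[i][j] holds the minimal cumulative cost of aligning x[:i] with y[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(x[i - 1] - y[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = step + min(
                cost[i - 1][j],      # x advances, y repeats
                cost[i][j - 1],      # y advances, x repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


increasing = [1, 2, 3, 4, 5]
noisy_increasing = [1, 2.1, 2.9, 4.4, 5.1]
decreasing = [6.2, 5, 4, 3, 2]

# Similar trends end up closer than opposite trends.
print(dtw_distance(increasing, noisy_increasing) < dtw_distance(increasing, decreasing))
```

Because DTW allows elements to be repeated during alignment, sequences of different lengths can be compared directly.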

Installation

You can install TrendyPy with pip.

pip install trendypy

TrendyPy depends on pandas, NumPy and fastdtw, and works with Python 3.7+.

Quickstart

Trendy has a scikit-learn-like API for easy integration into existing programs. The quick example below shows how it clusters increasing and decreasing trends.

>>> from trendypy.trendy import Trendy
>>> a = [1, 2, 3, 4, 5] # increasing trend
>>> b = [1, 2.1, 2.9, 4.4, 5.1] # increasing trend
>>> c = [6.2, 5, 4, 3, 2] # decreasing trend
>>> d = [7, 6, 5, 4, 3, 2, 1] # decreasing trend
>>> trendy = Trendy(n_clusters=2)
>>> trendy.fit([a, b, c, d])
>>> print(trendy.labels_)
[0, 0, 1, 1]
>>> trendy.predict([[0.9, 2, 3.1, 4]]) # another increasing trend
[0]

It can also cluster strings using string similarity metrics.

>>> from trendypy.trendy import Trendy
>>> from trendypy.algos import levenshtein_distance
>>> company_names = [
...     'apple inc',
...     'Apple Inc.',
...     'Microsoft Corporation',
...     'Microsft Corp.']
>>> trendy = Trendy(n_clusters=2, algorithm=levenshtein_distance)
>>> trendy.fit(company_names)
>>> print(trendy.labels_)
[0, 0, 1, 1]
>>> trendy.predict(['Apple'])
[0]
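To show what a string similarity metric of this kind computes, here is a minimal standalone sketch of Levenshtein (edit) distance. It is not the code behind `trendypy.algos.levenshtein_distance`; the function name `levenshtein` is hypothetical and the implementation is the textbook two-row dynamic program.

```python
def levenshtein(s, t):
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn s into t. Illustrative sketch only."""
    # Keep the shorter string as t so each row stays as small as possible.
    if len(s) < len(t):
        s, t = t, s
    previous = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        current = [i]
        for j, ct in enumerate(t, start=1):
            insert = current[j - 1] + 1
            delete = previous[j] + 1
            substitute = previous[j - 1] + (cs != ct)
            current.append(min(insert, delete, substitute))
        previous = current
    return previous[-1]


# Spelling variants of one company are closer to each other than to another company.
print(levenshtein('apple inc', 'Apple Inc.') < levenshtein('apple inc', 'Microsoft Corporation'))
```

The comparison is case sensitive, so `'apple inc'` and `'Apple Inc.'` still differ by a few edits.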

Refer to the extensive demo to see TrendyPy clustering stock trends and images and to learn how to define your own metric, or check the API Reference for details.
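As a rough idea of what a user-defined metric could look like, the sketch below defines a function that takes two sequences and returns a non-negative number. The two-argument callable shape is an assumption inferred from the `algorithm=levenshtein_distance` usage above, and `slope_distance` is a hypothetical name that is not part of TrendyPy; the demo and API Reference remain the authoritative description of the expected interface.

```python
def slope_distance(x, y):
    """Hypothetical custom metric: absolute difference between the
    average step sizes of two numeric sequences (each of length >= 2)."""
    def mean_step(seq):
        # Average change per step from the first to the last element.
        return (seq[-1] - seq[0]) / (len(seq) - 1)

    return abs(mean_step(x) - mean_step(y))


a = [1, 2, 3, 4, 5]           # increasing trend
b = [1, 2.1, 2.9, 4.4, 5.1]   # increasing trend
d = [7, 6, 5, 4, 3, 2, 1]     # decreasing trend

# Sequences with the same direction are closer than opposite ones.
print(slope_distance(a, b) < slope_distance(a, d))

# Assumption, not verified here: such a callable might be passed as
# Trendy(n_clusters=2, algorithm=slope_distance).
```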

Post

The idea originated from the post Trend Clustering.