TrendyPy is a small Python package for sequence clustering. It was initially developed to cluster time series by calculating a trend similarity distance with Dynamic Time Warping.
You can install TrendyPy with pip.
pip install trendypy
TrendyPy depends on pandas, NumPy, and fastdtw, and works with Python 3.7+.
Trendy has a scikit-learn-like API to allow easy integration into existing programs. Below is a quick example showing how it clusters increasing and decreasing trends.
>>> from trendypy.trendy import Trendy
>>> a = [1, 2, 3, 4, 5] # increasing trend
>>> b = [1, 2.1, 2.9, 4.4, 5.1] # increasing trend
>>> c = [6.2, 5, 4, 3, 2] # decreasing trend
>>> d = [7, 6, 5, 4, 3, 2, 1] # decreasing trend
>>> trendy = Trendy(n_clusters=2)
>>> trendy.fit([a, b, c, d])
>>> print(trendy.labels_)
[0, 0, 1, 1]
>>> trendy.predict([[0.9, 2, 3.1, 4]]) # another increasing trend
[0]
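To give a feel for why series of different lengths (such as c and d above) can still be compared, here is a minimal sketch of textbook Dynamic Time Warping in plain Python. It is an illustration only: the function name dtw_distance is hypothetical, the cost is a simple absolute difference, and it is not TrendyPy's own implementation, which relies on fastdtw and may preprocess the series differently.

```python
def dtw_distance(x, y):
    """Textbook O(len(x) * len(y)) dynamic time warping distance.

    Illustrative sketch only; not the code TrendyPy runs internally.
    """
    n, m = len(x), len(y)
    inf = float("inf")
    # cost[i][j] holds the minimal cumulative cost of aligning x[:i] with y[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(x[i - 1] - y[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = step + min(
                cost[i - 1][j],      # x advances, y repeats
                cost[i][j - 1],      # y advances, x repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


a = [1, 2, 3, 4, 5]           # increasing trend
c = [6.2, 5, 4, 3, 2]         # decreasing trend
d = [7, 6, 5, 4, 3, 2, 1]     # decreasing trend, different length

# The two decreasing series are closer to each other than to the increasing one.
print(dtw_distance(c, d) < dtw_distance(a, d))
```

Because the warping path may repeat elements of either series, the inputs do not need to have the same length, which is what makes this kind of distance convenient for clustering trends.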
It can also be utilized to cluster strings by using string similarity metrics.
>>> from trendypy.trendy import Trendy
>>> from trendypy.algos import levenshtein_distance
>>> company_names = [
... 'apple inc',
... 'Apple Inc.',
... 'Microsoft Corporation',
... 'Microsft Corp.']
>>> trendy = Trendy(n_clusters=2, algorithm=levenshtein_distance)
>>> trendy.fit(company_names)
>>> print(trendy.labels_)
[0, 0, 1, 1]
>>> trendy.predict(['Apple'])
[0]
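The string example works because the metric only needs to return a number for a pair of items. As a rough sketch of what such a metric computes, here is a textbook Levenshtein (edit) distance in plain Python. The name edit_distance is hypothetical, and this is not necessarily how trendypy.algos.levenshtein_distance is written; whether a function like this can be passed directly as algorithm= is an assumption based on the example above, so check the API Reference before relying on it.

```python
def edit_distance(s, t):
    """Textbook Levenshtein distance: minimum number of single-character
    insertions, deletions, and substitutions that turn s into t.

    Illustrative sketch only; not TrendyPy's own implementation.
    """
    # prev[j] is the distance between the processed prefix of s and t[:j].
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]  # turning s[:i] into the empty string takes i deletions
        for j, ct in enumerate(t, start=1):
            substitution = prev[j - 1] + (cs != ct)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, substitution))
        prev = curr
    return prev[-1]


# Two case changes plus one added period.
print(edit_distance('apple inc', 'Apple Inc.'))  # 3
```

Note that this distance is case-sensitive, so lowercasing names before clustering may be worth considering if case differences should not count.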
Refer to the extensive demo to see it used for clustering stock trends and images and to learn how to define your own metric, or check the API Reference for details.
The idea originated from the post Trend Clustering.