Commit 2428e7d

Merge pull request #26 from claromes/1.0a7
v1.0a7
2 parents c367525 + 7aa3972 commit 2428e7d

File tree

12 files changed: +100 -43 lines changed

CITATION.cff

Lines changed: 1 addition & 1 deletion
@@ -13,7 +13,7 @@ authors:
 
 identifiers:
 - type: doi
-value: 10.5281/zenodo.12528448
+value: 10.5281/zenodo.12528447
 description: The concept DOI of the work.
 - type: url
 value: "https://pypi.org/project/waybacktweets/"

README.md

Lines changed: 4 additions & 3 deletions
@@ -1,8 +1,9 @@
 # Wayback Tweets
 
-[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.12528448.svg)](https://doi.org/10.5281/zenodo.12528448) [![PyPI](https://img.shields.io/pypi/v/waybacktweets)](https://pypi.org/project/waybacktweets) [![docs](https://github.com/claromes/waybacktweets/actions/workflows/docs.yml/badge.svg)](https://github.com/claromes/waybacktweets/actions/workflows/docs.yml) [![Streamlit App](https://static.streamlit.io/badges/streamlit_badge_black_white.svg)](https://waybacktweets.streamlit.app)
+[![PyPI](https://img.shields.io/pypi/v/waybacktweets)](https://pypi.org/project/waybacktweets) [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.12528447.svg)](https://doi.org/10.5281/zenodo.12528447) [![Streamlit App](https://static.streamlit.io/badges/streamlit_badge_black_white.svg)](https://waybacktweets.streamlit.app) [![Open In Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1zRqi6uTMiGi5z8GQ-PC0tbpCJWULCqMO?usp=sharing)
 
-Retrieves archived tweets CDX data from the Wayback Machine, performs necessary parsing (see [Field Options](https://claromes.github.io/waybacktweets/field_options.html)), and saves the data in HTML (for easy viewing of the tweets using the `iframe` tag), CSV, and JSON formats.
+
+Retrieves archived tweets CDX data from the Wayback Machine, performs necessary parsing (see [Field Options](https://claromes.github.io/waybacktweets/field_options.html)), and saves the data in HTML, for easy viewing of the tweets using the iframe tags, CSV, and JSON formats.
 
 ## Installation
 
@@ -57,7 +58,7 @@ if archived_tweets:
 ## Acknowledgements
 
 - Tristan Lee (Bellingcat's Data Scientist) for the idea of the application.
-- Jessica Smith (Snowflake's Marketing Specialist) and Streamlit/Snowflake teams for the additional server resources on Streamlit Cloud.
+- Jessica Smith (Snowflake's Community Growth Specialist) and Streamlit/Snowflake team for the additional server resources on Streamlit Cloud.
 - OSINT Community for recommending the application.
 
 > [!NOTE]
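
The README description in this diff says the tool retrieves CDX data for archived tweets from the Wayback Machine. As a rough illustration of what such a lookup involves, the sketch below builds a query URL for the public Wayback CDX Server endpoint and pairs the header row of a JSON response with its data rows. The helper names `build_cdx_url` and `rows_to_dicts` and the `twitter.com/<username>/status/*` pattern are assumptions made for this example; the commit does not show how `waybacktweets` constructs its requests. No network call is made here.

```python
from __future__ import annotations

from urllib.parse import urlencode

# Public Wayback Machine CDX Server endpoint.
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_url(username: str, limit: int | None = None) -> str:
    """Hypothetical helper: build a CDX query URL for a user's archived tweets."""
    params = {
        # Assumed URL pattern; the trailing * asks for every capture under the prefix.
        "url": f"twitter.com/{username}/status/*",
        "output": "json",
    }
    if limit is not None:
        params["limit"] = str(limit)
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


def rows_to_dicts(rows: list[list[str]]) -> list[dict[str, str]]:
    """With output=json the first row holds field names; pair it with each data row."""
    if not rows:
        return []
    header, *data = rows
    return [dict(zip(header, row)) for row in data]
```

Fetching the URL and feeding the decoded JSON to `rows_to_dicts` would give one dictionary per capture, which is a convenient shape for the CSV and JSON exports the README mentions.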

app/app.py

Lines changed: 11 additions & 17 deletions
@@ -34,7 +34,7 @@
 layout="centered",
 menu_items={
 "About": f"""
-[![GitHub release (latest by date including pre-releases)](https://img.shields.io/github/v/release/claromes/waybacktweets?include_prereleases)](https://github.com/claromes/waybacktweets/releases) [![License](https://img.shields.io/github/license/claromes/waybacktweets)](https://github.com/claromes/waybacktweets/blob/main/LICENSE.md) [![Star](https://img.shields.io/github/stars/claromes/waybacktweets?style=social)](https://github.com/claromes/waybacktweets)
+[![License](https://img.shields.io/github/license/claromes/waybacktweets)](https://github.com/claromes/waybacktweets/blob/main/LICENSE.md)
 
 The application is a prototype hosted on Streamlit Cloud, serving as an alternative to the command line tool.
 
@@ -168,16 +168,12 @@ def scroll_page():
 
 # ------ User Interface Settings ------ #
 
-st.info(
-"🥳 [**Pre-release 1.0x: Python module, CLI, and new Streamlit app**](https://github.com/claromes/waybacktweets/releases)" # noqa: E501
-)
-
 st.image(TITLE, use_column_width="never")
 st.caption(
-"[![GitHub release (latest by date including pre-releases)](https://img.shields.io/github/v/release/claromes/waybacktweets?include_prereleases)](https://github.com/claromes/waybacktweets/releases) [![Star](https://img.shields.io/github/stars/claromes/waybacktweets?style=social)](https://github.com/claromes/waybacktweets)" # noqa: E501
+"[![GitHub release (latest by date including pre-releases)](https://img.shields.io/github/v/release/claromes/waybacktweets?include_prereleases)](https://github.com/claromes/waybacktweets/releases) [![sponsor](https://img.shields.io/badge/Donate-via%20Sponsors-ff69b4.svg?logo=github)](https://github.com/sponsors/claromes)" # noqa: E501
 )
 st.write(
-"Retrieves archived tweets CDX data in HTML (for easy viewing of the tweets using the `iframe` tag), CSV, and JSON formats." # noqa: E501
+"Retrieves archived tweets CDX data in HTML (for easy viewing of the tweets using the iframe tag), CSV, and JSON formats." # noqa: E501
 )
 
 st.write(
@@ -291,15 +287,15 @@ def scroll_page():
 
 # -- Rendering -- #
 
-if csv_data and json_data and html_content:
-st.session_state.count = len(df)
-st.write(f"**{st.session_state.count} URLs have been captured**")
+st.session_state.count = len(df)
+st.write(f"**{st.session_state.count} URLs have been captured**")
 
-# -- HTML -- #
+tab1, tab2, tab3 = st.tabs(["HTML", "CSV", "JSON"])
 
-st.header("HTML", divider="gray", anchor=False)
+# -- HTML -- #
+with tab1:
 st.write(
-f"Visualize tweets more efficiently through `iframes`. Download the @{st.session_state.current_username}'s archived tweets in HTML." # noqa: E501
+f"Visualize tweets more efficiently through iframe tags. Download the @{st.session_state.current_username}'s archived tweets in HTML." # noqa: E501
 )
 
 col5, col6 = st.columns([1, 18])
@@ -317,8 +313,7 @@ def scroll_page():
 )
 
 # -- CSV -- #
-
-st.header("CSV", divider="gray", anchor=False)
+with tab2:
 st.write(
 "Check the data returned in the dataframe below and download the file."
 )
@@ -340,8 +335,7 @@ def scroll_page():
 st.dataframe(df, use_container_width=True)
 
 # -- JSON -- #
-
-st.header("JSON", divider="gray", anchor=False)
+with tab3:
 st.write(
 "Check the data returned in JSON format below and download the file."
 )

docs/conf.py

Lines changed: 1 addition & 0 deletions
@@ -20,6 +20,7 @@
 "sphinx_new_tab_link",
 "sphinx_click.ext",
 "sphinx_autodoc_typehints",
+"sphinxcontrib.youtube",
 ]
 
 templates_path = ["_templates"]

docs/contribute.rst

Lines changed: 1 addition & 1 deletion
@@ -19,7 +19,7 @@ These are the prerequisites:
 - Python 3.10+
 - Poetry
 
-Install from the source, following the :ref:`installation` instructions.
+Install from the source, following the :ref:`installation_from_source` instructions.
 
 Brief explanation about the code under the Wayback Tweets directory:
 

docs/handson.rst

Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
+Hands-On Examples
+====================
+
+- **Notebook**
+
+This notebook demonstrates how to fetch, parse, and export archived tweets for a specific user using the ``waybacktweets`` library.
+
+.. image:: https://colab.research.google.com/assets/colab-badge.svg
+:target: https://colab.research.google.com/drive/1zRqi6uTMiGi5z8GQ-PC0tbpCJWULCqMO?usp=sharing
+:alt: Open In Collab
+
+.. raw:: html
+
+<br>
+<br>
+
+- **Video**
+
+Demonstration of how to use Wayback Tweets and other tools to retrieve tweets (in Spanish)
+
+.. youtube:: qy3wOnUxe6A
+:width: 100%

docs/index.rst

Lines changed: 5 additions & 3 deletions
@@ -9,10 +9,11 @@ Wayback Tweets
 
 Pre-release: |release|
 
-Retrieves archived tweets CDX data from the Wayback Machine, performs necessary parsing (see :ref:`field_options`), and saves the data in HTML (for easy viewing of the tweets using the ``iframe`` tag), CSV, and JSON formats.
+Retrieves archived tweets CDX data from the Wayback Machine, performs necessary parsing (see :ref:`field_options`), and saves the data in HTML, for easy viewing of the tweets using the iframe tags, CSV, and JSON formats.
 
-.. image:: https://zenodo.org/badge/DOI/10.5281/zenodo.12528448.svg
-:target: https://doi.org/10.5281/zenodo.12528448
+.. image:: https://img.shields.io/badge/Donate-via%20Sponsors-ff69b4.svg?logo=github
+:target: https://github.com/sponsors/claromes
+:alt: GitHub Sponsors
 
 .. note::
 Intensive queries can lead to rate limiting, resulting in a temporary ban of a few minutes from web.archive.org.
@@ -30,6 +31,7 @@ User Guide
 field_options
 outputs
 exceptions
+handson
 contribute
 todo
 

docs/installation.rst

Lines changed: 29 additions & 9 deletions
@@ -1,8 +1,7 @@
-.. _installation:
-
 Installation
 ================
 
+**It is compatible with Python versions 3.10 and above.**
 
 Using pip
 ------------
@@ -11,47 +10,68 @@ Using pip
 
 pip install waybacktweets
 
+Using Poetry
+------------
+
+.. code-block:: shell
+
+poetry add waybacktweets
+
+.. _installation_from_source:
+
 From source
 -------------
 
-Clone the repository:
+**Clone the repository:**
 
 .. code-block:: shell
 
 git clone git@github.com:claromes/waybacktweets.git
 
-Change directory:
+**Change directory:**
 
 .. code-block:: shell
 
 cd waybacktweets
 
-Install poetry, if you haven't already:
+**Install Poetry, if you haven't already:**
 
 .. code-block:: shell
 
 pip install poetry
 
 
-Install the dependencies:
+**Install the dependencies:**
 
 .. code-block:: shell
 
 poetry install
 
-Run the CLI:
+**Install the pre-commit:**
+
+.. code-block:: shell
+
+poetry run pre-commit install
+
+**Run the CLI:**
 
 .. code-block:: shell
 
 poetry run waybacktweets [SUBCOMMANDS]
 
-Run the Streamlit App:
+**Starts a new shell and activates the virtual environment:**
+
+.. code-block:: shell
+
+poetry shell
+
+**Run the Streamlit App:**
 
 .. code-block:: shell
 
 streamlit run app/app.py
 
-Build the docs:
+**Build the docs:**
 
 .. code-block:: shell
 

legacy_app/legacy_app.py

Lines changed: 2 additions & 6 deletions
@@ -14,11 +14,7 @@
 layout="centered",
 menu_items={
 "About": """
-## 🏛️ Wayback Tweets
-
-Tool that displays, via Wayback CDX Server API, multiple archived tweets on Wayback Machine to avoid opening each link manually. Users can apply filters based on specific years and view tweets that do not have the original URL available.
-
-This tool is a prototype, please feel free to send your [feedbacks](https://github.com/claromes/waybacktweets/issues). Created by [@claromes](https://claromes.com).
+This is the legacy application of [Wayback Tweets](https://waybacktweets.streamlit.app/).
 
 -------
 """, # noqa: E501
@@ -386,7 +382,7 @@ def next_page():
 
 # UI
 st.title(
-"Wayback Tweets [![Star](https://img.shields.io/github/stars/claromes/waybacktweets?style=social)](https://github.com/claromes/waybacktweets)", # noqa: E501
+"Wayback Tweets", # noqa: E501
 anchor=False,
 help="v0.4.3",
 )

poetry.lock

Lines changed: 21 additions & 1 deletion
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.
