- Retrieves archived tweets CDX data from the Wayback
- Machine, performs necessary parsing, and saves the data in
- HTML (for easy viewing of the tweets using the iframe
- tag), CSV, and JSON formats.
+ Retrieves archived tweets CDX data from the Wayback Machine, performs necessary parsing, and saves the data in HTML, for easy viewing of the tweets using the iframe tags, CSV, and JSON formats.
[](https://pypi.org/project/waybacktweets)[](https://doi.org/10.5281/zenodo.12528447)[](https://waybacktweets.streamlit.app)[](https://colab.research.google.com/drive/1tnaM3rMWpoSHBZ4P_6iHFPjraWRQ3OGe?usp=sharing)
- Retrieves archived tweets CDX data from the Wayback Machine, performs necessary parsing (see [Field Options](https://claromes.github.io/waybacktweets/field_options.html)), and saves the data in HTML, for easy viewing of the tweets using the iframe tags, CSV, and JSON formats.
+ Retrieves archived tweets CDX data from the Wayback Machine, performs necessary parsing (see [Field Options](https://waybacktweets.claromes.com/field_options)), and saves the data in HTML, for easy viewing of the tweets using the iframe tags, CSV, and JSON formats.
## Installation
+ It is compatible with Python versions 3.10 and above. [See installation options](https://waybacktweets.claromes.com/installation).
+
```shell
- pip install waybacktweets
+ pipx install waybacktweets
```
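The README states that the package targets Python 3.10 and above. A minimal sketch of checking that requirement before installing is shown here; the helper name `is_supported` and the constant `MIN_PYTHON` are hypothetical and are not part of `waybacktweets`.

```python
import sys

# Minimum interpreter version, taken from the README ("3.10 and above").
MIN_PYTHON = (3, 10)


def is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True when the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    if is_supported():
        print("Python version is compatible.")
    else:
        current = ".".join(str(part) for part in sys.version_info[:3])
        print(f"Python {current} is older than 3.10; upgrade before installing.")
```

Comparing tuples rather than version strings avoids the classic mistake of treating "3.9" as newer than "3.10".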
- ## Quickstart
-
- ### Using Wayback Tweets as a standalone command line tool
-
- waybacktweets [OPTIONS] USERNAME
+ ## CLI

```shell
- waybacktweets --from 20150101 --to 20191231 --limit 250 jack
+ Usage:
+   waybacktweets [OPTIONS] USERNAME
+   USERNAME: The Twitter username without @
+
+ Options:
+   -c, --collapse [urlkey|digest|timestamp:xx]
+                                   Collapse results based on a field, or a
+                                   substring of a field. XX in the timestamp
+                                   value ranges from 1 to 14, comparing the
+                                   first XX digits of the timestamp field. It
+                                   is recommended to use from 4 onwards, to
+                                   compare at least by years.
+   -f, --from DATE                 Filtering by date range from this date.
+                                   Format: YYYYmmdd
+   -t, --to DATE                   Filtering by date range up to this date.
+                                   Format: YYYYmmdd
+   -l, --limit INTEGER             Query result limits.
+   -rk, --resumption_key TEXT      Allows for a simple way to scroll through
+                                   the results. Key to continue the query from
+                                   the end of the previous query.
+   -mt, --matchtype [exact|prefix|host|domain]
+                                   Results matching a certain prefix, a certain
+                                   host or all subdomains.
+   -v, --verbose                   Shows the log.
+   --version                       Show the version and exit.
+   -h, --help                      Show this message and exit.
+
+ Examples:
+   waybacktweets jack
+   waybacktweets --from 20200305 --to 20231231 --limit 300 --verbose jack
+
+ Repository:
+   https://github.com/claromes/waybacktweets
+
+ Documentation:
+   https://waybacktweets.claromes.com
```
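The `--collapse timestamp:XX` option compares the first XX digits of each 14-digit timestamp. The sketch below illustrates that idea in plain Python; it is not the package's implementation. The function name `collapse_by_timestamp` and the sample timestamps are hypothetical, and keeping only the first capture of each run of adjacent matches is an assumption modeled on how Wayback CDX collapsing is generally described.

```python
from typing import Iterable, List


def collapse_by_timestamp(timestamps: Iterable[str], digits: int) -> List[str]:
    """Keep the first capture in each run of adjacent timestamps
    that share the same first `digits` characters."""
    if not 1 <= digits <= 14:
        raise ValueError("digits must be between 1 and 14")
    kept: List[str] = []
    previous_prefix = None
    for ts in timestamps:
        prefix = ts[:digits]
        if prefix != previous_prefix:
            kept.append(ts)
            previous_prefix = prefix
    return kept


# Made-up timestamps in YYYYmmddHHMMSS form, sorted as a CDX listing would be.
captures = [
    "20150312104501",
    "20150719221033",
    "20160101000000",
    "20160102120000",
    "20190630235959",
]

by_year = collapse_by_timestamp(captures, 4)  # one capture per year
by_day = collapse_by_timestamp(captures, 8)   # one capture per day
print(by_year)
print(by_day)
```

With 4 digits the comparison is by year, which is why the help text recommends 4 as the lowest useful value; 6 compares by month and 8 by day.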
- ### Using Wayback Tweets as a Web App
-
- [Open the application](https://waybacktweets.streamlit.app), a prototype written in Python with the Streamlit framework and hosted on Streamlit Cloud.
+ ## Module

- ### Using Wayback Tweets as a Python Module
+ [](https://colab.research.google.com/drive/1tnaM3rMWpoSHBZ4P_6iHFPjraWRQ3OGe?usp=sharing)

```python
from waybacktweets import WaybackTweets, TweetsParser, TweetsExporter
```
A prototype written in Python with the Streamlit framework and hosted on Streamlit Cloud.
+
+ Important: Starting from version 1.0, the web app will no longer receive all updates from the official package. To access all features, prefer using the package from PyPI.