Updates OpenWPM to Firefox 100
Hotfix for the Dockerfile Thanks to @sal123-droid for the report
Updates OpenWPM to Firefox 98
- Reverted workaround of manually creating user.js (#968) Thanks @boolean5!
- Removed invalid resource type (#966)
Updates OpenWPM to Firefox 95
Updates OpenWPM to Firefox 90.0.2
- Moved Extension folder to top-level (#939)
- Replaced TSLint with ESLint (#940)
- Stopped AssertionErrors crashing production crawls (#945)
Updates OpenWPM to Firefox 89
This release didn't introduce any new functionality.
Things that happened in the last release:
- Enabled extension tests (#929)
- Fixed deault extension config (#927)
- Expanded profile tests (#924)
Updates OpenWPM to Firefox 88
This release reenables the support for stateful crawling
If you are unsure what this means please have a look at our documentation on ReadTheDocs
Other things that happened since last release:
- Restored Docker build (#871) - All of our releases are now available on Dockerhub again
- OpenWPM now monitors speculative connections (#872)
- We introduced sphinx and publish our documentation to RTD (#863, #894 and #900)
- Combined log_directory and log_file to log_path (#911)
- Fixed a bug in the socket code of the WebExtension
Thanks @shashigharti for the excellent bug report! (#912) - Fixed data loss issue reported by @bkrumnow (#902)
Firefox 86.0.1
This release is defined by a major push towards a more typed and easier to extend OpenWPM
Here are the highlights:
-
Refactored BrowserParams and ManagerParams into Dataclasses (#807)
By @Ankushduacodes
This PR replaced the manager_params and browserdict
s with objects allowing for property access and type annotations. This change makes it a lot harder to use invalid configuration options. -
Turned Commands into Objects inheriting from a BaseCommand (#750)
This change allows users to easily write their own commands without having to touch OpenWPM internals. It restructured a lot of internal code to bundle the definition of a command with its implementation -
Data Aggregator Rewrite (#753)
This PR rewrote the way we do data saving from the ground up to enable a more modular and scalable design. Users are now empowered to provide their own storage backend if need be and can easily pick and choose between provided implementations.
Additional changes:
- #844 Switched to Github Actions
- #854 Fix behavior of failure_limit (by @boolean5)
Thanks to all the external contributors that worked on this release. Besides the people mentioned above we also merged contributions from:
- #803 Removed unneccessary List unpacking
- #804 Updated stored commands in test_webdriver_utils.py
- #806 Module & Imports conformed to PEP8
- #811 Changed imports to avoid errors
- #815 Fixed nodejs version typo
- #822 Removed references and leftovers from #807
- #841 Updated webdriver syntax
- #817 Fixed
manual_test.py
- #844 Pyvirtualdisplay update
Firefox 83 release
There has been a lot happening in OpenWPM over the last three months.
Here are the highlights:
-
Introduced
seed_profile
(#735)
As part of our long-standing effort to restore stateful crawling support we now allow specifying aseed_profile
that gets loaded for each fresh start of a browser.
More documentation can be found here. -
Added new WebExtension instrument
dns_instrument
(#721)
This instrument was contributed by @turban1988 .
It allows to log the resolution of DNS requests. Unfortunately it is currently undocumented. Adding docs is tracked in #758. -
Moved
automation
toopenwpm
(#793)
This change might have the biggest impact on users as we changed the name of the top-level package. We apologize for the inconvenience caused but felt it was a good change overall as the new name is a lot more meaningful.
Internal changes:
- We replaced flake8 with black as our formatting tool #740
- @Metropass removed the built in extensions as they were horribly out of date #754
- @Ankushduacodes removed the
browser_settings
as they required a lot of code for very little benefit #775 - @Ankushduacodes made the
memory_watchdog
andprocess_watchdog
part of themanager_params
#785 #787
Thanks to all the external contributors that worked on this release. Besides the people mentioned above we also merged contributions from:
Firefox 80.0.0 release
There have been no new features added in this release. However there are two significant bugfixes worth highlighting:
- We hopefully fixed a bug when hashing content where the same file could have multiple hashes If you ran a big crawl and could repeat @birdsarah's analysis, we'd be grateful if you reported your results here
- Fixed longstanding bug when propagating exceptions from the BrowserManager to the TaskManager you should now be seeing the exception that happened in the BrowserManager in your logs
NOTE: Please be aware that this release contains a regression related to https://bugzilla.mozilla.org/show_bug.cgi?id=1656405 and https://bugzilla.mozilla.org/show_bug.cgi?id=1599160. This means some requests with cached responses might not show up as requests or responses in your instrumentation. We assume this will be fixed in FF81.
Firefox 78.0.1 scheduled release. This release contains some minor bug fixes and one new feature: Arbitrary JS Instrumentation.
New features:
-
Arbitrary JS Instrumentation allows users to specify, python side the set APIs they would like to instrument in their crawl. The default remains the set of fingerprinting apis, now called
collection_fingerprinting
. This has meant a number of API changes including the renaming of the browser_paramjs_instrument_modules
tojs_instrument_settings
. Asjs_instrument_modules
was not actually configurable previously, we do not anticipate too much disruption to users. Details of how to configure the newjs_instrument_settings
are in the Instrumentation and Data Access section of the README.
Minor fixes:
- Asserting that unpickled exception is an exception PR 705
- Minor fixes PR 703
- Better crawling experience PR 696
No OpenWPM release was made with Firefox 78.0.
This release is a long overdue release of OpenWPM, and contains too many changes to list here. Instead, we highlight the major architectural since the previous release. The instrumentation has been completely rewritten as part of this release to the new WebExtensions framework. Older versions of OpenWPM should not be used.
Changes:
- Migrate instrumentation from the addon-sdk framework to WebExtensions.
- Migrate to unbranded builds of Firefox, and off of the ESR channel to the Release channel
- Add support for MacOS development
- Add an S3Aggregator that saves data in Parquet format on S3
- Use conda for dependency management
- Disable stateful crawling due to intermittent loss of profiles and geckodriver incompatibilities
- Refactor extension instumentation to live in a separate module
- Re-write logger and add support for logging to sentry
- Add a crawler.py crawl script that can be used for cloud deployments like the type documented in https://github.com/openwpm/OpenWPM-crawler
- Add support for Firefox's native headless mode alongside XVFB
- Add Dockerfile and automatically deploy builds to dockerhub
- Add a Navigation instrument that records navigation events
- Drop support for Python 2
- Remove support for Flash
- Numerous stability and data saving improvements (particularly for cloud crawls / the S3Aggregator)
- Numerous bugfixes + improved testing
A checkpoint release for the final version of OpenWPM to support Firefox 52 and the addon-sdk framework. We recommend against using this release as Firefox 52ESR is no longer receiving security updates.
Changes:
- The
automation
library can now be used with Python 3.4 or later, as well as Python 2.7. - Bump to Firefox 52 ESR, Selenium 3.4.0+, and geckodriver 0.15.0.
- geckodriver is required for Selenium 3+.
install.sh
will download and install it. - geckodriver 0.16.0+ does not support Firefox 52 or lower, so we are stuck with 0.15.0 (and any bugs it may have) until the next ESR release.
- These versions of geckodriver and Selenium require Firefox 48+.
- geckodriver is required for Selenium 3+.
- MITMProxy support has been removed. Use
http_instrument
instead. - Bundled Firefox privacy extensions have been updated.
- AdBlock Plus support has been removed.
- uBlock Origin and Disconnect added.
- Ghostery has been updated.
- Extensions built using the WebExtensions API are now supported. Our extension still uses the add-on sdk.
- Experimental support for saving Parquet files on S3
- Numerous bug fixes
A long overdue version bump to checkpoint the final version to support Selenium 2 + FF 45. Note we recommend against using the release as Firefox 45ESR is no longer receiving security patches.
Changes:
- Add extension-based HTTP instrumentation, including POST body processing
- Deprecate proxy-based HTTP instrumentation
- Save stacktrace of HTTP requests
- Prevent Selenium 2 from self identifying in the DOM
- Add support for blocking commands
- Improve exception handling in child processes
- Refactor of socket interface in extension
- Improvements to manual testing code
- Add a logging module to the extension, logs to central log file
- Instrument
document.cookie
- A number of improvements to the
instrumentObject
instrumentation interface in extension - Make
install.sh
scriptable
Changes:
- Bugfixes to extension instrumentation where records would be dropped when the extension was under heavy load and fail to re-enable until the browser was restarted.
- Bugfix to extension / socket interface
- Add
run_custom_function
command - Using alternative serialization/parallelization with
dill
andmultiprocess
- Better documentation
- Bugfixes to install script
- Add
save_screenshot
anddump_page_source
commands - Add Audio API instrumentation
- Bugfix to
browse
command - Bugfix to extension instrumentation injection to avoid Security Errors
Changes:
- Bugfix to browse command. Now supports sleeping after get.
Critical:
- Bugfix in LevelDBAggregator preventing data loss
Changes:
- Bump to Firefox 45 & Selenium 2.53.0
- Update certificate stored
- Added sleep argument to
get
command - Added install script for development dependencies
- Improved error handling in TaskManager and Proxy
- Version bumps and bugfixes in HTTPS Everywhere, Ghostery, and ABP
- Tests added!
- Numerous bugfixes and improvements in Javascript Instrumentation
Changes:
- Cleanup of Firefox prefs to make browsers faster and reduce phoning home
- Use LevelDB for javascript file storage
- Improved HTTP Cookie Parsing
- Several bugfixes to extension instrumentation
- Improved profile handling during shutdown and crashes
- Improved handling of child Exceptions
- Inital platform tests
- Improvements to javascript instrumentation
Changes:
- Save json serialized headers and fix cookie parsing
Changes:
- Added support for saving all javascript files de-duplicated and compressed
- Created two configuration dictionaries. One for individual browsers and another for the entire infrastructure
- Support for using OpenWPM as a submodule
- Firefox (v39) and Selenium (v2.47.1)
- Added support for launching Ghostery, HTTPS Everywhere, and AdBlock Plus
- Removed Random Extension Support
- Bugfix for broken profile saving.
- Bugfix for profile clearing when memory limits are exceeded
- Numerous stability fixes
- Full Logging support in all commands
Changes:
- Significant stability improvements for long crawls
- Support for logging with logging module
- A large number of bugfixes related to process handling
- Prevention of a large number of stray tmp files/folders during long crawls
- Process/memory watchdog to handle orphaned processes and keep memory usage reasonable
- Numerous bugfixes for extension
- Failure thresholds to prevent infinite loops of browser respawns or command execution attempts (instead, Errors and raised)
- Script to install dependencies
- API changes to command timeouts
- Move SocketInterface from pickle to json serialization
Known Issues:
- Encoding issues cause a very small percentage of data to be dropped by the extension
- Malformed queries are occassionally sent to the DataAggregator and are dropped. The cause is unknown.
- Forking can be done in a more memory efficient way
- 0.3.1 - Fixes #5
- 0.3.0 - Experimental merge of Fourthparty + framework to allow additional javascript instrumentation.
- 0.2.3 - Timeout logging
- 0.2.2 - Browse command + better scrolling + bugfixes
- 0.2.1 - Support for MITMProxy v0.11 + minor bugfixes
- 0.2.0 - Complete re-write of HTTP Cookie parsing
- 0.1.1 - Simplfied load of default settings, including wiki demo
- 0.1.0 - Initial Public Release