changelog.txt

# twitterscraper changelog

# 1.4.0 ( 2019-11-03 )
## Fixed
- PR228: Fixed Typo in Readme
- PR224: Force CSV quoting for all non-numeric values
## Added
- PR213: Added Dockerfile for Docker support
- PR220: Passed timeout value of 60s from method to requests.get()
- PR231: Added a lot of tweet attributes to the output, regarding links, media and replies.
- PR233: Added support for searching for the '&' sign.
## Improved
- PR223: Pretty printing the output which is dumped

# 1.3.1 ( 2019-09-07 )
## Fixed
- Change two uses of f-strings to .format() since f-strings only work well with Python 3.6+

# 1.3.0 ( 2019-09-07 )
## Added
- Added the use of proxies while making an request. 
- PR #204: Added a max timeout to twitterscraper requests which is set to 60s by default. 

# 1.2.1 ( 2019-08-06 )
### Fixed
- PR #208: Fixed a type in a print statement which was breaking down twitterscraper
- Remove the use of fake_useragent library

# 1.2.0 ( 2019-06-22 )
### Added
- PR #186: adds the fields is_retweet, retweeter related information, and timestamp_epochs to the output.
- PR #184: use fake_useragent for generation of random user agent headers. 
- Additionally scraper for 'is_verified' when scraping for user profile pages.

# 1.1.0 ( 2019-06-15 )
### Added
- PR #176: Using billiard library instead of multiprocessing to add the ability to use this library with Celery.

# 1.0.1 ( 2019-06-15 )
### Fixed
- PR #191: wrong argument was used in the method query_tweets_from_user()
- CSV output file has as default ";" as a separator. 
- PR #173: Some small improvements on the profile page scraping.
### Added
- Command line argument -ow / --overwrite to indicate if an existing output file should be overwritten.

# 1.0.0 ( 2019-02-04 )
### Added
- PR #159: scrapes user profile pages for additional information. 
### Fixed:
- Moved example scripts demonstrating use of get_user_info() functionality to examples folder
- removed screenshot demonstrating get_user_info() works
- Added command line argument to main.py which calls get_user_info() for all users in list of scraped tweets.

# 0.9.3 ( 2018-11-04 )
### Fixed
- PR #143: cancels query if end-date is earlier than begin-date. 
- PR #151: returned json_resp['min_position] is parsed in order to quote special characters.
- PR #153: cast Tweets attributes to proper data types (int instead of str)
- Use codecs.open() to write to file. Should fix issues 144 and 147.

# 0.9.0 ( 2018-07-18 )
### Added
- Added -u / --user command line argument which can be used to scrape all 
  tweets from an users profile page.

## 0.8.1 ( 2018-07-18 )
- saving .csv files as an utf-8 encoded file. This fixes https://github.com/taspinar/twitterscraper/issues/138

## 0.8.0 ( 2018-07-17 )
### Fixed
- remove two headers which caused bad fetching results https://github.com/taspinar/twitterscraper/issues/126#issuecomment-405132147
- fix python2 logger bug https://github.com/taspinar/twitterscraper/issues/134 https://github.com/taspinar/twitterscraper/issues/132 https://github.com/taspinar/twitterscraper/issues/127

### Improved
- Use a generator to get tweets, but convert to list in `query_tweets_once`
  - this is useful for low memory applications, like massively parallelizing twitter scraping through AWS Lambda (128MB RAM)
- use single quotes for all strings (it was inconsistent prior)
- pep8 compliance on L28

### Removed
- remove `eliminate_duplicates` dead code

# 0.7.2 ( 2018-07-09 )
### Fixed
- twitterscraper.logging is imported as logger instead of logging in order to
   avoid a module name clash with Python2's logging module.

# 0.7.1 ( 2018-06-12 )
### Improved
- Give access to logger for scripts which import this module. Create the module,
  `logging.py`, which contains the logger used by twitterscraper.

### Removed
- fake_useragent is removed as a dependency, since it has been giving
user-agent headers which keep being blocked by Twitter.

## 0.7.0 ( 2018-05-06 )
### Fixed
- By using linspace() instead of range() to divide the number of days into
  the number of parallel processes, edge cases ( p = 1 ) now also work fine.
  This fixes https://github.com/taspinar/twitterscraper/issues/108.

### Improved
- The default value of begindate is set to 2006-03-21. The previous value (2017-01-01)
  was chosen arbitrarily and leaded to questions why not all tweets were retrieved.
  This fixes https://github.com/taspinar/twitterscraper/issues/88.

### Added
- Users can now save the tweets in a csv-format, with the arguments "-c" or "--csv"

## 0.6.2 ( 2018-03-21 )
### Fixed
- Errors occuring during the serialization of a non-html response (everything after 1st request),
  No longer crashes the program but is catched with a try / except.
- Fixes https://github.com/taspinar/twitterscraper/issues/93

- The '@' character in an username is now removed by the ".strip('\@')" method instead of "[1:]".
- This fixes issue https://github.com/taspinar/twitterscraper/issues/105

## 0.6.1 ( 2018-03-17 )
### Improved
- The way the number of days are divided over the number of parallel processes is improved.
- The maximum number of parallel processes is limited to the max no of days.
- Fixes https://github.com/taspinar/twitterscraper/issues/101

## 0.6.0 ( 2018-02-17 )
### Fixed
- PR #89: closed pools to prevent zombie processes.


## 0.5.1 ( 2018-02-17 )
### Fixed
- Fixed MaxRecursionError crashes which was introduced with version 0.5.0

## 0.5.0 ( 2018-01-11 )
### Added
- Added the html code of a tweet message to the Tweet class as one of its attributes

## 0.4.2 ( 2018-01-09 )
### Fixed
- Fixed backward compatability of the new --lang parameter by placing it at the end of all arguments.

## 0.4.1 ( 2018-01-07 )
### Fixed
- Fixed --lang functionality by passing the lang parameter from its CL argument form to the generater url.

## 0.4 ( 2017-12-19 )
-----------
### Added
- Added "-bd / --begindate" command line arguments to set the begin date of the query
- Added "-ed / --enddate" command line arguments to set the end date of the query.
- Added "-p / --poolsize" command line arguments which can change the number of parallel processes.
  Default number of parallel processes is set to 20.

### Improved
- Outputfile is only created if tweets are actually retrieved.

### Removed
- The ´query_all_tweets' method in the Query module is removed. Since twitterscraper is starting parallel processes by default,
  this method is no longer necessary.

### Changed
- The 'query_tweets' method now takes as arguments query, limit, begindate, enddate, poolsize.
- The 'query_tweets_once' no longer has the argument 'num_tweets'
- The default value of the 'retry' argument of the 'query_single_page' method has been increased from 3 to 10.
- The ´query_tweets_once' method does not log to screen at every single scrape, but at the end of a batch.


## 0.3.3 ( 2017-12-06 )
-----------
### Added
-PR #61: Adding --lang functionality which can retrieve tweets written in a specific language.
-PR #62: Tweet class now also contains the tweet url. This closes https://github.com/taspinar/twitterscraper/issues/59


## 0.3.2 ( 2017-11-12 )
-----------
### Improved
-PR #55: Adding --dump functionality which dumps the scraped tweets to screen, instead of an outputfile.


## 0.3.1 ( 2017-11-05 )
-----------
### Improved
-PR #49: scraping of replies, retweets and likes is improved.


## 0.3.0 ( 2017-08-01 )
-----------
### Added
- Tweet class now also includes 'replies', 'retweets' and 'likes'


## 0.2.7 ( 2017-01-10 )
-----------
### Improved
- PR #26: use ``requests`` library for HTTP requests. Makes the use of urllib2 / urllib redundant.
### Added:
- changelog.txt for GitHub
- HISTORY.rst for PyPi
- README.rst for PyPi

## 0.2.6 ( 2017-01-02 )
-----------
### Improved
- PR #25: convert date retrieved from timestamp to day precision