Comparing changes

base repository: yooper/php-text-analysis
base: 1.6.1
head repository: yooper/php-text-analysis
compare: master
  • 10 commits
  • 16 files changed
  • 4 contributors

Commits on May 17, 2021

  1. now supports PHP 8.0

    yooper committed May 17, 2021
    7d00848

Commits on Aug 1, 2022

  1. supports php version 8.1

    yooper committed Aug 1, 2022
    6d61487

Commits on Jun 28, 2023

  1. Verified

    This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.
    985cd57

Commits on Jul 4, 2023

  1. Merge pull request #73 from nielsriekert/master

    fixed uninitialized string offset for `Porter::doubleConsonant($str)`
    yooper authored Jul 4, 2023

    Verified

    d607f39
  2. 08cf657

Commits on Sep 22, 2023

  1. Verified

    8139f71

Commits on Sep 23, 2023

  1. 37b5a28
  2. Merge pull request #75 from yooper/fixed_pr_issues

    Fixed pr issues
    yooper authored Sep 23, 2023

    Verified

    9b96d25

Commits on Dec 28, 2024

  1. c97fc51
  2. Merge pull request #79 from yooper/feature/test-new-phps

    update docker files used for testing
    yooper authored Dec 28, 2024

    Verified

    This commit was created on GitHub.com and signed with GitHub’s verified signature.
    7106cef
17 changes: 17 additions & 0 deletions Dockerfile80
@@ -0,0 +1,17 @@
FROM php:8.0-cli

RUN apt-get update && \
apt-get install -y --no-install-recommends zip libzip-dev libpspell-dev && \
docker-php-ext-install zip pspell

RUN curl --silent --show-error https://getcomposer.org/installer | php -- --install-dir=/usr/local/bin --filename=composer

RUN mkdir -p /app

COPY ./ /app

RUN composer --working-dir=/app install

RUN cd /app && SKIP_TEST=1 ./vendor/bin/phpunit -d memory_limit=1G

CMD ["/bin/sh"]
17 changes: 17 additions & 0 deletions Dockerfile81
@@ -0,0 +1,17 @@
FROM php:8.1-cli

RUN apt-get update && \
apt-get install -y --no-install-recommends zip libzip-dev libpspell-dev && \
docker-php-ext-install zip pspell

RUN curl --silent --show-error https://getcomposer.org/installer | php -- --install-dir=/usr/local/bin --filename=composer

RUN mkdir -p /app

COPY ./ /app

RUN composer --working-dir=/app install

RUN cd /app && SKIP_TEST=1 ./vendor/bin/phpunit -d memory_limit=1G

CMD ["/bin/sh"]
17 changes: 17 additions & 0 deletions Dockerfile82
@@ -0,0 +1,17 @@
FROM php:8.2-cli

RUN apt-get update && \
apt-get install -y --no-install-recommends zip libzip-dev libpspell-dev && \
docker-php-ext-install zip pspell

RUN curl --silent --show-error https://getcomposer.org/installer | php -- --install-dir=/usr/local/bin --filename=composer

RUN mkdir -p /app

COPY ./ /app

RUN composer --working-dir=/app install

RUN cd /app && SKIP_TEST=1 ./vendor/bin/phpunit -d memory_limit=1G

CMD ["/bin/sh"]
17 changes: 17 additions & 0 deletions Dockerfile83
@@ -0,0 +1,17 @@
FROM php:8.3-cli

RUN apt-get update && \
apt-get install -y --no-install-recommends zip libzip-dev libpspell-dev && \
docker-php-ext-install zip pspell

RUN curl --silent --show-error https://getcomposer.org/installer | php -- --install-dir=/usr/local/bin --filename=composer

RUN mkdir -p /app

COPY ./ /app

RUN composer --working-dir=/app install

RUN cd /app && SKIP_TEST=1 ./vendor/bin/phpunit -d memory_limit=1G

CMD ["/bin/sh"]
17 changes: 17 additions & 0 deletions Dockerfile84
@@ -0,0 +1,17 @@
FROM php:8.4-cli

RUN apt-get update && \
apt-get install -y --no-install-recommends zip libzip-dev libpspell-dev && \
docker-php-ext-install zip

RUN curl --silent --show-error https://getcomposer.org/installer | php -- --install-dir=/usr/local/bin --filename=composer

RUN mkdir -p /app

COPY ./ /app

RUN composer --working-dir=/app install

RUN cd /app && SKIP_TEST=1 ./vendor/bin/phpunit -d memory_limit=1G

CMD ["/bin/sh"]
7 changes: 3 additions & 4 deletions composer.json
@@ -20,12 +20,11 @@
"files": ["tests/TestBaseCase.php"]
},
"require" : {
-"php": "~7.4",
+"php": ">=7.4",
"yooper/stop-words": "~1",
"symfony/console": ">= 4.4",
-"wamania/php-stemmer": "~1",
-"yooper/nicknames": "~1",
-"vanderlee/php-sentence": "~1"
+"wamania/php-stemmer": "^1.0 || ^2.0 || ^3.0",
+"yooper/nicknames": "~1"
},
"require-dev": {
"phpunit/phpunit": "^9",
2 changes: 1 addition & 1 deletion src/Adapters/PspellAdapter.php
@@ -13,7 +13,7 @@ class PspellAdapter implements ISpelling
{
protected $pSpell = null;

-public function __construct($language = 'en', $spelling = "", $jargon = "", $encoding = "", $mode = PSPELL_BAD_SPELLERS )
+public function __construct($language = 'en', $spelling = "", $jargon = "", $encoding = "", $mode = \PSPELL_BAD_SPELLERS )
{
$this->pSpell = pspell_new($language, $spelling, $jargon, $encoding, $mode);
}
24 changes: 8 additions & 16 deletions src/Collections/DocumentArrayCollection.php
@@ -126,43 +126,35 @@ public function current()
*
* @param mixed $key
* @param DocumentAbstract $value
-* @return boolean
+* @return void
*/
-public function offsetSet($key, $value)
+public function offsetSet($key, $value) : void
{
if(!isset($key)) {
$this->documents[] = $value;
-return true;
}

$this->documents[$key] = $value;
-return $value;


}

/**
*
* @param mixed $key
-* @return null
+* @return void
*/
-public function offsetUnset($key)
+public function offsetUnset($key) : void
{
if (isset($this->documents[$key])) {
-$deleted = $this->documents[$key];
unset($this->documents[$key]);

-return $deleted;
}
-return null;
}

/**
*
* @param mixed $key
* @return DocumentAbstract
*/
-public function offsetGet($key)
+public function offsetGet($key) : DocumentAbstract
{
return $this->documents[$key];
}
@@ -172,7 +164,7 @@ public function offsetGet($key)
* @param mixed $key
* @return boolean
*/
-public function offsetExists($key)
+public function offsetExists($key) : bool
{
return isset($this->documents[$key]);
}
@@ -181,7 +173,7 @@ public function offsetExists($key)
*
* @return int
*/
-public function count()
+public function count() : int
{
return count($this->documents);
}
@@ -190,7 +182,7 @@ public function count()
*
* @return \ArrayIterator
*/
-public function getIterator()
+public function getIterator() : \ArrayIterator
{
return new \ArrayIterator($this->documents);
}
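The return types added in this file match what PHP 8.1 expects from classes implementing `ArrayAccess`, `Countable` and `IteratorAggregate`; without them (or a `#[\ReturnTypeWillChange]` attribute) PHP 8.1 reports a deprecation. A minimal sketch of the same pattern follows. `CollectionSketch` is a hypothetical class written for illustration, not the library's collection. Note the bare `return;` in the append branch: a `void` method can still exit early, and without that exit a `null` key would fall through to the keyed assignment as well.

```php
<?php
// Hypothetical stand-in for a document collection; not the library's class.
final class CollectionSketch implements ArrayAccess, Countable, IteratorAggregate
{
    /** @var array<int|string, mixed> */
    private $items = [];

    public function offsetSet($key, $value): void
    {
        if ($key === null) {
            // `$collection[] = $value` arrives here with a null key.
            $this->items[] = $value;
            return; // early exit; void methods may not return a value
        }
        $this->items[$key] = $value;
    }

    public function offsetUnset($key): void
    {
        unset($this->items[$key]); // unset() on a missing key is a no-op
    }

    // Alternative to declaring a return type: silences the 8.1 deprecation.
    #[\ReturnTypeWillChange]
    public function offsetGet($key)
    {
        return $this->items[$key];
    }

    public function offsetExists($key): bool
    {
        return isset($this->items[$key]);
    }

    public function count(): int
    {
        return count($this->items);
    }

    public function getIterator(): ArrayIterator
    {
        return new ArrayIterator($this->items);
    }
}

$c = new CollectionSketch();
$c[] = 'first';        // appended under key 0
$c['k'] = 'second';    // stored under key 'k'
unset($c['missing']);  // harmless
```

The `#[\ReturnTypeWillChange]` line is parsed as an ordinary comment on PHP 7.4, so the sketch stays loadable across the version range declared in `composer.json`.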
44 changes: 25 additions & 19 deletions src/Sentiment/Vader.php
@@ -83,17 +83,19 @@ public function __construct()
* Add a new token and score to the lexicon
* @param string $token
* @param float $meanSentimentRating
+* @return void
*/
-public function addToLexicon(string $token, float $meanSentimentRating)
+public function addToLexicon(string $token, float $meanSentimentRating) : void
{
$this->lexicon[$token] = $meanSentimentRating;
}

/**
* Remove a token from the lexicon
* @param string $token
+* @return void
*/
-public function deleteFromLexicon(string $token)
+public function deleteFromLexicon(string $token) : void
{
unset($this->lexicon[$token]);
}
@@ -124,12 +126,13 @@ public function isNegated(array $tokens, bool $includeNt = true) : bool
* approximates the max expected value
* @param float $score
* @param int $alpha
+* @return float
*/
-public function normalize(float $score, int $alpha=15)
+public function normalize(float $score, int $alpha=15) : float
{
$normalizedScore = $score;

-if (sqrt(($score^2) + $alpha > 0)) {
+if (sqrt(($score^2) + $alpha) > 1) {
$normalizedScore = $score/sqrt(($score^2) + $alpha);
}

@@ -178,7 +181,7 @@ public function scalarIncDec(string $word, float $valence, bool $isCapDiff)
public function getPolarityScores(array $tokens) : array
{
$sentiments = [];
-for($index = 0; $index < count($tokens); $index++)
+for($index = 0, $indexMax = count($tokens); $index < $indexMax; $index++)
{
$valence = 0.0;
$lcToken = strtolower($tokens[$index]);
@@ -196,7 +199,7 @@ public function getPolarityScores(array $tokens) : array
return $this->scoreValence($sentiments, $tokens);
}

-public function scoreValence(array $sentiments, array $tokens)
+public function scoreValence(array $sentiments, array $tokens): array
{
if ( !empty($sentiments)) {
$sentimentSum = array_sum($sentiments);
@@ -247,7 +250,7 @@ public function getSentimentValence(float $valence, array $tokens, int $index)
if(isset($this->getLexicon()[$lcToken]))
{
//get the sentiment valence
-$valence = $this->getLexicon()[$lcToken];
+$valence = (float)$this->getLexicon()[$lcToken];
//check if sentiment laden word is in ALL CAPS (while others aren't)
if ($ucToken and $isCapDiff) {
if ($valence > 0) {
@@ -262,17 +265,17 @@ public function getSentimentValence(float $valence, array $tokens, int $index)
if ($index > $startIndex && !isset($this->getLexicon()[ strtolower($tokens[$index-($startIndex+1)])]))
{
// dampen the scalar modifier of preceding words and emoticons
-// (excluding the ones that immediately preceed the item) based
+// (excluding the ones that immediately preceded the item) based
// on their distance from the current item.
$s = $this->scalarIncDec($tokens[$index-($startIndex+1)], $valence, $isCapDiff);
-if ($startIndex == 1 and $s != 0) {
+if ($startIndex === 1 and $s !== 0) {
$s *= 0.95;
-} elseif ($startIndex == 2 and $s != 0 ) {
+} elseif ($startIndex === 2 and $s !== 0 ) {
$s *= 0.9;
}
$valence += $s;
$valence = $this->neverCheck($valence, $tokens, $startIndex, $index);
-if ($startIndex == 2) {
+if ($startIndex === 2) {
$valence = $this->idiomsCheck($valence, $tokens, $index);
}

@@ -283,7 +286,7 @@ public function getSentimentValence(float $valence, array $tokens, int $index)
// "cooking with gas": 2, "in the black": 2, "in the red": -2,
// "on the ball": 2,"under the weather": -2}
}
-$valence = $this->leastCheck($valence, $tokens, $index);
+$valence = $this->leastCheck((float)$valence, $tokens, $index);
}

}
@@ -356,7 +359,7 @@ public function butCheck(array $tokens, array $sentiments)
return $sentiments;
}

-for($i = 0; $i < count($sentiments); $i++)
+for($i = 0, $iMax = count($sentiments); $i < $iMax; $i++)
{
if( $index < $i) {
$sentiments[$i] *= 0.5;
@@ -444,7 +447,7 @@ public function boostExclamationPoints(array $tokens) : float
return 0.0;
}

-public function neverCheck(float $valence, array $tokens, int $startIndex, int $index)
+public function neverCheck(float $valence, array $tokens, int $startIndex, int $index): float
{
if($startIndex == 0 && $this->isNegated([$tokens[$index-1]])) {
$valence *= self::N_SCALAR;
@@ -460,8 +463,8 @@ public function neverCheck(float $valence, array $tokens, int $startIndex, int $
$valence *= self::N_SCALAR;
}
} elseif($startIndex == 2) {
-if ($tokens[$index-3] == "never" &&
-($tokens[$index-2] == "so" || $tokens[$index-2] == "this") ||
+if (($tokens[$index - 3] == "never" &&
+($tokens[$index - 2] == "so" || $tokens[$index - 2] == "this")) ||
($tokens[$index-1] == "so" || $tokens[$index-1] == "this")) {

$valence *= 1.25;
@@ -487,9 +490,12 @@ public function boostQuestionMarks(array $tokens) : float
}
}
return 0.0;
-}
-
-
+}
+
+
+/**
+* @throws \Exception
+*/
protected function getTxtFilePath() : string
{
return get_storage_path('sentiment'.DIRECTORY_SEPARATOR.'vader_lexicon').DIRECTORY_SEPARATOR.'vader_lexicon.txt';
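One detail in the `normalize` hunk is worth noting: in PHP, `^` is the bitwise XOR operator, and exponentiation is written `**`. The sketch below contrasts the two and shows how the normalization would read with `**`, in which case any finite score maps into the open interval (-1, 1). `normalizeSketch` is a hypothetical free function written for illustration, not the method in this diff; the default `alpha` of 15 is taken from the signature shown above.

```php
<?php
// Hypothetical helper, for illustration only.
function normalizeSketch(float $score, int $alpha = 15): float
{
    // score / sqrt(score squared + alpha), with ** as the power operator.
    // Assumes $alpha > 0 so the denominator is never zero.
    return $score / sqrt($score ** 2 + $alpha);
}

$xor = 3 ^ 2;   // bitwise XOR of 0b11 and 0b10
$pow = 3 ** 2;  // three squared
```

With the default `alpha`, a score of `1.0` comes out as `1 / sqrt(16)`, which makes the function easy to spot-check by hand.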
7 changes: 6 additions & 1 deletion src/Stemmers/PorterStemmer.php
@@ -437,7 +437,12 @@ private static function doubleConsonant($str)
{
$c = self::$regex_consonant;

-return preg_match("#$c[2]$#", $str, $matches) AND $matches[0][0] == $matches[0][1];
+$result = preg_match("#$c[2]$#", $str, $matches);
+
+$sub_0 = count($matches) > 0 ? substr($matches[0], 0) : false;
+$sub_1 = count($matches) > 0 ? substr($matches[0], 1) : false;
+
+return $result AND is_string($sub_0) AND is_string($sub_1) AND $sub_0 == $sub_1;
}

/**
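For reference when reading the hunk above: `substr($s, 0)` returns the whole string and `substr($s, 1)` returns everything after the first character, so comparing two single characters needs an explicit length of 1. The sketch below shows a bounds-safe way to test whether a word ends in a doubled consonant. `endsWithDoubleConsonant` is a hypothetical helper, and it treats every letter outside `aeiou` as a consonant, which is a simplification of the Porter rules (they classify `y` by context).

```php
<?php
// Hypothetical helper; simplified consonant test (anything but a, e, i, o, u).
function endsWithDoubleConsonant(string $word): bool
{
    if (strlen($word) < 2) {
        return false; // too short to hold a doubled letter; avoids offset errors
    }
    $last = substr($word, -1);    // final character
    $prev = substr($word, -2, 1); // the character before it
    return $last === $prev && strpos('aeiou', $last) === false;
}

// How substr() behaves with and without a length argument:
$whole = substr('tt', 0);    // from offset 0 to the end
$tail  = substr('tt', 1);    // from offset 1 to the end
$first = substr('tt', 0, 1); // exactly one character at offset 0
```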
28 changes: 20 additions & 8 deletions src/Stemmers/SnowballStemmer.php
@@ -4,6 +4,8 @@
namespace TextAnalysis\Stemmers;

use TextAnalysis\Interfaces\IStemmer;
+use Wamania\Snowball\StemmerFactory;


/**
* A wrapper around PHP native snowball implementation
@@ -18,17 +20,27 @@ class SnowballStemmer implements IStemmer
* @var \Wamania\Snowball\Stem
*/
protected $stemmer;

-public function __construct($stemmerType = 'English')

+/**
+* @throws \Wamania\Snowball\NotFoundException
+*/
+public function __construct(string $stemmerType = 'English')
{
-$className = self::BASE_NAMESPACE.$stemmerType;
-if(!class_exists($className)) {
-throw new \RuntimeException("Class {$stemmerType} does not exist");
+$version = (int)\Composer\InstalledVersions::getVersion('wamania/php-stemmer')[0];
+if($version === 1){
+$className = self::BASE_NAMESPACE.$stemmerType;
+if(!class_exists($className)) {
+throw new \RuntimeException("Class {$stemmerType} does not exist");
}
+$this->stemmer = new $className();
+}
+// support version 2 and above
+else {
+$this->stemmer = StemmerFactory::create (strtolower($stemmerType));
+}
-$this->stemmer = new $className();
}
-public function stem($token)

+public function stem($token) : string
{
return $this->stemmer->stem($token);
}
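The constructor above chooses between the two stemmer APIs by reading the first character of the installed version string. A slightly more defensive variant, sketched here under assumptions, parses the whole leading component so that a two-digit major version or a missing version is not misread. `majorVersion` is a hypothetical helper; in real use its argument would come from `\Composer\InstalledVersions::getVersion()`, which can return `null`.

```php
<?php
// Hypothetical helper: extract the major version from a Composer version string.
function majorVersion(?string $version): int
{
    if ($version === null) {
        return 0; // version unknown
    }
    // Keep everything before the first dot; non-numeric text casts to 0.
    return (int) explode('.', $version, 2)[0];
}

// Reading only the first character cannot tell "10.x" from "1.x":
$firstChar = (int) '10.0.0.0'[0];
$parsed    = majorVersion('10.0.0.0');
```

A caller could then branch on `majorVersion(...) === 1` for the legacy class lookup and fall back to the factory otherwise, mirroring the structure of the constructor.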