hBloom

Hierarchical bloom classifier for tagging text with a structured word list.

This has probably been solved before by smarter people than myself. Given a structured tree of categories and corresponding words/strings, hBloom creates a Bloom filter for each level of depth in the data. Each filter returns true/false for all of its subcategories; on 'true', hBloom returns the matching categories.
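The idea above can be sketched as follows. This is a minimal illustration, not hBloom's actual implementation: the names `createBloom`, `buildTree`, and `classifyWord`, the seeded FNV-1a hash, and the filter size of 1024 bits with 3 hashes are all assumptions made for the example. Each category node holds a Bloom filter over every word beneath it, so a miss at one level prunes the whole subtree. Unlike the usage output shown in this README, the sketch reports every category on the path to a match, including intermediate ones.

```javascript
// Minimal Bloom filter: a bit array plus a few seeded FNV-1a hashes.
// Illustrative only; hBloom's real hashing and sizing may differ.
function createBloom(size, hashCount) {
  const bits = new Uint8Array(size);
  function hash(str, seed) {
    let h = (2166136261 ^ seed) >>> 0;
    for (let i = 0; i < str.length; i++) {
      h ^= str.charCodeAt(i);
      h = Math.imul(h, 16777619) >>> 0;
    }
    return h % size;
  }
  return {
    add(str) {
      for (let s = 0; s < hashCount; s++) bits[hash(str, s)] = 1;
    },
    // May report a false positive, never a false negative.
    mightContain(str) {
      for (let s = 0; s < hashCount; s++) {
        if (!bits[hash(str, s)]) return false;
      }
      return true;
    }
  };
}

// One node per category; node.words holds every word found beneath it.
function buildNode(name, data) {
  const node = { name, bloom: createBloom(1024, 3), children: [], words: [] };
  if (Array.isArray(data)) {
    node.words = data.slice();
  } else {
    for (const key of Object.keys(data)) {
      const child = buildNode(key, data[key]);
      node.children.push(child);
      node.words.push(...child.words);
    }
  }
  node.words.forEach((w) => node.bloom.add(w));
  return node;
}

function buildTree(data) {
  return Object.keys(data).map((key) => buildNode(key, data[key]));
}

function classifyNode(node, word) {
  if (!node.bloom.mightContain(word)) return []; // prune this subtree
  if (node.children.length === 0) {
    // Leaf: confirm against the real list to rule out a false positive.
    return node.words.includes(word) ? [node.name] : [];
  }
  const below = node.children.flatMap((c) => classifyNode(c, word));
  return below.length ? [node.name, ...below] : [];
}

function classifyWord(roots, word) {
  return roots.flatMap((root) => classifyNode(root, word));
}

const roots = buildTree({
  racing: { asscot: ["asscot", "the big race"] },
  football: {
    "aston villa": ["aston villa", "villa"],
    "scottish league": { celtic: ["celtic", "celticfc"] }
  }
});
console.log(classifyWord(roots, "celtic"));
// logs: [ 'football', 'scottish league', 'celtic' ]
```

Because the leaf check compares against the stored word list, a Bloom false positive only costs extra traversal; it never produces a wrong tag in this sketch.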

An example is available at example/index.js. It tests 5000 tweets against 16000 tag words and takes 4000 milliseconds on average (0.8 ms per tweet).

##Install

    npm install hbloom

##Usage

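    // Installed from npm. From inside this repo's example folder,
    // use require('../hbloom') instead.
    var hBloom = require('hbloom');

    // Replace with your own structured data (see "Structured Data?").
    var data = { /* STRUCTURED DATA */ };

    var myBloom = hBloom(data);
    var txt = "This post is about celtic and rangers, but mentions villa.";

    myBloom.classifyText(txt, function(result){
        console.log(result);
    });

    // logs: ['football', 'celtic', 'rangers', 'aston villa']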

##Methods

hBloom.classifyText( text, callback )

hBloom.classify( word );

##Structured Data?

The data passed to hBloom({DATA}) should follow the example below, where keys are tags/categories and the strings in arrays are the matching words. Categories can be nested to any depth by using an object in place of an array.

    {
        "racing": {
            "asscot": ["asscot", "ass", "the big race"]
        },
        "football": {
            "manchester united": ["manu", "man united", "mufc", "manchester united", "manufc"],
            "aston villa": ["aston villa", "villa", "villafc"],
            "manchester city": ["mancity", "manchester city", "cityfc", "man city", "mancityfc"],
            "scottish league": {
                "dundee united": ["dundee", "dundee united", "dundeefc"],
                "rangers": ["rangers", "rangersfc"],
                "celtic": ["celtic", "celticfc"]
            }
        }
    }
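
To check that a data object follows this shape before handing it to hBloom, something like the following could be used. `flattenCategories` is a hypothetical helper written for this README, not part of hBloom. It walks the nested object, returns one entry per leaf category with its full path, and throws if it meets a value that is neither an object nor an array.

```javascript
// Hypothetical helper (not part of hBloom): list every leaf category
// with its path and words, validating the shape along the way.
function flattenCategories(data, path = []) {
  const out = [];
  for (const [key, value] of Object.entries(data)) {
    const here = path.concat(key);
    if (Array.isArray(value)) {
      // Leaf category: an array of matching words/strings.
      out.push({ path: here, words: value });
    } else if (value && typeof value === "object") {
      // Nested category: recurse one level deeper.
      out.push(...flattenCategories(value, here));
    } else {
      throw new TypeError("Expected an object or array at " + here.join(" > "));
    }
  }
  return out;
}

const flat = flattenCategories({
  racing: { asscot: ["asscot", "the big race"] },
  football: {
    "aston villa": ["aston villa", "villa", "villafc"],
    "scottish league": { celtic: ["celtic", "celticfc"] }
  }
});
flat.forEach((entry) => console.log(entry.path.join(" > "), entry.words.length));
```

Each printed line shows a category path such as `football > scottish league > celtic` followed by the number of words registered for it.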

##License

MIT


christopherdebeer/hBloom
