initial commit
xuchen committed May 10, 2016
0 parents commit 23758ba
Showing 17 changed files with 564 additions and 0 deletions.
7 changes: 7 additions & 0 deletions .gitignore
@@ -0,0 +1,7 @@
/lib/libsnowboy-detect.a
snowboy-detect-swig.cc
snowboydetect.py

*.pyc
*.o
*.so
117 changes: 117 additions & 0 deletions README.md
@@ -0,0 +1,117 @@
# Snowboy Hotword Detection

by [KITT.AI](http://kitt.ai).

[Home Page](https://snowboy.kitt.ai)

[Full Documentation](https://snowboy.kitt.ai/docs)


Version: 1.0.0 (5/10/2016)

Snowboy is a customizable hotword detection engine for you to create your own
hotword like "OK Google" or "Alexa". It is powered by deep neural networks and has the following properties:

* **highly customizable**: you can freely define your own magic phrase here –
let it be “open sesame”, “garage door open”, or “hello dreamhouse”, you name it.

* **always listening** but protects your privacy: Snowboy does not use Internet and does *not* stream your voice to the cloud.

* light-weight and **embedded**: it even runs on a Raspberry Pi and consumes less than 10% CPU on the weakest Pi (single-core 700MHz ARMv6).

* Apache licensed!

Currently Snowboy supports:

* all versions of Raspberry Pi (with Raspbian based on Debian Jessie 8.0)
* 64bit Mac OS X
* 64bit Ubuntu (12.04 and 14.04)

It ships in the form of a **C library** with **Python** wrappers generated by SWIG. We welcome wrappers for other languages -- feel free to send a pull request!

If you want support on other hardware/OS, please send your request to [[email protected]](mailto:snowboy.kitt.ai)


## Dependencies

Snowboy's Python wrapper uses PortAudio to access your device's microphone.

### Mac OS X

`brew` install `swig`, `sox`, `portaudio` and its Python binding `pyaudio`:

    brew install swig portaudio sox
    pip install pyaudio

If you don't have Homebrew installed, please download it [here](http://brew.sh/). If you don't have `pip`, you can install it [here](https://pip.pypa.io/en/stable/installing/).

Make sure that you can record audio with your microphone:

    rec t.wav

### Ubuntu

First `apt-get` install `swig`, `sox`, `portaudio` and its Python binding `pyaudio`:

    sudo apt-get install swig3.0 python-pyaudio python3-pyaudio sox
    pip install pyaudio

Then install the `atlas` matrix computing library:

    sudo apt-get install libatlas-base-dev

Make sure that you can record audio with your microphone:

    rec t.wav

If you need extra setup for your audio (especially on a Raspberry Pi), please see the [full documentation](https://snowboy.kitt.ai/docs).
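
To check what format the recorded `t.wav` actually has, a small sketch using Python's standard `wave` module may help. The helper name `describe_wav` is made up for illustration; the values it reports can be compared with whatever `SampleRate()`, `NumChannels()` and `BitsPerSample()` return from the C++ interface.

```python
import wave


def describe_wav(path):
    """Return (sample_rate_hz, num_channels, bits_per_sample) of a WAV file."""
    with wave.open(path, "rb") as wav:
        # getsampwidth() is in bytes per sample, so convert to bits.
        return wav.getframerate(), wav.getnchannels(), wav.getsampwidth() * 8
```

For example, `describe_wav("t.wav")` returns a tuple such as `(rate, channels, bits)` for the test recording made above.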

## Compile a Python Wrapper

    cd swig/python
    make

SWIG will generate a `_snowboydetect.so` file and a simple (but hard-to-read) Python wrapper `snowboydetect.py`. We have provided a higher-level Python wrapper, `snowboydecoder.py`, on top of that.

Feel free to adapt the `Makefile` in `swig/python` to your own system's setting if you cannot `make` it.


## Quick Start

Go to the `swig/python` folder and open your python console:

    In [1]: import snowboydecoder

    In [2]: def detected_callback():
       ....:     print("hotword detected")
       ....:

    In [3]: detector = snowboydecoder.HotwordDetector("resources/snowboy.umdl", sensitivity=0.5, audio_gain=1)

    In [4]: detector.start(detected_callback)

Then say "snowboy" into your microphone to see whether Snowboy detects it.

The `snowboy.umdl` file is a "universal" model that detects different people saying "snowboy". If you want other hotwords, please go to [snowboy.kitt.ai](https://snowboy.kitt.ai) to record, train and download your own personal model (a `.pmdl` file).

When `sensitivity` is higher, the hotword is triggered more easily, but you may get more false alarms.
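
The C++ interface in `include/snowboy-detect.h` takes sensitivities as a single comma-separated string, one value between 0 and 1 per loaded hotword. A minimal sketch of building such a string from a list of floats follows; the helper `format_sensitivity` is hypothetical and not part of Snowboy.

```python
def format_sensitivity(values):
    """Join per-hotword sensitivities into a comma-separated string."""
    for value in values:
        # The header describes sensitivities as numbers between 0 and 1.
        if not 0.0 <= value <= 1.0:
            raise ValueError("sensitivity must be between 0 and 1, got %r" % (value,))
    # Order matters: the n-th value applies to the n-th loaded hotword.
    return ",".join(str(value) for value in values)
```

With three loaded hotwords, `format_sensitivity([0.4, 0.5, 0.8])` produces the string form shown in the header comment for `SetSensitivity()`.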

`audio_gain` controls whether to increase (>1) or decrease (<1) input volume.
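
As a rough illustration of what a gain coefficient does, the sketch below multiplies each sample by `gain` and clips the result. It assumes 16-bit signed PCM in native byte order; the name `apply_gain` and the clipping behavior are assumptions for this example, since this README does not describe how Snowboy applies the gain internally.

```python
import array


def apply_gain(pcm_bytes, gain):
    """Scale 16-bit signed PCM samples by `gain`, clipping to the int16 range."""
    samples = array.array("h")  # "h" = signed short (16 bits on common platforms)
    samples.frombytes(pcm_bytes)
    scaled = array.array(
        "h",
        (max(-32768, min(32767, int(round(s * gain)))) for s in samples),
    )
    return scaled.tobytes()
```

Under these assumptions a gain of 2 doubles every sample, and a gain of 0.5 halves it.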

Two demo files `demo.py` and `demo2.py` are provided to show more usages.
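
When several hotwords are loaded, the underlying `RunDetection()` call is documented in `include/snowboy-detect.h` as returning -1 on error, 0 for no event, and n when hotword n is triggered. The sketch below shows one way such a result could be mapped to a list of callbacks, in the spirit of `demo2.py`. The function `dispatch` is hypothetical, and raising an exception on -1 is a choice made for this example only.

```python
def dispatch(result, callbacks):
    """Run the callback matching a detection result; return True if one ran."""
    if result < 0:
        raise RuntimeError("detection reported an error")
    if result == 0:
        return False  # no event in this chunk
    # Hotword indices start at 1, list indices at 0.
    callbacks[result - 1]()
    return True
```

This is why the number of callbacks has to match the number of loaded hotwords: result n always selects the n-th callback.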

Note: if you see the following error:

    TypeError: __init__() got an unexpected keyword argument 'model_str'

You are probably using an old version of SWIG. Please upgrade. We have tested with SWIG version 3.0.7 and 3.0.8.

## Advanced Usages & Demos

See [Full Documentation](https://snowboy.kitt.ai/docs).

## Change Log

**5/10/2016**

* initial release
112 changes: 112 additions & 0 deletions include/snowboy-detect.h
@@ -0,0 +1,112 @@
// include/snowboy-detect.h

// Copyright 2016 KITT.AI (author: Guoguo Chen)

#ifndef SNOWBOY_INCLUDE_SNOWBOY_DETECT_H_
#define SNOWBOY_INCLUDE_SNOWBOY_DETECT_H_

#include <memory>
#include <string>

namespace snowboy {

// Forward declaration.
struct WaveHeader;
class PipelineDetect;

////////////////////////////////////////////////////////////////////////////////
//
// SnowboyDetect class interface.
//
////////////////////////////////////////////////////////////////////////////////
class SnowboyDetect {
public:
// Constructor that takes a resource file, and a list of hotword models which
// are separated by comma. In the case that more than one hotword exists in the
// provided models, RunDetection() will return the index of the hotword, if
// the corresponding hotword is triggered.
//
// CAVEAT: a personal model contains only one hotword, but a universal model
//         may contain multiple hotwords. It is your responsibility to figure
//         out the index of the hotword. For example, if your model string is
//         "foo.pmdl,bar.umdl", where foo.pmdl contains hotword x, bar.umdl
//         has two hotwords y and z, the indices of different hotwords are as
//         follows:
//         x 1
//         y 2
//         z 3
//
// @param [in] resource_filename Filename of resource file.
// @param [in] model_str A string of multiple hotword models,
// separated by comma.
SnowboyDetect(const std::string& resource_filename,
const std::string& model_str);

// Resets the detection. This class handles voice activity detection (VAD)
// internally. But if you have an external VAD, you should call Reset()
// whenever you see segment end from your VAD.
bool Reset();

// Runs hotword detection. Supported audio format is WAVE (with linear PCM,
// 8-bit unsigned integer, 16-bit signed integer or 32-bit signed integer).
// See SampleRate(), NumChannels() and BitsPerSample() for the required
// sampling rate, number of channels and bits per sample values. You are
// supposed to provide a small chunk of data (e.g., 0.1 second) each time you
// call RunDetection(). Larger chunks usually lead to longer delay, but less
// CPU usage.
//
// Definition of return values:
// -1: Error.
// 0: No event.
// 1: Hotword 1 triggered.
// 2: Hotword 2 triggered.
// ...
//
// @param [in] data Small chunk of data to be detected. See
// above for the supported data format.
int RunDetection(const std::string& data);

// Sets the sensitivity string for the loaded hotwords. A <sensitivity_str> is
// a list of floating-point numbers between 0 and 1, separated by comma. For
// example, if there are 3 loaded hotwords, your string should look something
// like this:
// 0.4,0.5,0.8
// Make sure you properly align the sensitivity value to the corresponding
// hotword.
void SetSensitivity(const std::string& sensitivity_str);

NicoHood Feb 26, 2018

Wouldn't it make more sense to pass the values as actual floats via a variadic function? Since it's C++, you could simply overload this function, keep the old method and deprecate it slowly.
http://en.cppreference.com/w/cpp/utility/variadic

The equivalent GetSensitivity() method could take an index as an input parameter instead of returning one long string of all sensitivities. Or maybe there are even better C++ solutions; those are the ones I know.

cc @chenguoguo

chenguoguo via email Feb 27, 2018

Collaborator

// Returns the sensitivity string for the current hotwords.
std::string GetSensitivity() const;

// Applies a fixed gain to the input audio. In case you have a very weak
// microphone, you can use this function to boost input audio level.
void SetAudioGain(const float audio_gain);

NicoHood Feb 26, 2018

What are the limits here? In PulseAudio I usually boost my microphone by up to 25%. Would you set 1.25 then? It would be nice if you could add limits and suggestions for useful boost values.

cc @chenguoguo

chenguoguo via email Feb 27, 2018

Collaborator

NicoHood Feb 27, 2018

Sure, but does 2 mean a 200% boost or only 2%? That should be clarified.

chenguoguo Mar 5, 2018

Collaborator

It is a coefficient that the actual audio gets multiplied by. So 2 means 200% in this sense.

NicoHood Mar 5, 2018

Thanks a lot for the clarification. Could you please add this to the header file? I guess more people would like to know this info and are not browsing those old commits/comments. It would help a lot :)


// Writes the models to the model filenames specified in <model_str> in the
// constructor. This overwrites the original model with the latest parameter
// setting. You are supposed to call this function if you have updated the
// hotword sensitivities through SetSensitivity(), and you would like to store
// those values in the model as the default value.
void UpdateModel() const;

// Returns the number of loaded hotwords. This helps you to figure out the
// indices of the hotwords.
int NumHotwords() const;

// Returns the required sampling rate, number of channels and bits per sample
// values for the audio data. You should use this information to set up your
// audio capturing interface.
int SampleRate() const;
int NumChannels() const;
int BitsPerSample() const;

~SnowboyDetect();

private:
std::unique_ptr<WaveHeader> wave_header_;
std::unique_ptr<PipelineDetect> detect_pipeline_;
};

} // namespace snowboy

#endif // SNOWBOY_INCLUDE_SNOWBOY_DETECT_H_
Binary file added lib/ios/libsnowboy-detect.a
Binary file not shown.
Binary file added lib/osx/libsnowboy-detect.a
Binary file not shown.
Binary file added lib/ubuntu64/libsnowboy-detect.a
Binary file not shown.
Binary file added resources/common.res
Binary file not shown.
Binary file added resources/ding.wav
Binary file not shown.
Binary file added resources/dong.wav
Binary file not shown.
Binary file added resources/snowboy.umdl
Binary file not shown.
52 changes: 52 additions & 0 deletions swig/python/Makefile
@@ -0,0 +1,52 @@
# Example Makefile that converts snowboy c++ library (snowboy-detect.a) to
# python library (_snowboydetect.so, snowboydetect.py), using swig.

# Some versions of swig do not work well. We prefer compiling swig from source
# code. We have tested swig-3.0.7.tar.gz.
SWIG := swig

SNOWBOYDETECTSWIGITF = snowboy-detect-swig.i
SNOWBOYDETECTSWIGOBJ = snowboy-detect-swig.o
SNOWBOYDETECTSWIGCC = snowboy-detect-swig.cc
SNOWBOYDETECTSWIGLIBFILE = _snowboydetect.so

TOPDIR := ../../
CXXFLAGS := -I$(TOPDIR) -O3 -fPIC
LDFLAGS :=

ifeq ($(shell uname), Darwin)
CXX := clang++
PYINC := $(shell /usr/bin/python2.7-config --includes)
PYLIBS := $(shell /usr/bin/python2.7-config --ldflags)
SWIGFLAGS := -bundle -flat_namespace -undefined suppress
LDLIBS := -lm -ldl -framework Accelerate
SNOWBOYDETECTLIBFILE = $(TOPDIR)/lib/osx/libsnowboy-detect.a
else
CXX := g++
PYINC := $(shell python-config --cflags)
PYLIBS := $(shell python-config --ldflags)
SWIGFLAGS := -shared
CXXFLAGS += -std=c++0x
# Make sure you have Atlas installed. You can statically link Atlas if you
# would like to be able to move the library to a machine without Atlas.
LDLIBS := -lm -ldl -lf77blas -lcblas -llapack_atlas -latlas
SNOWBOYDETECTLIBFILE = $(TOPDIR)/lib/ubuntu64/libsnowboy-detect.a
endif

all: $(SNOWBOYDETECTSWIGLIBFILE)

%.a:
	$(MAKE) -C ${@D} ${@F}

$(SNOWBOYDETECTSWIGCC): $(SNOWBOYDETECTSWIGITF)
	$(SWIG) -I$(TOPDIR) -c++ -python -o $(SNOWBOYDETECTSWIGCC) $(SNOWBOYDETECTSWIGITF)

$(SNOWBOYDETECTSWIGOBJ): $(SNOWBOYDETECTSWIGCC)
	$(CXX) $(PYINC) $(CXXFLAGS) -c $(SNOWBOYDETECTSWIGCC)

$(SNOWBOYDETECTSWIGLIBFILE): $(SNOWBOYDETECTSWIGOBJ) $(SNOWBOYDETECTLIBFILE)
	$(CXX) $(CXXFLAGS) $(LDFLAGS) $(SWIGFLAGS) $(SNOWBOYDETECTSWIGOBJ) \
	$(SNOWBOYDETECTLIBFILE) $(PYLIBS) $(LDLIBS) -o $(SNOWBOYDETECTSWIGLIBFILE)

clean:
	-rm -f *.o *.a *.so snowboydetect.py *.pyc $(SNOWBOYDETECTSWIGCC)
35 changes: 35 additions & 0 deletions swig/python/demo.py
@@ -0,0 +1,35 @@
import snowboydecoder
import sys
import signal

interrupted = False


def signal_handler(signal, frame):
    global interrupted
    interrupted = True


def interrupt_callback():
    global interrupted
    return interrupted

if len(sys.argv) == 1:
    print("Error: need to specify model name")
    print("Usage: python demo.py your.model")
    sys.exit(-1)

model = sys.argv[1]

# capture SIGINT signal, e.g., Ctrl+C
signal.signal(signal.SIGINT, signal_handler)

detector = snowboydecoder.HotwordDetector(model, sensitivity=0.5)
print('Listening... Press Ctrl+C to exit')

# main loop
detector.start(detected_callback=snowboydecoder.play_audio_file,
               interrupt_check=interrupt_callback,
               sleep_time=0.03)

detector.terminate()
41 changes: 41 additions & 0 deletions swig/python/demo2.py
@@ -0,0 +1,41 @@
import snowboydecoder
import sys
import signal

# Demo code for listening for two hotwords at the same time

interrupted = False


def signal_handler(signal, frame):
    global interrupted
    interrupted = True


def interrupt_callback():
    global interrupted
    return interrupted

if len(sys.argv) != 3:
    print("Error: need to specify 2 model names")
    print("Usage: python demo2.py 1st.model 2nd.model")
    sys.exit(-1)

models = sys.argv[1:]

# capture SIGINT signal, e.g., Ctrl+C
signal.signal(signal.SIGINT, signal_handler)

sensitivity = [0.5]*len(models)
detector = snowboydecoder.HotwordDetector(models, sensitivity=sensitivity)
callbacks = [lambda: snowboydecoder.play_audio_file(snowboydecoder.DETECT_DING),
             lambda: snowboydecoder.play_audio_file(snowboydecoder.DETECT_DONG)]
print('Listening... Press Ctrl+C to exit')

# main loop
# make sure you have the same number of callbacks and models
detector.start(detected_callback=callbacks,
               interrupt_check=interrupt_callback,
               sleep_time=0.03)

detector.terminate()
1 change: 1 addition & 0 deletions swig/python/requirements.txt
@@ -0,0 +1 @@
PyAudio==0.2.9
1 change: 1 addition & 0 deletions swig/python/resources
15 changes: 15 additions & 0 deletions swig/python/snowboy-detect-swig.i
@@ -0,0 +1,15 @@
// swig/snowboy-detect-swig.i

// Copyright 2016 KITT.AI (author: Guoguo Chen)

%module snowboydetect

// Suppress SWIG warnings.
#pragma SWIG nowarn=SWIGWARN_PARSE_NESTED_CLASS
%include "std_string.i"

%{
#include "include/snowboy-detect.h"
%}

%include "include/snowboy-detect.h"