BitMagic Library
Algorithms and tools for integer set algebra operations used for information retrieval,
indexing of databases, scientific algorithms, ranking, clustering and signal processing.
BitMagic library uses compressed bit-vectors as a main vehicle for implementing set algebraic operations,
because of high efficiency and bit-level parallelism of this representation.
To compress memory it uses delta / prefix sum coding. One of our goals is constant improvement of
performance via SIMD vectorization (SSE2, SSE4.2, AVX2), CPU cache-friendly algorithms and
data-parallel thread-safe structures.
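
As a rough illustration of delta / prefix sum coding, here is a minimal self-contained sketch. It is not BitMagic code: the function names delta_encode and delta_decode are made up for this example, and the library's actual block formats are more involved. A sorted integer set is stored as gaps between neighbours (small numbers that compress well), and decoding is a running prefix sum over those gaps.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of BitMagic): turn a sorted list of
// integers into gaps. The first value is stored as its gap from zero.
std::vector<std::uint32_t> delta_encode(const std::vector<std::uint32_t>& sorted)
{
    std::vector<std::uint32_t> deltas;
    deltas.reserve(sorted.size());
    std::uint32_t prev = 0;
    for (std::uint32_t v : sorted)
    {
        assert(v >= prev);          // input must be sorted (non-decreasing)
        deltas.push_back(v - prev); // gap from the previous element
        prev = v;
    }
    return deltas;
}

// Hypothetical helper: restore the original values with a prefix sum.
std::vector<std::uint32_t> delta_decode(const std::vector<std::uint32_t>& deltas)
{
    std::vector<std::uint32_t> values;
    values.reserve(deltas.size());
    std::uint32_t sum = 0;
    for (std::uint32_t d : deltas)
    {
        sum += d;                   // running (prefix) sum
        values.push_back(sum);
    }
    return values;
}
```

For the set {3, 7, 8, 100} the encoder produces the gaps 3, 4, 1, 92, and decoding those gaps returns the original set.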
Features:
- compressed bit-vector container with mechanisms to iterate over the integer set it represents
- set algebraic operations: AND, OR, XOR, MINUS on bit-vectors and integer sets
- serialization/hibernation of bit-vector containers into compressed BLOBs for persistence (or in-RAM compression)
- set algebraic operations on compressed BLOBs (on-the-fly deserialization with set-algebraic function)
- statistical algorithms to efficiently construct similarity and distance metrics, measure similarity between bit-vectors,
integer sets and compressed BLOBs
- operations with rank: population count distances on bit-vector
- sparse vector(s) for native int types using bit transposition and separate compression of bit-planes,
with support of NULL values (unassigned) for construction of in-memory columnar structures. Bit-transposed
sparse vectors can be used for on-the-fly compression of astronomical, molecular biology or other data,
efficient storage of associations for graphs, etc.
- algorithms on sparse vectors: dynamic range clipping, search, group theory image (re-mapping).
The collection of algorithms is growing; please check our samples and the API lists.
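
To make the set algebra and population count items above more concrete, here is a minimal sketch on a plain, uncompressed bit-vector. Everything in it (PlainBits, combine, count_bits, hamming_distance) is a hypothetical name invented for this example. It does not use bm::bvector<> and leaves out compression entirely; it only shows why one 64-bit word operation processes 64 set members at once.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical uncompressed bit-vector: bit i is set when integer i
// belongs to the set. Not BitMagic's bm::bvector<>.
struct PlainBits
{
    std::vector<std::uint64_t> words;

    explicit PlainBits(std::size_t nbits) : words((nbits + 63) / 64) {}

    void set(std::size_t i)        { words[i / 64] |= (std::uint64_t(1) << (i % 64)); }
    bool test(std::size_t i) const { return ((words[i / 64] >> (i % 64)) & 1u) != 0; }
};

// Portable population count of one word (clears the lowest set bit per step).
inline unsigned popcount64(std::uint64_t w)
{
    unsigned n = 0;
    for (; w; w &= w - 1) ++n;
    return n;
}

// Number of integers in the set.
inline std::size_t count_bits(const PlainBits& a)
{
    std::size_t n = 0;
    for (std::uint64_t w : a.words) n += popcount64(w);
    return n;
}

enum class SetOp { And, Or, Xor, Minus };

// Apply a set operation word by word; both operands must have equal size.
inline PlainBits combine(const PlainBits& a, const PlainBits& b, SetOp op)
{
    assert(a.words.size() == b.words.size());
    PlainBits r(a.words.size() * 64);
    for (std::size_t i = 0; i < a.words.size(); ++i)
    {
        const std::uint64_t x = a.words[i], y = b.words[i];
        switch (op)
        {
        case SetOp::And:   r.words[i] = x & y;  break; // intersection
        case SetOp::Or:    r.words[i] = x | y;  break; // union
        case SetOp::Xor:   r.words[i] = x ^ y;  break; // symmetric difference
        case SetOp::Minus: r.words[i] = x & ~y; break; // a without b
        }
    }
    return r;
}

// Hamming distance: how many positions differ between the two sets.
inline std::size_t hamming_distance(const PlainBits& a, const PlainBits& b)
{
    return count_bits(combine(a, b, SetOp::Xor));
}
```

With a = {1, 5, 64, 100} and b = {5, 64, 70}, the AND result holds {5, 64}, MINUS holds {1, 100}, and the Hamming distance is 3.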
Features In Progress:
- compressed binary relational and adjacency matrices and operations on matrices for Entity-Relationship acceleration, graph operations, materialized RDBMS joins, etc.
- portable C-library layer working as a bridge to high level languages like Python, Java, Scala, .NET
Please visit our repository at:
https://github.com/tlk00/BitMagicC
License:
- Apache 2.0
How to build BitMagic library:
BitMagic C++ is a header-only software package, so in most cases you can simply
copy the sources into your project. All library sources are in the src
directory.
However, if you want to use our makefiles, follow these simple
instructions:
Unix:
-----
1. Traditional (in-place build)
- Apply environment variables by running bmenv.sh:
$ . bmenv.sh
- Use GNU make (gmake) to build the installation:
$ gmake rebuild
or (DEBUG version)
$ gmake DEBUG=YES rebuild
The default compiler on Unix and Cygwin is g++.
If you want to change the default, you can do that in makefile.in
(should be pretty easy to do).
2. CMake based build
The project now comes with a set of CMake build files. You can build it directly or generate project files for any
CMake-supported environment.
Windows:
--------
If you use a Cygwin installation, please follow the general Unix recommendations.
MSVC - solution and project files are available via CMake generation
MacOS
-----
XCODE - project files are available via CMake generation
=================================================================================
API documentation and examples:
http://www.bitmagic.io/apis.html
Fine tuning and optimizations:
------------------------------
All BM fine-tuning parameters are controlled by preprocessor defines.
=================================================================================
BM library supports C++11 (move semantics, noexcept, etc.).
Use
#define BM_NO_CXX11
to explicitly disable the use of C++11 features in your build.
=================================================================================
BM library includes some code optimized for 64-bit systems. This optimization
gets applied automatically.
BM library contains hand tuned code (intrinsics) for SIMD extensions SSE2, SSE4.2, AVX2.
To turn on SSE2 optimization #define BMSSE2OPT in your build environment.
To use SSE4.2 #define BMSSE42OPT (this enables hardware popcount via intrinsics).
SSE42 optimization automatically assumes SSE2 as a subset of SSE4.2.
(you don’t need to use both BMSSE2OPT and BMSSE42OPT).
You will need a compiler that supports Intel SIMD intrinsics (MSVC and GCC are OK).
To turn on AVX2 - #define BMAVX2OPT
This will automatically enable AVX2 256-bit SIMD, popcount (SSE4.2) and other
compatible hardware instructions.
BM library does NOT support multiple code paths and runtime CPU identification.
You have to build specifically for your target system or use default portable
build.
To build correctly for the target SIMD instruction set, set the matching
code generation flags in your build environment.
BitMagic examples and tests can be built with GCC using these command-line settings:
make BMOPTFLAGS=-DBMAVX2OPT rebuild
or
make BMOPTFLAGS=-DBMSSE42OPT rebuild
It automatically applies the right set of compiler (GCC) flags for the target
build.
CMAKE
cd build
cmake -DBMOPTFLAGS:STRING=BMSSE42OPT ..
make
OR
cmake -DBMOPTFLAGS:STRING=BMAVX2OPT ..
=================================================================================
BM library supports the "restrict" keyword. Some compilers
(for example Intel C++) generate better
code (out-of-order load-stores) when the restrict keyword is used. This option is
turned OFF by default, since most C++ compilers do not support it.
To turn it ON, #define BM_HASRESTRICT in your project. Some compilers
use the "__restrict" keyword for this purpose. In that case, define the BMRESTRICT macro
to the correct keyword.
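
For readers unfamiliar with what a restrict qualifier buys, here is a small standalone sketch. It does not show BitMagic internals: SKETCH_RESTRICT and or_words are hypothetical names used only here, and the macro is deliberately not the library's BMRESTRICT. It assumes the non-standard "__restrict" spelling, which GCC, Clang and MSVC accept in C++.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical macro for this sketch only. "__restrict" is a compiler
// extension; on other compilers the qualifier is simply dropped.
#if defined(__GNUC__) || defined(_MSC_VER)
#  define SKETCH_RESTRICT __restrict
#else
#  define SKETCH_RESTRICT
#endif

// dst[i] |= src[i] for n words.
// The qualifier is a promise that dst and src never overlap, which lets the
// compiler reorder or vectorize loads from src across stores to dst.
// Calling this with overlapping buffers is undefined behavior.
inline void or_words(std::uint64_t* SKETCH_RESTRICT dst,
                     const std::uint64_t* SKETCH_RESTRICT src,
                     std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
        dst[i] |= src[i];
}
```

The result is the same with or without the qualifier. Only the generated code may differ, so any gain should be confirmed by measurement on your compiler.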
=================================================================================
If you want to use BM library in an STL-free project, you need to define
the BM_NO_STL macro. It disables inclusion of certain headers and also
makes bm::bvector<> iterators incompatible with STL algorithms
(which you said you are not using anyway).
This rule only applies to the core bm::bvector<> methods.
Auxiliary algorithms may still use STL.
=================================================================================
Thank you for using BitMagic library!
e-mail: [email protected]
WEB site: http://bitmagic.io