Add Java implementation #8

Merged
merged 6 commits into from
Mar 9, 2024
16 changes: 16 additions & 0 deletions .github/workflows/check.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@ on:

env:
PYTHON_VERSION: 3.x
JAVA_VERSION: '11'

permissions:
contents: read
Expand Down Expand Up @@ -57,6 +58,21 @@ jobs:
- name: test
run: cd go && make mod deps linter test GOPATH=$(go env GOPATH)

java:
runs-on: ubuntu-latest
steps:
- name: checkout repository
uses: actions/checkout@v3
- name: setup java build environment
uses: actions/setup-java@v2
with:
distribution: 'adopt'
java-version: ${{ env.JAVA_VERSION }}
- name: set RELEASE number
run: echo ${GITHUB_RUN_NUMBER} > RELEASE
- name: test
run: cd java && make build test

python:
runs-on: ubuntu-latest
steps:
Expand Down
8 changes: 5 additions & 3 deletions LICENSE
Expand Up @@ -2,11 +2,13 @@ The code in this project is a C port of the Fingerprint64
(farmhashna::Hash64) code from Google's FarmHash
(https://github.com/google/farmhash).

This code has been ported/translated by Nicola Asuni to header-only C code.
This code has been ported/translated by Nicola Asuni to multiple languages.

The original code is released under the MIT License:
MIT License:

Copyright (c) 2014 Google, Inc.
- Copyright (c) 2014 Google, Inc.
- Copyright (c) 2014 Damian Gryski (original GO version)
- Copyright (c) 2016-2024 Nicola Asuni (versions in CGO, GO, Java, Python, Rust)

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
Expand Down
9 changes: 8 additions & 1 deletion Makefile
Expand Up @@ -51,14 +51,15 @@ help:
@echo " make c : Build and test the C version"
@echo " make cgo : Build and test the GO C-wrapper version"
@echo " make go : Build and test the GO version"
@echo " make java : Build and test the Java version"
@echo " make python : Build and test the Python version"
@echo " make rust : Build and test the Rust version"
@echo " make clean : Remove any build artifact"
@echo " make dbuild : Build everything inside a Docker container"
@echo " make tag : Tag the Git repository"
@echo ""

all: clean c cgo go python rust
all: clean c cgo go java python rust

# Build and test the C version
.PHONY: c
Expand All @@ -75,6 +76,11 @@ cgo:
go:
cd go && make all

# Build and test the Java version
.PHONY: java
java:
cd java && make all

# Build and test the Python version
.PHONY: python
python:
Expand All @@ -92,6 +98,7 @@ clean:
cd c && make clean
cd cgo && make clean
cd go && make clean
cd java && make clean
cd python && make clean
cd rust && make clean
@mkdir -p $(TARGETDIR)
Expand Down
35 changes: 12 additions & 23 deletions README.md
@@ -1,6 +1,6 @@
# FarmHash64

*Provides farmhash64, a portable C 64-bit hash function*
*Provides farmhash64 and farmhash32 hash functions in multiple languages*

[![Donate via PayPal](https://img.shields.io/badge/donate-paypal-87ceeb.svg)](https://www.paypal.com/cgi-bin/webscr?cmd=_donations&currency_code=GBP&[email protected]&item_name=donation%20for%20farmhash64%20project)
*Please consider supporting this project by making a donation via [PayPal](https://www.paypal.com/cgi-bin/webscr?cmd=_donations&currency_code=GBP&[email protected]&item_name=donation%20for%20farmhash64%20project)*
Expand All @@ -19,17 +19,22 @@

FarmHash is a family of hash functions.

This is a C translation of the Fingerprint64 (farmhashna::Hash64) code from Google's FarmHash
(https://github.com/google/farmhash).
FarmHash64 is a 64-bit fingerprint hash function that produces a hash value for a given string.
It is designed to be fast and provide good hash distribution but is not suitable for cryptography applications.

FarmHash64 provides a portable 64-bit hash function for strings (byte array).
The function mix the input bits thoroughly but is not suitable for cryptography.
The FarmHash32 function is also provided, which returns a 32-bit fingerprint hash for a string.

All members of the FarmHash family were designed with heavy reliance on previous work by Jyrki Alakuijala, Austin Appleby, Bob Jenkins, and others.
This is a Java port of the Fingerprint64 (farmhashna::Hash64) code from Google's FarmHash (https://github.com/google/farmhash).

For more information please consult https://github.com/google/farmhash

This code has been ported/translated by Nicola Asuni (Tecnick.com) to multiple languages:

- C (header-only)
- CGO
- GO
- Java
- Python
- Rust

## Getting Started

Expand All @@ -44,19 +49,3 @@ make help
```

use the command ```make all``` to build and test all the implementations.


### Python Usage Example

```
# copy this code in the same directory of farmhash64 library

import farmhash64 as vh

print('\nUSAGE EXAMPLE:\n')

vhash = vh.farmhash64("Lorem ipsum dolor sit amet")
print('vh.farmhash64("Lorem ipsum dolor sit amet")')
print("Variant Hash (DEC): %d" % vhash)
print("Variant Hash (HEX): %x\n" % vhash)
```
2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
1.5.1
1.6.0
2 changes: 1 addition & 1 deletion c/doc/Doxyfile
Expand Up @@ -38,7 +38,7 @@ PROJECT_NAME = FarmHash64
# could be handy for archiving the generated documentation or if some version
# control system is used.

PROJECT_NUMBER = 1.5.1
PROJECT_NUMBER = 1.6.0

# Using the PROJECT_BRIEF tag one can provide an optional one line description
# for a project that appears at the top of each page and should give viewer a
Expand Down
77 changes: 10 additions & 67 deletions c/src/farmhash64.h
Expand Up @@ -4,44 +4,15 @@
*
* FarmHash is a family of hash functions.
*
* FarmHash64 provides a portable 64-bit hash function for strings (byte array).
* The function mix the input bits thoroughly but is not suitable for cryptography.
* FarmHash64 is a 64-bit fingerprint hash function that produces a hash value for a given string.
* It is designed to be fast and provide good hash distribution but is not suitable for cryptography applications.
*
* All members of the FarmHash family were designed with heavy reliance on previous work by Jyrki Alakuijala, Austin Appleby, Bob Jenkins, and others.
* For more information please consult https://github.com/google/farmhash
*
* This is a C port of the Fingerprint64 (farmhashna::Hash64) code
* from Google's FarmHash (https://github.com/google/farmhash).
*
* This code has been ported/translated by Nicola Asuni to header-only C code.
*
* The public functions are:
* - farmhash64: Returns a 64-bit fingerprint hash for a byte array.
* - farmhash32: Returns a 32-bit fingerprint hash for a byte array.
*
* The original C++ code is released under the MIT License:
*
* The MIT License (MIT)
*
* Copyright (c) 2014 Google, Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
* The FarmHash32 function is also provided, which returns a 32-bit fingerprint hash for a string.
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
* All members of the FarmHash family were designed with heavy reliance on previous work by Jyrki Alakuijala, Austin Appleby, Bob Jenkins, and others.
* This is a C port of the Fingerprint64 (farmhashna::Hash64) code from Google's FarmHash (https://github.com/google/farmhash).
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
* AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
* This code has been ported/translated by Nicola Asuni (Tecnick.com) to header-only C code.
*/

#ifndef FARMHASH64_H
Expand Down Expand Up @@ -445,34 +416,6 @@ static const uint64_t k2 = 0x9ae16a3b2f90404fULL;
static const uint32_t c1 = 0xcc9e2d51;
static const uint32_t c2 = 0x1b873593;

/**
* @brief Get the low 64 bits of a uint128_t value.
*
* @param x uint128_t value
*
* @return The low 64 bits of x
*
* @private
*/
STATIC_INLINE uint64_t uint128_t_low64(const uint128_t x)
{
return x.lo;
}

/**
* @brief Get the high 64 bits of a uint128_t value.
*
* @param x uint128_t value
*
* @return The high 64 bits of x
*
* @private
*/
STATIC_INLINE uint64_t uint128_t_high64(const uint128_t x)
{
return x.hi;
}

/**
* @brief Create a uint128_t value from two 64-bit integers.
*
Expand Down Expand Up @@ -506,9 +449,9 @@ STATIC_INLINE uint64_t farmhash128_to_64(uint128_t x)
{
// Murmur-inspired hashing.
const uint64_t k_mul = 0x9ddfea08eb382d69ULL;
uint64_t a = (uint128_t_low64(x) ^ uint128_t_high64(x)) * k_mul;
uint64_t a = (x.lo ^ x.hi) * k_mul;
a ^= (a >> 47);
uint64_t b = (uint128_t_high64(x) ^ a) * k_mul;
uint64_t b = (x.hi ^ a) * k_mul;
b ^= (b >> 47);
b *= k_mul;
return b;
Expand Down Expand Up @@ -539,11 +482,11 @@ STATIC_INLINE uint64_t fetch64(const char* p)
*
* @private
*/
STATIC_INLINE uint32_t fetch32(const char* p)
STATIC_INLINE uint64_t fetch32(const char* p)
{
uint32_t result;
memcpy(&result, p, sizeof(result));
return uint32_t_in_expected_order(result);
return uint64_t_in_expected_order(result);
}

/**
Expand Down
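
The `farmhash128_to_64` hunk above replaces the removed `uint128_t_low64`/`uint128_t_high64` accessors with direct field access. As a rough illustration only, the same mixing step can be sketched in Go; the `uint128` struct with `lo`/`hi` fields and the name `farmhash128To64` are assumptions made for this example, not the identifiers used in the repository's Go code.

```go
package main

import "fmt"

// uint128 is a hypothetical stand-in for the C uint128_t struct in the diff.
type uint128 struct {
	lo, hi uint64
}

// farmhash128To64 mirrors the Murmur-inspired mixing shown in the C hunk:
// xor the halves, multiply, fold the high bits down, and repeat.
func farmhash128To64(x uint128) uint64 {
	const kMul uint64 = 0x9ddfea08eb382d69
	a := (x.lo ^ x.hi) * kMul // uint64 multiplication wraps modulo 2^64
	a ^= a >> 47
	b := (x.hi ^ a) * kMul
	b ^= b >> 47
	b *= kMul
	return b
}

func main() {
	fmt.Printf("%#x\n", farmhash128To64(uint128{lo: 1, hi: 0}))
	fmt.Printf("%#x\n", farmhash128To64(uint128{lo: 0, hi: 1}))
}
```

Reading the fields directly keeps the arithmetic identical to the accessor-based version; only the indirection is gone.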
18 changes: 7 additions & 11 deletions cgo/src/farmhash64.go
@@ -1,21 +1,17 @@
/*
Package farmhash64 implements the FarmHash64 hash functions for strings.
Package farmhash64 implements the FarmHash64 and FarmHash32 hash functions for strings.

FarmHash is a family of hash functions.

FarmHash64 is a 64-bit fingerprint hash function that produces a hash value for a given string.
It is designed to be fast and provide good hash distribution.
It is designed to be fast and provide good hash distribution but is not suitable for cryptography applications.

The FarmHash32 function is also provided, which returns a 32-bit fingerprint hash for a string.

Usage:

To use the FarmHash64 function, pass a byte slice representing the string to be hashed.
The function returns a uint64 value representing the hash.

Note:
The package uses cgo to interface with the C implementation of FarmHash64.
All members of the FarmHash family were designed with heavy reliance on previous work by Jyrki Alakuijala, Austin Appleby, Bob Jenkins, and others.
This is a CGO port of the Fingerprint64 (farmhashna::Hash64) code from Google's FarmHash (https://github.com/google/farmhash).

For more information about FarmHash64, refer to the original C implementation:
https://github.com/google/farmhash
This code has been ported/translated by Nicola Asuni (Tecnick.com) to CGO code.
*/
package farmhash64

Expand Down
61 changes: 27 additions & 34 deletions go/src/farmhash64.go
@@ -1,33 +1,17 @@
/*
Package farmhash64 implements the FarmHash64 hash function.

The code in this file is an extract from:
https://github.com/dgryski/go-farm/commits/master

That is a golang translation of the Google's C++ code:
https://github.com/google/farmhash

- Copyright (c) 2014 Google, Inc.
- Copyright (c) 2014 Damian Gryski
- Copyright (c) 2016-2024 Nicola Asuni

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
Package farmhash64 implements the FarmHash64 and FarmHash32 hash functions for strings.

FarmHash is a family of hash functions.

FarmHash64 is a 64-bit fingerprint hash function that produces a hash value for a given string.
It is designed to be fast and provide good hash distribution but is not suitable for cryptography applications.

The FarmHash32 function is also provided, which returns a 32-bit fingerprint hash for a string.

All members of the FarmHash family were designed with heavy reliance on previous work by Jyrki Alakuijala, Austin Appleby, Bob Jenkins, and others.
This is a GO port of the Fingerprint64 (farmhashna::Hash64) code from Google's FarmHash (https://github.com/google/farmhash).

This code has been ported/translated by Nicola Asuni (Tecnick.com) to GO code.
*/
package farmhash64

Expand All @@ -54,15 +38,23 @@ type uint128 struct {
// PLATFORM

func rotate32(val uint32, shift uint) uint32 {
if shift == 0 {
return val
}

return ((val >> shift) | (val << (32 - shift)))
}

func rotate64(val uint64, shift uint) uint64 {
if shift == 0 {
return val
}

return ((val >> shift) | (val << (64 - shift)))
}

func fetch32(s []byte, idx int) uint32 {
return uint32(s[idx+0]) | uint32(s[idx+1])<<8 | uint32(s[idx+2])<<16 | uint32(s[idx+3])<<24
func fetch32(s []byte, idx int) uint64 {
return uint64(s[idx+0]) | uint64(s[idx+1])<<8 | uint64(s[idx+2])<<16 | uint64(s[idx+3])<<24
}

func fetch64(s []byte, idx int) uint64 {
Expand Down Expand Up @@ -120,7 +112,7 @@ func hashLen0to16(s []byte) uint64 {
mul := k2 + slen*2
a := fetch32(s, 0)

return hashLen16Mul(slen+(uint64(a)<<3), uint64(fetch32(s, int(slen-4))), mul)
return hashLen16Mul(slen+(a<<3), fetch32(s, int(slen-4)), mul)
}

if slen > 0 {
Expand Down Expand Up @@ -196,8 +188,6 @@ func hashLen33to64(s []byte) uint64 {
func FarmHash64(s []byte) uint64 {
slen := len(s)

var seed uint64 = 81

if slen <= 16 {
return hashLen0to16(s)
}
Expand All @@ -210,8 +200,11 @@ func FarmHash64(s []byte) uint64 {
return hashLen33to64(s)
}

var seed uint64 = 81

// For strings over 64 bytes we loop.
// Internal state consists of 56 bytes: v, w, x, y, and z.

v := uint128{0, 0}
w := uint128{0, 0}
x := seed*k2 + fetch64(s, 0)
Expand Down