clean-up in preparation for a version 0.4 release
- add .gitignore file and build.sh and test.sh scripts (thanks hayguen)
- clear sample history on stretch_flush() as well as stretch_reset()
- update README and improve in-line code documentation
dbry committed Oct 21, 2022
1 parent 252db62 commit b4c0038
Showing 6 changed files with 189 additions and 39 deletions.
4 changes: 4 additions & 0 deletions .gitignore
@@ -0,0 +1,4 @@
output
audio-stretch
samples/*.wav

63 changes: 44 additions & 19 deletions README
@@ -1,7 +1,7 @@
////////////////////////////////////////////////////////////////////////////
// **** AUDIO-STRETCH **** //
// Time Domain Harmonic Scaler //
// Copyright (c) 2019 David Bryant //
// Copyright (c) 2022 David Bryant //
// All Rights Reserved. //
// Distributed under the BSD Software License (see license.txt) //
////////////////////////////////////////////////////////////////////////////
@@ -38,31 +38,56 @@ Note that unless ratios of exactly 0.5 or 2.0 are used with the -s option,
non-standard sampling rates will probably result. Many programs will still
properly play these files, and audio editing programs will likely import
them correctly (by resampling), but it is possible that some applications
will barf on them.

For version 0.2 a new option was added to cycle through the full possible
ratio range in a sinusoidal pattern, starting at 1.0, and either going
up (-c) or down (-cc) first. In this case any specified ratio is ignored
(except if the -s option is also specified to scale the sampling rate).
The total period is fixed at 2π seconds, at which point the output will
again be exactly aligned with the input.

To build the demo app:

$ gcc -O2 *.c -lm -o audio-stretch

The "help" display from the demo app:

AUDIO-STRETCH Time Domain Harmonic Scaling Demo Version 0.2
Copyright (c) 2019 David Bryant. All Rights Reserved.
will barf on them. They can also be resampled with the audio resampling
tool available here: https://github.com/dbry/audio-resampler

There's an option to cycle through the full possible ratio range in a
sinusoidal pattern, starting at 1.0, and either going up (-c) or down
(-cc) first. In this case any specified ratio is ignored (except if the
-s option is also specified to scale the sampling rate). The total period
is fixed at 2π seconds, at which point the output will again be exactly
aligned with the input.

*** Version 0.4 Enhancements ***

For version 0.4 two useful features were added. First, the ability to
cascade two instances of the stretcher was added. This is enabled by
including the flag STRETCH_DUAL_FLAG when initializing the stretcher
and allows double the stretch ratio of the regular code (i.e., now 0.25X
to 4.00X). Note that the audio quality degrades some when slowed beyond
2X, and generally voice becomes unintelligible when sped faster than 2X,
however these values may still be useful for some applications, and
specifically the very high speed values are useful for silence gaps
(see the next feature).

The other feature added is the ability to detect silence gaps in the
audio and apply a different (likely lower) stretch ratio to these areas.
This is currently not performed in the library itself, but in the demo
command-line program where it is highly configurable, but it should be
relatively easy to copy the functionality into another application. If
I get requests for it, I will consider moving it into the library.

There is a script to build the demo app on Linux (build.sh), and this also
allows building the app to test for UB (undefined behavior) and ASAN (bad
addressing). Also, some artificial test signals (both mono and stereo) and
a script (test.sh) for running them at various ratios have been added.

The current "help" display from the demo app:

AUDIO-STRETCH Time Domain Harmonic Scaling Demo Version 0.4
Copyright (c) 2022 David Bryant. All Rights Reserved.

Usage: AUDIO-STRETCH [-options] infile.wav outfile.wav

Options: -r<n.n> = stretch ratio (0.5 to 2.0, default = 1.0)
Options: -r<n.n> = stretch ratio (0.25 to 4.0, default = 1.0)
-g<n.n> = gap/silence stretch ratio (if different)
-u<n> = upper freq period limit (default = 333 Hz)
-l<n> = lower freq period limit (default = 55 Hz)
-b<n> = audio buffer/window length (ms, default = 25)
-t<n> = gap/silence threshold (dB re FS, default = -40)
-c = cycle through all ratios, starting higher
-cc = cycle through all ratios, starting lower
-d = force dual instance even for shallow ratios
-s = scale rate to preserve duration (not pitch)
-f = fast pitch detection (default >= 32 kHz)
-n = normal pitch detection (default < 32 kHz)
18 changes: 18 additions & 0 deletions build.sh
@@ -0,0 +1,18 @@
#!/bin/bash

if [ -z "$1" ] || [ "$1" = "rel" ]; then
  echo "building release .."
  gcc -Ofast main.c stretch.c -lm -o audio-stretch
elif [ "$1" = "dbg" ]; then
  echo "building debug .."
  gcc -O0 -g main.c stretch.c -lm -o audio-stretch
elif [ "$1" = "ubsan" ]; then
  echo "building debug with undefined behaviour sanitizer .."
  gcc -O0 -g main.c stretch.c -fsanitize=undefined -lm -o audio-stretch
elif [ "$1" = "asan" ]; then
  echo "building debug with address sanitizer .."
  gcc -O0 -g main.c stretch.c -fsanitize=address -lm -o audio-stretch
else
  echo "error: unknown option '$1'"
fi

42 changes: 27 additions & 15 deletions stretch.c
@@ -19,8 +19,6 @@
// and should contain approximately similar content.
// For independent channels, prefer using multiple StretchHandle-instances.
// see https://github.com/dbry/audio-stretch/issues/6
// Multiple instances, of course, will consume more CPU load.
// In addition, different output amounts need to be handled.


#include <stdio.h>
@@ -63,7 +61,12 @@ static int find_period (struct stretch_cnxt *cnxt, int16_t *samples);
* are specified here. The longest period determines the lowest fundamental frequency
* that can be handled correctly. Note that higher frequencies can be handled than the
* shortest period would suggest because multiple periods can be combined, and the
* worst-case performance will suffer if too short a period is selected.
* worst-case performance will suffer if too short a period is selected. The flags are:
*
* STRETCH_FAST_FLAG 0x1 Use the "fast" version of the period calculation
*
* STRETCH_DUAL_FLAG 0x2 Cascade two instances of the stretcher to expand
* available ratios to 0.25X to 4.00X
*/

StretchHandle stretch_init (int shortest_period, int longest_period, int num_channels, int flags)
@@ -131,8 +134,8 @@ void stretch_reset (StretchHandle handle)

/*
* Determine how many samples (per channel) should be reserved in 'output'-array
* for stretch_samples() and stretch_flush(). max_num_samples is the maximum for
* 'num_samples' when calling stretch_samples().
* for stretch_samples() and stretch_flush(). max_num_samples and max_ratio are the
* maximum values that will be passed to stretch_samples().
*/

int stretch_output_capacity (StretchHandle handle, int max_num_samples, float max_ratio)
@@ -165,12 +168,19 @@ int stretch_output_capacity (StretchHandle handle, int max_num_samples, float ma
}

/*
* Process the specified samples with the given ratio (which is clipped to the
* range 0.5 to 2.0). Note that the number of samples refers to total samples for
* both channels in stereo and can be as large as desired (samples are buffered
* here). The exact number of samples output is not possible to determine in
* advance, but the maximum will be the number of input samples times the ratio
* plus 3X the longest period (or 4X the longest period in "fast" mode).
* Process the specified samples with the given ratio (which is normally clipped to
* the range 0.5 to 2.0, or 0.25 to 4.00 for the "dual" mode). Note that in stereo
* the number of samples refers to the samples for one channel (i.e., not the total
* number of values passed) and can be as large as desired (samples are buffered here).
* The ratio may change between calls, but there is some latency to consider because
* audio is buffered here and a new ratio may be applied to previously sent samples.
*
* The exact number of samples output is not easy to determine in advance, so a function
* is provided (stretch_output_capacity()) that calculates the maximum number of samples
* that can be generated from a single call to this function (or stretch_flush()) given
* a number of samples and maximum ratio. It is recommended that stretch_output_capacity()
* be used after initialization to allocate in advance the buffer size required. Be sure
* to multiply the return value by the number of channels!
*/

int stretch_samples (StretchHandle handle, const int16_t *samples, int num_samples, int16_t *output, float ratio)
@@ -234,8 +244,6 @@ int stretch_samples (StretchHandle handle, const int16_t *samples, int num_sampl
else
period = cnxt->longest;

// printf ("%d\n", period / cnxt->num_chans);

/*
* Once we have calculated the best-match period, there are 4 possible transformations
* available to convert the input samples to output samples. Obviously we can simply
@@ -265,7 +273,7 @@ int stretch_samples (StretchHandle handle, const int16_t *samples, int num_sampl
if (ratio != 1.0)
cnxt->outsamples_error += (period * 2.0) - (period * 2.0 * ratio);
else
cnxt->outsamples_error = 0;
cnxt->outsamples_error = 0; /* if the ratio is 1.0, we can never cancel the error, so just do it now */

out_samples += period * 2;
cnxt->tail += period * 2;
@@ -345,7 +353,9 @@ int stretch_samples (StretchHandle handle, const int16_t *samples, int num_sampl

/*
* Flush any leftover samples out at normal speed. For cascaded dual instances this must be called
* twice to completely flush, or simply call it until it returns zero samples
* twice to completely flush, or simply call it until it returns zero samples. The maximum number
* of samples that can be returned from each call of this function can be determined in advance with
* stretch_output_capacity().
*/

int stretch_flush (StretchHandle handle, int16_t *output)
@@ -367,6 +377,8 @@ int stretch_flush (StretchHandle handle, int16_t *output)
}

cnxt->tail = cnxt->head;
memset (cnxt->inbuff, 0, cnxt->tail * sizeof (*cnxt->inbuff));

return samples_flushed;
}
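
The comments in this file say that stretch_output_capacity() reports samples per channel and that the caller must multiply by the channel count. A minimal sketch of that sizing step follows. The helper `alloc_output_buffer` is hypothetical and not part of the library; it only shows the arithmetic.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not part of stretch.c): allocate an output buffer from
 * the per-channel sample count reported by stretch_output_capacity().
 * Returns NULL if the allocation fails; the caller frees the buffer. */
static int16_t *alloc_output_buffer (int capacity_per_channel, int num_channels)
{
    assert (capacity_per_channel > 0 && num_channels > 0);

    /* total 16-bit values needed = per-channel samples * number of channels */
    return calloc ((size_t) capacity_per_channel * num_channels, sizeof (int16_t));
}

/* Intended use, assuming the declarations shown in this commit:
 *
 *   int capacity = stretch_output_capacity (handle, max_num_samples, max_ratio);
 *   int16_t *output = alloc_output_buffer (capacity, num_channels);
 *
 * The same buffer can then be passed to stretch_samples() and stretch_flush().
 */
```

Allocating once after initialization, with the largest block size and ratio the caller will ever pass, avoids resizing the buffer between calls.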

8 changes: 3 additions & 5 deletions stretch.h
@@ -1,7 +1,7 @@
////////////////////////////////////////////////////////////////////////////
// **** AUDIO-STRETCH **** //
// Time Domain Harmonic Scaler //
// Copyright (c) 2019 David Bryant //
// Copyright (c) 2022 David Bryant //
// All Rights Reserved. //
// Distributed under the BSD Software License (see license.txt) //
////////////////////////////////////////////////////////////////////////////
@@ -19,16 +19,14 @@
// and should contain approximately similar content.
// For independent channels, prefer using multiple StretchHandle-instances.
// see https://github.com/dbry/audio-stretch/issues/6
// Multiple instances, of course, will consume more CPU load.
// In addition, different output amounts need to be handled.

#ifndef STRETCH_H
#define STRETCH_H

#include <stdint.h>

#define STRETCH_FAST_FLAG 0x1
#define STRETCH_DUAL_FLAG 0x2
#define STRETCH_FAST_FLAG 0x1 // use "fast" version of period determination code
#define STRETCH_DUAL_FLAG 0x2 // cascade two instances (doubles usable ratio range)

#ifdef __cplusplus
extern "C" {
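
The effect of STRETCH_DUAL_FLAG on the usable ratio range (0.5 to 2.0 normally, 0.25 to 4.0 when cascaded) can be sketched as follows. The flag values are copied from the header above, but `clip_ratio` is a hypothetical helper for illustration. The library performs its own clipping internally, and this is not its code.

```c
#include <assert.h>

/* Flag values as defined in stretch.h in this commit. */
#define STRETCH_FAST_FLAG 0x1
#define STRETCH_DUAL_FLAG 0x2

/* Hypothetical helper: clip a requested ratio to the documented range
 * for the given initialization flags. */
static float clip_ratio (float ratio, int flags)
{
    float min_ratio = (flags & STRETCH_DUAL_FLAG) ? 0.25f : 0.5f;
    float max_ratio = (flags & STRETCH_DUAL_FLAG) ? 4.0f : 2.0f;

    assert (ratio == ratio);    /* reject NaN, which compares unequal to itself */

    if (ratio < min_ratio)
        return min_ratio;

    if (ratio > max_ratio)
        return max_ratio;

    return ratio;
}
```

A caller could use a check like this to decide whether a requested ratio needs the dual flag at all, since ratios between 0.5 and 2.0 are already covered by a single instance.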
93 changes: 93 additions & 0 deletions test.sh
@@ -0,0 +1,93 @@
#!/bin/bash

if [ ! -d output ]; then
echo "creating directory output"
mkdir output
fi

if [ ! -f samples/mono.wav ] || [ ! -f samples/stereo.wav ]; then
WVUNPACK=$(which wvunpack)
if [ -z "$WVUNPACK" ]; then
echo "please build/install WavPack with wvunpack to convert .wv samples to .wav"
exit 1
fi
$WVUNPACK samples/mono.wv
$WVUNPACK samples/stereo.wv
fi

STARTER=""
if [ "$1" = "gdb" ]; then
STARTER="gdb -q -ex run -ex quit --args"
shift
fi

EXAMPLE="mono"
if [ "$1" = "mono" ]; then
EXAMPLE="$1"
shift
fi
if [ "$1" = "stereo" ]; then
EXAMPLE="$1"
shift
fi


if [ -z "$1" ] && [ -z "$2" ]; then
echo "usage: $0 [mono|stereo] [f|n] [s|x]"
echo " 'f': fast pitch detection"
echo " 'n': normal pitch detection"
echo " 's': simple range for ratio: 0.5 .. 2.0"
echo " 'x': extended range for ratio: 0.25 .. 4.0"
echo ""
fi

if [ -z "$1" ] || [ "$1" = "f" ]; then
echo "testing with fast pitch detection"
FO="-f"
FN="f"
else
echo "testing with normal pitch detection"
FO="-n"
FN="n"
fi


if [ -z "$2" ] || [ "$2" = "s" ]; then
echo ""
echo "testing normal range 0.5 .. 2.0"
echo "x2.0"
$STARTER ./audio-stretch -q -y $FO -r0.5 samples/${EXAMPLE}.wav output/out_${EXAMPLE}_${FN}_r050_x200.wav
echo "x1.75"
$STARTER ./audio-stretch -q -y $FO -r0.571 samples/${EXAMPLE}.wav output/out_${EXAMPLE}_${FN}_r057_x175.wav
echo "x1.5"
$STARTER ./audio-stretch -q -y $FO -r0.666 samples/${EXAMPLE}.wav output/out_${EXAMPLE}_${FN}_r066_x150.wav
echo "x1.25"
$STARTER ./audio-stretch -q -y $FO -r0.8 samples/${EXAMPLE}.wav output/out_${EXAMPLE}_${FN}_r080_x125.wav
echo "x1.0"
$STARTER ./audio-stretch -q -y $FO -r1.0 samples/${EXAMPLE}.wav output/out_${EXAMPLE}_${FN}_r100_x100.wav
echo "x0.75"
$STARTER ./audio-stretch -q -y $FO -r1.333 samples/${EXAMPLE}.wav output/out_${EXAMPLE}_${FN}_r133_x075.wav
echo "x0.5"
$STARTER ./audio-stretch -q -y $FO -r2.0 samples/${EXAMPLE}.wav output/out_${EXAMPLE}_${FN}_r200_x050.wav
fi
if [ -z "$2" ] || [ "$2" = "x" ]; then
echo ""
echo "testing extended range 0.25 .. 0.5 and 2.0 .. 4.0"
echo "x4.0"
$STARTER ./audio-stretch -q -y $FO -r0.25 samples/${EXAMPLE}.wav output/out_${EXAMPLE}_${FN}_r025_x400.wav
echo "x3.5"
$STARTER ./audio-stretch -q -y $FO -r0.285 samples/${EXAMPLE}.wav output/out_${EXAMPLE}_${FN}_r028_x350.wav
echo "x3.0"
$STARTER ./audio-stretch -q -y $FO -r0.333 samples/${EXAMPLE}.wav output/out_${EXAMPLE}_${FN}_r033_x300.wav
echo "x2.5"
$STARTER ./audio-stretch -q -y $FO -r0.4 samples/${EXAMPLE}.wav output/out_${EXAMPLE}_${FN}_r040_x250.wav
echo "x0.4"
$STARTER ./audio-stretch -q -y $FO -r2.5 samples/${EXAMPLE}.wav output/out_${EXAMPLE}_${FN}_r250_x040.wav
echo "x0.333"
$STARTER ./audio-stretch -q -y $FO -r3.0 samples/${EXAMPLE}.wav output/out_${EXAMPLE}_${FN}_r300_x033.wav
echo "x0.285"
$STARTER ./audio-stretch -q -y $FO -r3.5 samples/${EXAMPLE}.wav output/out_${EXAMPLE}_${FN}_r350_x028.wav
echo "x0.25"
$STARTER ./audio-stretch -q -y $FO -r4.0 samples/${EXAMPLE}.wav output/out_${EXAMPLE}_${FN}_r400_x025.wav
fi
