
7 · delay — echoes, feedback & the ring buffer

Delay is the basis of at least half the audio effects you hear — echo, chorus, flanger, and even reverb are all built on it. The whole idea: mix a signal with a delayed copy of itself. The only real machinery is a buffer that remembers the past — a ring buffer.

The original runs in real time under JACK; offline we apply the same difference equations over a whole buffer. Identical DSP.

A single reflection: you hear the direct (dry) sound, then a quieter, delayed copy bounced off a wall. As a difference equation, with delay $D$ samples and gain $g$:

$$y[n] = x[n] + g\,x[n-D]$$

The naive code reads the input $D$ samples ago directly. Offline, where we hold the entire input, that translates almost literally:

const delay_samples: usize = 2000; // delay in samples (~45 ms at 44.1 kHz)
const gain: f32 = 0.5;
fn feedforward(in: []const f32, out: []f32) void {
    for (out, 0..) |*o, i| {
        const echoed: f32 = if (i >= delay_samples) in[i - delay_samples] else 0.0;
        o.* = in[i] + gain * echoed;
    }
}

Math note: it is a comb filter. $y[n] = x[n] + g\,x[n-D]$ adds a signal to a delayed copy of itself. At frequencies where the copy lines up in phase, the two reinforce; where it is out of phase, they cancel, producing a row of evenly spaced notches across the spectrum that looks like a comb. (This is a finite impulse response, or FIR, filter: the output depends only on the current and past inputs, never on past outputs.)
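To see where the notches fall, a short worked derivation may help. It assumes $g > 0$ and writes $\omega$ for frequency in radians per sample and $f_s$ for the sample rate; neither symbol appears in the original chapter. Feeding a complex sinusoid $e^{j\omega n}$ through the difference equation gives

$$H(e^{j\omega}) = 1 + g\,e^{-j\omega D}, \qquad |H(e^{j\omega})|^2 = 1 + g^2 + 2g\cos(\omega D)$$

The magnitude peaks at $1 + g$ where $\omega D = 2\pi k$ and dips to $1 - g$ where $\omega D = (2k+1)\pi$:

$$f_{\text{peak}} = k\,\frac{f_s}{D}, \qquad f_{\text{notch}} = \left(k + \tfrac{1}{2}\right)\frac{f_s}{D}, \qquad k = 0, 1, 2, \dots$$

With the values used in the code above ($D = 2000$, $f_s = 44100$ Hz, $g = 0.5$) the teeth sit $44100 / 2000 = 22.05$ Hz apart, the first notch falls at about 11 Hz, and the gain swings between 1.5 (about +3.5 dB) and 0.5 (about -6 dB). The notches only reach full silence when $g = 1$.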

Zig note: the start guard. if (i >= delay_samples) in[i - delay_samples] else 0.0 avoids indexing before the start of the buffer. Without it, i - delay_samples would underflow the unsigned usize, which Zig treats as a panic in safety-checked builds and as undefined behavior in builds with safety checks off. In real time you cannot reach back into a previous block at all, which is exactly why we need a ring buffer.

In real time each block only holds nframes samples, so to look D samples back we keep our own history in a fixed array whose index wraps around like a clock. Track one write pointer wp; the read pointer is always D behind it.

In Zig, wrapped in a struct that works the same offline or (conceptually) in real time:

const MAX_DELAY: usize = 1 << 17;
const Delay = struct {
    buf: [MAX_DELAY]f32 = [_]f32{0} ** MAX_DELAY, // start silent
    wp: usize = 0,
    fn read(self: *Delay, d: usize) f32 {
        // d samples behind the write pointer, wrapped
        const rp = (self.wp + MAX_DELAY - d) % MAX_DELAY;
        return self.buf[rp];
    }
    fn write(self: *Delay, x: f32) void {
        self.buf[self.wp] = x;
        self.wp = (self.wp + 1) % MAX_DELAY;
    }
};

Zig note: [_]f32{0} ** MAX_DELAY. This builds an array of MAX_DELAY zeros at compile time (** repeats an array literal). Starting the buffer at zero matters: read uninitialized memory and you get a burst of noise on the first pass. The initializer is the Zig counterpart of the calloc/memset reminder in the C original. Adding MAX_DELAY before taking % keeps the unsigned usize arithmetic from underflowing, provided d is no larger than MAX_DELAY.
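As a quick check of the wrap-around arithmetic, here is a minimal sketch using a deliberately tiny buffer. Ring is a hypothetical stand-in for the chapter's Delay with the capacity turned into a comptime parameter; the read and write bodies are otherwise the same, apart from an added assert on d.

```zig
const std = @import("std");

// Hypothetical miniature of the chapter's Delay: the capacity N is a comptime
// parameter so the wrap-around shows up after only a few writes.
fn Ring(comptime N: usize) type {
    return struct {
        const Self = @This();

        buf: [N]f32 = [_]f32{0} ** N, // start silent
        wp: usize = 0,

        fn read(self: *const Self, d: usize) f32 {
            std.debug.assert(d <= N); // a larger d would underflow below
            return self.buf[(self.wp + N - d) % N];
        }

        fn write(self: *Self, x: f32) void {
            self.buf[self.wp] = x;
            self.wp = (self.wp + 1) % N;
        }
    };
}

test "write pointer wraps and read looks d samples back" {
    var r: Ring(4) = .{};
    // Write 1..6 into 4 slots: 5 and 6 overwrite 1 and 2.
    var i: usize = 1;
    while (i <= 6) : (i += 1) r.write(@floatFromInt(i));

    try std.testing.expectEqual(@as(usize, 2), r.wp); // 6 % 4
    try std.testing.expectEqual(@as(f32, 6), r.read(1)); // newest sample
    try std.testing.expectEqual(@as(f32, 3), r.read(4)); // oldest survivor
}
```

Run it with zig test. One thing the sketch makes visible: read(0) returns the same slot as read(N), because the slot under the write pointer holds the oldest sample, not the newest.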

Changing gain or delay abruptly clicks. Gain is just a volume, so smooth it with the one-pole from mix. Delay time needs smoothing too — but first it must become a float, which means reading the buffer at a fractional position:

fn readFrac(self: *Delay, d: f32) f32 {
    const pos = @as(f32, @floatFromInt(self.wp + MAX_DELAY)) - d;
    const ipos: usize = @intFromFloat(pos);
    const fr = pos - @as(f32, @floatFromInt(ipos));
    const x0 = self.buf[ipos % MAX_DELAY];
    const x1 = self.buf[(ipos + 1) % MAX_DELAY];
    return (1.0 - fr) * x0 + fr * x1; // linear interpolation
}
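The smoothing step mentioned before readFrac could look like the sketch below. The Smoother type, its field names, and the coefficient value are illustrative assumptions; the one-pole in the mix chapter may be written differently. The idea is to move a fraction of the remaining distance toward the target on every sample, then pass the smoothed value to readFrac as d.

```zig
const std = @import("std");

// Hypothetical one-pole smoother. The update rule is an assumption for
// illustration, not necessarily the form used in the mix chapter.
const Smoother = struct {
    value: f32,
    coeff: f32, // 0 < coeff <= 1; smaller values glide more slowly

    fn next(self: *Smoother, target: f32) f32 {
        // Close a fixed fraction of the remaining gap each sample.
        self.value += self.coeff * (target - self.value);
        return self.value;
    }
};

test "delay time glides toward a new target instead of jumping" {
    var s = Smoother{ .value = 2000.0, .coeff = 0.001 };

    // The knob moves from 2000 to 1000 samples: the first step is small.
    const first = s.next(1000.0);
    try std.testing.expect(first < 2000.0 and first > 1998.0);

    // After enough samples the smoothed value settles near the target.
    var n: usize = 0;
    while (n < 20000) : (n += 1) _ = s.next(1000.0);
    try std.testing.expectApproxEqAbs(@as(f32, 1000.0), s.value, 0.5);
}
```

In a per-sample loop this would be used as something like readFrac(smoother.next(target_delay)), so the delay time never jumps between two samples.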

Math note: why fractional, and the pitch shift. A signal can be delayed by any real amount, not just whole samples, so to glide the delay time smoothly we read between stored samples and blend them, using the same (1-fr)*x0 + fr*x1 interpolation as the wavetable in chapter 4. A side effect: while the delay time is changing, the read pointer moves at a different rate than the write pointer, which shifts the pitch. Shortening the delay speeds the read pointer up and raises the pitch; lengthening it lowers the pitch. This is the classic “warble” of tape and bucket-brigade delays. Lovely or annoying, depending on taste.
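A rough way to put a number on the warble, assuming a pure sine input of angular frequency $\omega$ and a smoothly varying delay time $\tau(t)$ in seconds (both symbols are introduced here for illustration only):

$$y(t) = \sin\!\big(\omega\,(t - \tau(t))\big) \quad\Longrightarrow\quad \omega_{\text{out}}(t) = \frac{d}{dt}\Big[\omega\,(t - \tau(t))\Big] = \omega\,\big(1 - \tau'(t)\big)$$

So the pitch ratio is $1 - \tau'(t)$. Gliding the delay from 20 ms down to 10 ms over one second gives $\tau' = -0.01$, a ratio of 1.01, or about $1200\log_2 1.01 \approx 17$ cents sharp. Gliding back up over the same time flattens the pitch by nearly the same amount, and the shift disappears as soon as the delay time stops moving.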

Two reflecting surfaces instead of one: the sound bounces back and forth, each pass quieter, giving a train of decaying echoes. The change is famously one line: store the output instead of the input. As difference equations, that is the whole difference:

$$\text{feedforward: } y[n] = x[n] + g\,x[n-D] \qquad\qquad \text{feedback: } y[n] = x[n] + g\,y[n-D]$$

// feedforward: a single echo
fn echoOnce(self: *Delay, x: f32, d: usize, g: f32) f32 {
    const y = x + g * self.read(d);
    self.write(x); // store the INPUT
    return y;
}
// feedback: repeating, decaying echoes
fn echoFeedback(self: *Delay, x: f32, d: usize, g: f32) f32 {
    const y = x + g * self.read(d);
    self.write(y); // store the OUTPUT: the one-line change
    return y;
}

Math note: why feedback repeats, and the stability rule. Because the output is fed back in, an impulse comes out as $1, g, g^2, g^3, \dots$ spaced $D$ samples apart, a geometric series. If $|g| < 1$ the echoes shrink and die away; if $|g| = 1$ they repeat forever at full level; if $|g| > 1$ they grow without bound and the signal explodes. So keep the feedback gain below 1.0 in magnitude. (This is an infinite impulse response, or IIR, comb filter: the output depends on past outputs.) A reverb is essentially several of these, tuned and combined.
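The geometric series is easy to confirm numerically. The sketch below feeds a unit impulse through the feedback echo and checks the amplitudes. MiniDelay is a hypothetical 16-slot stand-in for the chapter's Delay, chosen only so the trace stays short; echoFeedback has the same body as above.

```zig
const std = @import("std");

// Hypothetical 16-slot stand-in for the chapter's Delay, small enough to
// trace by hand.
const N: usize = 16;
const MiniDelay = struct {
    buf: [N]f32 = [_]f32{0} ** N,
    wp: usize = 0,

    fn read(self: *const MiniDelay, d: usize) f32 {
        return self.buf[(self.wp + N - d) % N];
    }

    fn write(self: *MiniDelay, x: f32) void {
        self.buf[self.wp] = x;
        self.wp = (self.wp + 1) % N;
    }

    // Same body as the chapter's echoFeedback.
    fn echoFeedback(self: *MiniDelay, x: f32, d: usize, g: f32) f32 {
        const y = x + g * self.read(d);
        self.write(y); // store the OUTPUT
        return y;
    }
};

test "feedback comb impulse response is a geometric series" {
    var dl: MiniDelay = .{};
    const d: usize = 3;
    const g: f32 = 0.5;

    var out: [10]f32 = undefined;
    for (&out, 0..) |*o, n| {
        const x: f32 = if (n == 0) 1.0 else 0.0; // unit impulse
        o.* = dl.echoFeedback(x, d, g);
    }

    // Echoes land every d samples with amplitudes 1, g, g^2, g^3.
    try std.testing.expectEqual(@as(f32, 1.0), out[0]);
    try std.testing.expectEqual(@as(f32, 0.5), out[3]);
    try std.testing.expectEqual(@as(f32, 0.25), out[6]);
    try std.testing.expectEqual(@as(f32, 0.125), out[9]);
    try std.testing.expectEqual(@as(f32, 0.0), out[4]); // silent in between
}
```

With g = 0.5 each echo is about 6 dB quieter than the last, so roughly ten repeats take the tail 60 dB down. Raising g toward 1 stretches that tail out quickly.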

A “mix” knob blends the untouched (dry) signal with the processed (wet) one. With a single drywet in $[0,1]$:

fn mixDryWet(dry_sig: f32, wet_sig: f32, knob: f32) f32 {
    const dry = 1.0 - knob; // knob = 0 → all dry
    const wet = knob; // knob = 1 → all wet
    return dry * dry_sig + wet * wet_sig;
}

Math note: the mix is a crossfade. (1-k)·dry + k·wet is the same weighted average we used for sample interpolation and table crossfading, here fading between two whole signals. At k = 0.5 you get equal parts. (For feedback, note that the dry/wet blend changes only what you hear, not what gets written back into the buffer; otherwise you would alter the feedback path itself.)
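One way to keep the knob out of the feedback path is sketched below. echoFeedbackMixed and MiniDelay are hypothetical names, and treating the full feedback output as the "wet" signal is an assumption (some designs use only the delayed part). The point is the order of operations: write to the buffer first, blend afterwards.

```zig
const std = @import("std");

const N: usize = 16; // hypothetical tiny capacity, for illustration only
const MiniDelay = struct {
    buf: [N]f32 = [_]f32{0} ** N,
    wp: usize = 0,

    fn read(self: *const MiniDelay, d: usize) f32 {
        return self.buf[(self.wp + N - d) % N];
    }

    fn write(self: *MiniDelay, x: f32) void {
        self.buf[self.wp] = x;
        self.wp = (self.wp + 1) % N;
    }
};

fn mixDryWet(dry_sig: f32, wet_sig: f32, knob: f32) f32 {
    return (1.0 - knob) * dry_sig + knob * wet_sig;
}

// Feedback echo with a mix knob. The buffer always receives the full
// feedback signal; the knob only shapes the returned sample.
fn echoFeedbackMixed(dl: *MiniDelay, x: f32, d: usize, g: f32, knob: f32) f32 {
    const wet = x + g * dl.read(d);
    dl.write(wet); // feedback path is independent of the knob
    return mixDryWet(x, wet, knob);
}

test "the mix knob leaves the feedback path untouched" {
    var a: MiniDelay = .{};
    var b: MiniDelay = .{};
    for (0..12) |n| {
        const x: f32 = if (n == 0) 1.0 else 0.0;
        const dry_out = echoFeedbackMixed(&a, x, 3, 0.5, 0.0);
        _ = echoFeedbackMixed(&b, x, 3, 0.5, 1.0);
        try std.testing.expectEqual(x, dry_out); // knob = 0: input passes through
    }
    // Both buffers hold the same stored echoes regardless of the knob.
    try std.testing.expectEqualSlices(f32, &a.buf, &b.buf);
}
```

Because the buffer contents match at both knob extremes, turning the knob mid-performance changes the balance you hear without disturbing echoes already in flight.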

You now have the core of nearly every time/space effect:

  • Delay line = memory of the past (the ring buffer).
  • Feedback = output routed back to input → echo, and the seed of reverb.
  • Modulated fractional delay = chorus, flanger, vibrato (drive the delay time with a slow LFO).
  • Many delays + feedback + filtering = reverb.

Feedforward and feedback delays are also called comb filters because their frequency responses look like a comb — the bridge to filters and reverb.

  1. Render a short plucked note (saw + ADSR from chapters 2 & 5), then run the whole buffer through echoFeedback with g = 0.3, 0.6, 0.85. Allocate ~2 s of trailing silence so the tail can ring out.
  2. Tempo-sync the delay: at 120 BPM a quarter note is 0.5 s; set d = @intFromFloat(0.5 * sr) so echoes land on the beat.
  3. Make a chorus: a short delay (~15 ms) whose time is modulated by a 0.5 Hz Phasor (chapter 3) sine, read with readFrac, mixed ~50/50. Then a flanger: shorten the delay (~3 ms), add feedback, speed the LFO.
  4. Replace readFrac with a nearest-sample read and sweep the delay — hear the zipper noise that interpolation removes.
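For exercises 2 and 3, the delay-time arithmetic can be sketched as follows. Both helper names are hypothetical, and the sine LFO is computed directly from the sample index rather than with the chapter 3 Phasor, purely to keep the snippet self-contained.

```zig
const std = @import("std");

// Delay length in samples for one beat (a quarter note in 4/4) at a tempo.
fn beatDelaySamples(bpm: f32, sr: f32) usize {
    return @intFromFloat(@round(60.0 / bpm * sr));
}

// Hypothetical chorus helper: delay time in samples at sample index n,
// swinging +/- depth_ms around base_ms at lfo_hz. The result is fractional,
// so it is meant to be passed to readFrac.
fn chorusDelaySamples(n: usize, sr: f32, base_ms: f32, depth_ms: f32, lfo_hz: f32) f32 {
    const t = @as(f32, @floatFromInt(n)) / sr; // time in seconds
    const lfo = @sin(2.0 * std.math.pi * lfo_hz * t);
    return (base_ms + depth_ms * lfo) * 0.001 * sr;
}

test "delay-time helpers" {
    // 120 BPM at 44.1 kHz: one beat is 0.5 s = 22050 samples.
    try std.testing.expectEqual(@as(usize, 22050), beatDelaySamples(120.0, 44100.0));
    // LFO at zero phase: the chorus sits at its 15 ms base delay (661.5 samples).
    try std.testing.expectApproxEqAbs(
        @as(f32, 661.5),
        chorusDelaySamples(0, 44100.0, 15.0, 5.0, 0.5),
        0.01,
    );
}
```

The 5 ms depth is an arbitrary starting value. For the flanger, the same helper should work with a base near 3 ms, a smaller depth, and a faster lfo_hz.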

Original chapter with audio demos: mu.krj.st/delay. Julius O. Smith’s “Feedforward/Feedback Comb Filters” and “Delay-Line and Signal Interpolation” (CCRMA); the RealSimple Project’s “Time Varying Delay Effects”; xiph.org “Digital Show & Tell” (~20:00) for fractional delay in action.


That is the course. You can now synthesize and shape sound from first principles in Zig (waveforms, mixing, dynamics, and time-based effects), entirely offline, with code verified on Zig 0.16, and with the original C kept beside every port for comparison.