Audiobook Spec Check — Noise Floor & ACX Compliance

Drop your audio file here

WAV · MP3 · M4A · FLAC · AIFF, up to 600 MB

Baseline mode — where should NarScan measure your normal noise floor?

Use the last N seconds as a room tone reference

Best if your session ends with silence or room tone after mastering. Adjust the sample window slider to match how much tail you typically leave.

→ Best for: narrators who leave room tone at the chapter tail

Sample window 3.0s

How many seconds to borrow from the start or end of your file.

Drop your room tone file here, or click to browse

Threshold 3.0×

Min flag duration 500ms

Scan Results

Duration · Baseline noise floor · Issues found · Threshold used

ACX compliance

RMS loudness. Target: -23 to -18 dB

Peak level. Target: -3 dBFS max

Noise floor. Target: -60 dB or lower
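The three targets above amount to simple range checks. A minimal sketch, assuming the levels have already been measured in dB; the function name `check_acx` and its return shape are illustrative and not part of the tool:

```python
def check_acx(rms_db, peak_db, noise_floor_db):
    """Compare measured levels (in dB) against the three targets listed above.

    Hypothetical helper: returns a dict of pass/fail booleans per target.
    """
    return {
        "rms": -23.0 <= rms_db <= -18.0,          # RMS loudness: -23 to -18 dB
        "peak": peak_db <= -3.0,                  # Peak level: -3 dBFS max
        "noise_floor": noise_floor_db <= -60.0,   # Noise floor: -60 dB or lower
    }


if __name__ == "__main__":
    print(check_acx(rms_db=-20.1, peak_db=-3.4, noise_floor_db=-62.0))
```

A file passes only if all three values are true; a single failing entry tells you which measurement to revisit in your DAW.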

Noise floor timeline — flagged sections in red · baseline in amber

Flagged sections

How Audiobook Spec Check works

Audiobook Spec Check listens for sections in your mastered audio where the background noise is louder than it should be. This usually happens when normalization amplifies a gap that has no real room tone behind it, such as a cut that was never filled or a stretch of silence left between edits. The listener hears it as a burst of static or hiss between sentences. Spec Check finds it before your client or platform does.

Drop your mastered file

Upload the final mastered version of your chapter — the file you'd submit to ACX, Findaway, or your client. WAV, MP3, M4A, FLAC, and AIFF are all supported.

Non-WAV files are converted internally before scanning. Your original file is never modified.

Choose your baseline mode

NarScan needs a sample of what your room tone should sound like — a few seconds of quiet background from the same recording session, after mastering.

Pick the mode that matches your workflow. See the baseline guide below for help choosing.

The baseline must be from the same mastering chain as your content. A room tone sample processed differently than your chapter will give inaccurate results.

Set your threshold and duration

Threshold (3.0×): how far above baseline a section must be to get flagged. At 3.0×, a section's RMS needs to be 3 times the RMS of your normal room tone. Lower values are more sensitive.

Min duration (500ms) — shortest anomaly worth reporting. Prevents brief breath sounds or transients from cluttering the results.
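How the two settings interact can be sketched as follows. This is an illustration under stated assumptions, not the tool's actual implementation: it assumes the input is a list of RMS values for consecutive non-speech windows (speech detection is out of scope here), that each window is `window_ms` long, and that severity is reported from the loudest window in the run. The name `flag_sections` and all parameters are hypothetical.

```python
def flag_sections(rms_values, baseline_rms, threshold=3.0,
                  min_duration_ms=500, window_ms=50):
    """Return (start_ms, end_ms, severity) for each run of windows whose RMS
    is at least threshold * baseline_rms and that lasts min_duration_ms or more.
    """
    if baseline_rms <= 0:
        raise ValueError("baseline_rms must be positive")
    values = list(rms_values)
    limit = baseline_rms * threshold
    flags = []
    run_start = None
    # A trailing -1.0 sentinel closes any run that reaches the end of the file.
    for i, value in enumerate(values + [-1.0]):
        if value >= limit:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            duration_ms = (i - run_start) * window_ms
            if duration_ms >= min_duration_ms:   # drop short transients
                loudest = max(values[run_start:i])
                flags.append((run_start * window_ms, i * window_ms,
                              loudest / baseline_rms))
            run_start = None
    return flags


if __name__ == "__main__":
    # 600 ms of raised noise is reported; a 150 ms blip is ignored.
    levels = [0.001] * 10 + [0.005] * 12 + [0.001] * 5 + [0.004] * 3 + [0.001] * 2
    print(flag_sections(levels, baseline_rms=0.001))
```

Lowering `threshold` widens what counts as raised noise, while raising `min_duration_ms` filters out more short events; the two settings are independent.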

Read the results

The timeline shows your file's noise floor at a glance. Grey bars are quiet non-speech sections. Red sections are anomalies. The dashed line marks your baseline.

The table lists each flagged section with its timestamp, duration, dB level, and severity. Take those timestamps into your DAW, listen, and fix what's real.

Severity 10× or higher is audible static — fix it. Under 4× at a 3.0 threshold is borderline — use your ears before acting.

Which baseline mode should I use?

End of file

Use the tail

Borrows the last N seconds as its reference. Adjust the sample window slider to match how much room tone you typically leave at the end.

→ Best for: narrators who leave room tone or silence at the chapter tail

Start of file

Use the head

Borrows the first N seconds instead. Use this if you record a few seconds of silence before you start speaking, or if your tail has music or outro content.

→ Best for: narrators who record room tone at the top of the session

Separate file

Upload room tone

A dedicated file from the same session, after mastering. No guesswork — most accurate baseline possible, independent of file length or structure.

→ Best for: critical submissions, production studios, or when head/tail aren't reliable
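The three modes differ only in which samples feed the baseline measurement. A rough sketch, assuming mono float samples in a plain list and a simple RMS over the chosen slice; the function names, the mode strings, and the `room_tone` parameter are illustrative assumptions:

```python
import math


def rms(samples):
    """Root mean square of a non-empty sequence of float samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def baseline_rms(samples, sample_rate, mode="tail", window_s=3.0, room_tone=None):
    """Pick the reference slice for the chosen baseline mode and return its RMS.

    mode: "tail" (last N seconds), "head" (first N seconds),
          or "file" (a separate room tone recording passed as room_tone).
    """
    if mode == "file":
        if not room_tone:
            raise ValueError("mode 'file' needs room_tone samples")
        return rms(room_tone)
    n = int(sample_rate * window_s)
    if n <= 0 or n > len(samples):
        raise ValueError("sample window does not fit inside the file")
    if mode == "tail":
        return rms(samples[-n:])
    if mode == "head":
        return rms(samples[:n])
    raise ValueError("unknown mode: " + repr(mode))
```

Whichever mode is used, the slice should contain only room tone. If the window overlaps speech, music, or an outro, the baseline comes out too high and real problems go unflagged.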

Glossary

Noise floor

The level of background sound when no one is speaking — your room's ambient hiss, hum, and air handling. Every recording has one. The problem is when it varies unexpectedly within the same file.

Room tone

A recording of your room's silence. Captured at the start or end of a session, it's used to fill edited gaps so the background sounds consistent throughout the chapter.

Normalization

A mastering step that raises the volume of your whole file to hit a target level (e.g. -23 to -18 dB RMS for ACX). It raises everything uniformly, including any gaps that have no room tone behind them, which become audible as raised hiss or static.
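The arithmetic behind this is worth seeing once. A small sketch, assuming normalization applies a single gain to the whole file and that the target is -20 dB RMS (an arbitrary value inside the -23 to -18 dB range); the function name is hypothetical:

```python
def normalize_levels(rms_db, noise_floor_db, target_rms_db=-20.0):
    """Return (gain_db, new_noise_floor_db) after uniform gain normalization.

    The same gain that lifts speech to the target also lifts the noise floor.
    """
    gain_db = target_rms_db - rms_db
    return gain_db, noise_floor_db + gain_db


if __name__ == "__main__":
    # Raw take at -30 dB RMS with a -68 dB noise floor.
    gain, new_floor = normalize_levels(rms_db=-30.0, noise_floor_db=-68.0)
    print(gain, new_floor)  # 10.0 -58.0
```

In this example a noise floor that started comfortably below -60 dB ends up at -58 dB after 10 dB of gain, which is how a quiet raw recording can fail the noise floor target once mastered.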

Baseline

The RMS level NarScan uses as "normal" for your file. All non-speech sections are compared against it. A section at 5× severity has an RMS level 5 times that of your baseline.

RMS

Root Mean Square — a measure of average audio level over a short window. More reliable than peak levels for detecting noise, because noise is continuous rather than momentary.
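As a rough illustration of the definition, the sketch below computes RMS in dB relative to full scale for float samples in the range -1.0 to 1.0. The function name `rms_dbfs` is an assumption for this example, and the window the tool actually uses is not specified here.

```python
import math


def rms_dbfs(samples):
    """RMS level of float samples in [-1.0, 1.0], in dB relative to full scale."""
    mean_square = sum(x * x for x in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")   # pure digital silence has no finite dB level
    # 10 * log10(mean square) equals 20 * log10(RMS amplitude)
    return 10 * math.log10(mean_square)


if __name__ == "__main__":
    steady_hiss = [0.1, -0.1] * 500          # continuous low-level signal
    single_click = [0.0] * 999 + [1.0]       # one full-scale sample in silence
    print(round(rms_dbfs(steady_hiss), 1))   # about -20 dB
    print(round(rms_dbfs(single_click), 1))  # about -30 dB, though its peak is 0 dBFS
```

The click example shows why peak level is a poor noise detector: one stray sample hits full scale on a peak meter, while its RMS stays low because the energy is averaged over the window.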

Severity ratio

How many times louder a flagged section is compared to your baseline. At the default 3.0× threshold, 3× is the lowest severity that gets reported. 10× or higher is audible static that will almost certainly fail a platform QC check.
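Because DAW meters read in dB rather than multiples, it can help to convert a severity ratio into a dB difference. This sketch assumes the ratio compares linear RMS amplitudes, which matches the Baseline entry above; the helper names are illustrative.

```python
import math


def ratio_to_db(ratio):
    """Convert a linear RMS amplitude ratio to a level difference in dB."""
    return 20 * math.log10(ratio)


def db_to_ratio(db):
    """Convert a level difference in dB back to a linear RMS amplitude ratio."""
    return 10 ** (db / 20)


if __name__ == "__main__":
    for severity in (3.0, 4.0, 10.0):
        print(severity, round(ratio_to_db(severity), 1))
```

Under that assumption, 3× is roughly 9.5 dB above baseline and 10× is 20 dB above it, so a section flagged at 10× against a -62 dB baseline would sit around -42 dB.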

Threshold

The multiplier at which NarScan flags a section. At 3.0×, a quiet section must be 3 times louder than your baseline to appear in the report. Lower the threshold if you're missing real problems; raise it if you're getting too many false flags.

Min duration

The shortest anomaly worth reporting, in milliseconds. At 500ms, a noise spike shorter than half a second won't appear in the results. Raise this if breath sounds or transients are cluttering your report.