PerceptoMap

Visualizing how we hear — from spectrograms to perception

PerceptoMap is an open-source audio plugin (VST3) that visualizes psychoacoustic features of audio signals in real time. Built with JUCE, it's designed to help you see how we perceive sound — not just how it looks on a frequency plot.

Unlike typical spectrum or spectrogram analyzers, it supports perceptual visualizations such as Mel spectrograms, Mel-frequency cepstral coefficients (MFCCs), and chromagrams, with a tempogram planned, offering insight into how humans perceive sound.

🎧 If you're the kind of creator who trusts your ears above all — you might not need this.
But if you're curious about how your audio measures up to what humans actually hear… welcome aboard.

Key Features

  • Real-time Mel Spectrogram display with perceptual frequency scaling
  • Real-time Mel-frequency cepstral coefficients (MFCCs) representing timbral texture and spectral envelope
  • Real-time Spectral Centroid tracking to visualize spectral brightness (center of mass of STFT spectrum)
  • Real-time Chromagram showing the energy distribution across the 12 pitch classes (C to B), regardless of octave. [added in v0.5]
  • Time-Frequency Reassignment mode (Linear+) for enhanced STFT resolution, based on the paper [hal-00414583: Time-Frequency reassignment: from principles to algorithms]. This mode sharpens the localization of spectral peaks by reassigning energy to more accurate time-frequency coordinates, making harmonic structures and transient details clearer compared to the standard STFT. [added in v0.6]
  • Time-Frequency Reassigned Mel Spectrogram mode (Mel+) - Mel-scaled display using the same time-frequency reassignment principle as Linear+. It computes the reassigned frequency from the complex STFT and then projects energy onto the Mel axis, yielding sharper harmonic ridges and crisper transients than a standard Mel spectrogram. [added in v0.7]
  • Visual analysis of Tempogram and other psychoacoustic features (planned)
  • Configurable color maps
  • Adjustable brightness gain and enhanced colormap modes to improve visibility of fine details in the spectrogram [added in v0.4]
  • Optional dB scaling and a log or linear frequency axis for the classic linear STFT spectrogram
  • Freeze frame mode and interactive mouse hover to inspect frequency and loudness at any point
  • Configurable FFT size for balancing time resolution and frequency resolution. [added in v0.6]
  • Independent scroll speed control, allowing smooth visualization at different FFT sizes and overlap settings without distorting the spectral data. [added in v0.6]
  • Adjustable y-axis frequency range. [added in v0.8]
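The FFT-size trade-off mentioned in the list can be put into numbers. The sketch below is only arithmetic, not plugin code: the function name `stft_resolution` and the 48 kHz sample rate are assumptions chosen for illustration.

```python
def stft_resolution(sample_rate_hz, fft_size):
    """Return (bin spacing in Hz, frame duration in ms) for one FFT frame."""
    bin_spacing_hz = sample_rate_hz / fft_size
    frame_ms = 1000.0 * fft_size / sample_rate_hz
    return bin_spacing_hz, frame_ms


# Assumed sample rate of 48 kHz; larger frames give finer bins but blur time.
for n in (512, 2048, 8192):
    spacing, duration = stft_resolution(48000.0, n)
    print(f"FFT size {n:5d}: {spacing:7.2f} Hz per bin, {duration:7.2f} ms per frame")
```

A larger FFT size separates closely spaced partials better, while a smaller one follows transients more tightly.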

Screenshots

Resizable GUI (small): Window can shrink to fit a minimal layout
Resizable GUI (large): Window expands for detailed viewing

Classic: Default high-contrast mapping
Grayscale: Neutral luminance-based display
Magma: Perceptually uniform, dark background

Hover Readout: Displays precise frequency, dB level, and corresponding MIDI note (C4 = Middle C) [added in v0.3] under the mouse pointer.
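As a rough illustration of the note readout, a frequency can be mapped to the nearest equal-tempered note with the usual MIDI formula. This is a sketch, not the plugin's code: the function name `frequency_to_note` and the A4 = 440 Hz tuning reference are assumptions, and only the C4 = Middle C (MIDI 60) convention comes from the description above.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def frequency_to_note(freq_hz, a4_hz=440.0):
    """Name of the nearest equal-tempered note, with C4 = Middle C (MIDI 60)."""
    midi = round(69 + 12 * math.log2(freq_hz / a4_hz))  # A4 is MIDI note 69
    octave = midi // 12 - 1  # MIDI 60 falls in octave 4
    return f"{NOTE_NAMES[midi % 12]}{octave}"


print(frequency_to_note(440.0))   # A4
print(frequency_to_note(261.63))  # C4
```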
Adjustable dB Floor Slider: Controls the minimum dB threshold for color brightness, helping visualize low-level signals. [added in v0.3]
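One plausible way to picture a dB floor is as the bottom of a linear dB-to-brightness ramp. The sketch below assumes exactly that; the function name, the 0 dB ceiling, and the -80 dB default are illustrative guesses, and the plugin's actual mapping may differ.

```python
import math


def brightness_from_magnitude(magnitude, floor_db=-80.0):
    """Map a linear magnitude (1.0 = full scale) to a brightness in [0, 1].

    floor_db must be negative; levels at or below it map to 0.
    """
    level_db = 20.0 * math.log10(max(magnitude, 1e-12))  # guard against log10(0)
    clamped = min(max(level_db, floor_db), 0.0)
    return (clamped - floor_db) / -floor_db


# Lowering the floor makes quiet content brighter.
print(brightness_from_magnitude(0.001, floor_db=-60.0))   # -60 dB sits on the floor
print(brightness_from_magnitude(0.001, floor_db=-120.0))  # same level, now mid-scale
```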

Adjustable Brightness Gain: The Norm Factor slider allows manual control over spectrogram brightness, helping to adapt the display to signals with different loudness levels. [added in v0.4]
Fine Detail with Enhanced Colormap Modes: Magma+ and Grayscale+ use a non-linear color legend to enhance contrast, making subtle details more visible. [added in v0.4]
Linear+/Mel+ (Time–Frequency Reassignment Mode): Sharpens the time–frequency localization of spectral peaks by reassigning energy to more accurate coordinates. Harmonic structures and transients become more clearly defined, compared to standard STFT. [added in v0.6/v0.7]
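To give a feel for frequency reassignment, here is a minimal single-frame, single-bin sketch using the standard identity: the frequency correction is the negative imaginary part of the ratio between the STFT taken with the window's derivative and the STFT taken with the window itself. It is not the plugin's implementation. The function name, the Gaussian width `sigma = n_fft / 8`, and the test tone are assumptions, and time reassignment is left out.

```python
import cmath
import math


def reassigned_frequency(frame, bin_index, sample_rate_hz):
    """Reassigned frequency (Hz) for one DFT bin of one Gaussian-windowed frame."""
    n_fft = len(frame)
    center = (n_fft - 1) / 2.0
    sigma = n_fft / 8.0  # assumed window width in samples
    bin_omega = 2.0 * math.pi * bin_index / n_fft  # bin center, radians per sample
    x_h = 0j   # DFT value using the window h
    x_dh = 0j  # DFT value using the window derivative dh/dn
    for n, sample in enumerate(frame):
        h = math.exp(-0.5 * ((n - center) / sigma) ** 2)
        dh = -(n - center) / sigma**2 * h
        phasor = cmath.exp(-1j * bin_omega * n)
        x_h += sample * h * phasor
        x_dh += sample * dh * phasor
    correction = -(x_dh / x_h).imag  # radians per sample
    return (bin_omega + correction) * sample_rate_hz / (2.0 * math.pi)


sample_rate = 8000.0
tone_hz = 1010.0  # deliberately off the bin grid (bin spacing is 31.25 Hz)
frame = [math.cos(2.0 * math.pi * tone_hz * n / sample_rate) for n in range(256)]
# Bin 32 is centered at 1000 Hz; the reassigned value lands close to the tone.
print(reassigned_frequency(frame, 32, sample_rate))
```

A plain STFT would draw this tone's energy at the 1000 Hz bin center; moving it to the reassigned value is what produces the sharper ridges.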

Linear STFT spectrogram with linear frequency axis: Displays physical frequency content directly.
Linear STFT spectrogram with log frequency axis: Approximates human pitch perception. Emphasizes low-frequency resolution and compresses high-frequency bands.
Mel-scaled STFT spectrogram: Reflects the nonlinear frequency resolution of human hearing. Provides a more perceptually accurate representation than simple log-scaling.
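The Roadmap gives the Mel mapping as 2595 * log10(1 + f / 700) with 128 bands. The sketch below implements that formula and its inverse, then spaces band centers evenly on the Mel axis. The function names and the 0 Hz to 24 kHz range are assumptions for illustration. In librosa, this closed form is the one computed by `hz_to_mel(..., htk=True)`.

```python
import math


def hz_to_mel(freq_hz):
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_centers(n_bands, fmin_hz, fmax_hz):
    """Center frequencies (Hz) of n_bands bands spaced evenly on the Mel axis."""
    mel_min, mel_max = hz_to_mel(fmin_hz), hz_to_mel(fmax_hz)
    step = (mel_max - mel_min) / (n_bands + 1)
    return [mel_to_hz(mel_min + step * (i + 1)) for i in range(n_bands)]


centers = mel_band_centers(128, 0.0, 24000.0)  # assumed range for a 48 kHz signal
print(f"lowest bands:  {centers[0]:.1f} Hz, {centers[1]:.1f} Hz")
print(f"highest bands: {centers[-2]:.1f} Hz, {centers[-1]:.1f} Hz")
```

The bands are narrow at low frequencies and wide at high frequencies, which is what distinguishes the Mel axis from a linear one.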
Mel-frequency cepstral coefficients (MFCCs): Captures the spectral envelope using Discrete Cosine Transform (DCT) over Mel energies. Represents timbral texture and vocal tract shape. [added in v0.2]
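The Roadmap lists the MFCC recipe as a DCT-II on the log-Mel spectrum, 20 coefficients, clipping to [−100, 100], and normalization to [0, 1] for display. A minimal sketch of those steps follows. The function name is hypothetical, and the unnormalized DCT-II scaling is an assumption, since the README does not say which scaling is used.

```python
import math


def mfcc_for_display(log_mel, n_coeffs=20, clip=100.0):
    """DCT-II of one log-Mel frame, clipped to [-clip, clip], scaled to [0, 1]."""
    n_bands = len(log_mel)
    display = []
    for k in range(n_coeffs):
        # Unnormalized DCT-II (assumed scaling).
        c = sum(
            value * math.cos(math.pi * k * (m + 0.5) / n_bands)
            for m, value in enumerate(log_mel)
        )
        clipped = min(max(c, -clip), clip)
        display.append((clipped + clip) / (2.0 * clip))
    return display


# A flat log-Mel frame has energy only in coefficient 0; the rest sit at mid-scale.
flat = mfcc_for_display([1.0] * 128)
print(flat[0], flat[1])
```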
Spectral Centroid: Indicates the "center of mass" of the spectrum. Tracks brightness and perceptual sharpness by showing where the dominant frequencies are concentrated over time. [added in v0.3]
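The "center of mass" idea is a magnitude-weighted mean of the bin frequencies, and the Roadmap mentions exponential-moving-average smoothing on top. A small sketch of both follows. The function names and the smoothing factor `alpha` are assumptions, not values taken from the plugin.

```python
def spectral_centroid(magnitudes, sample_rate_hz):
    """Magnitude-weighted mean frequency (Hz) over bins 0..N/2 of one frame."""
    n_fft = 2 * (len(magnitudes) - 1)
    total = sum(magnitudes)
    if total <= 0.0:
        return 0.0  # silent frame: no meaningful centroid
    weighted = sum(k * sample_rate_hz / n_fft * m for k, m in enumerate(magnitudes))
    return weighted / total


def smooth(values, alpha=0.2):
    """Exponential moving average; alpha is a hypothetical smoothing factor."""
    smoothed, state = [], None
    for v in values:
        state = v if state is None else alpha * v + (1.0 - alpha) * state
        smoothed.append(state)
    return smoothed


# Five bins at 0, 1, 2, 3, 4 kHz (8 kHz sample rate, FFT size 8).
frames = [[0.0, 1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0, 0.0]]
raw = [spectral_centroid(f, 8000.0) for f in frames]
print(raw, smooth(raw, alpha=0.5))
```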
Chromagram: Projects spectral energy onto the 12 pitch classes (C, C#, D, …), regardless of octave. Useful for analyzing harmony, key, and chord structures. [added in v0.5]
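The octave-folding idea can be sketched by assigning each STFT bin to its nearest pitch class and summing the energy. Note that this uses hard nearest-note assignment, which is simpler than the overlapping triangular filters the Roadmap describes. The function name and the A4 = 440 Hz reference are assumptions.

```python
import math

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def chroma_from_bins(magnitudes, sample_rate_hz, a4_hz=440.0):
    """Sum squared magnitudes of bins 0..N/2 into 12 pitch classes."""
    n_fft = 2 * (len(magnitudes) - 1)
    chroma = [0.0] * 12
    for k, mag in enumerate(magnitudes):
        if k == 0:
            continue  # DC has no pitch
        freq_hz = k * sample_rate_hz / n_fft
        midi = round(69 + 12 * math.log2(freq_hz / a4_hz))
        chroma[midi % 12] += mag * mag
    return chroma


# 1 Hz bins (sample rate 8192 Hz, FFT size 8192) with partials at 440 and 880 Hz.
spectrum = [0.0] * 4097
spectrum[440] = 1.0
spectrum[880] = 1.0
chroma = chroma_from_bins(spectrum, 8192.0)
print(PITCH_CLASSES[chroma.index(max(chroma))])  # both partials fold onto A
```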

Roadmap

| Feature | Status | Description | Implementation Details |
|---|---|---|---|
| Linear STFT Spectrogram | ✅ Done (v0.1) | Classic time–frequency analysis | Hann window, with log/linear frequency axis display and adjustable FFT size & scroll speed [added in v0.6] |
| Mel-Spectrogram | ✅ Done (v0.1) | Nonlinear frequency scaling approximating human pitch perception | 128 bands, Slaney-style: 2595 * log10(1 + f / 700) |
| MFCC | ✅ Done (v0.2) | Mel-frequency cepstral coefficients, a compact representation of timbre based on the perceptual log-mel spectrum | DCT-II on log-mel spectrum, 20 coefficients, no liftering, values clipped to [−100, 100] and normalized to [0, 1] for display |
| Spectral Centroid (STFT-based) | ✅ Done (v0.3) | Tracks the "center of mass" of the spectrum; correlates with brightness and sharpness | Computed from linear STFT magnitude spectrum with smoothing (Exponential Moving Average), overlaid as a curve on the STFT spectrogram |
| Adjustable brightness gain and enhanced colormap modes | ✅ Done (v0.4) | Improves visibility of fine details in the spectrogram | Brightness remapped using non-linear scaling; norm factor slider controls global dB scaling, color maps applied after brightness normalization |
| Chroma | ✅ Done (v0.5) | Pitch class mapping, projection of spectral energy onto 12 pitch classes (C, C#, D…) | Triangular chroma filter bank built from STFT bins, 12 overlapping filters per octave; energy mapped to pitch classes regardless of octave; supports smooth pitch transitions and partial overlaps |
| Enhanced STFT with Time–Frequency Reassignment (Linear+) | ✅ Done (v0.6) | Sharper time–frequency localization by reassigning each STFT bin's energy to more accurate time/frequency coordinates | Based on [hal-00414583: Time-Frequency reassignment: from principles to algorithms], implemented with Gaussian-window STFT and its time & frequency derivatives. Instantaneous frequency and group delay estimates are used to re-map spectral energy, improving localization of transients and harmonics compared to standard STFT. Supports same FFT size and log/linear axis options as Linear mode |
| Enhanced Mel Spectrogram with Time–Frequency Reassignment (Mel+) | ✅ Done (v0.7) | Mel-scaled spectrogram with sharper harmonic ridges and crisper transients by reassigning each STFT bin's energy to its true instantaneous frequency, then projecting onto the Mel axis | Based on the same reassignment principle as Linear+. Mapped to Mel. |
| Y-axis Range Control | ✅ Done (v0.8) | Precise control over visible frequency band | Dual-handle range slider + editable min/max fields |
| Tempogram / Rhythm Map | ⏳ Planned | Visualizes perceived tempo and rhythmic periodicities over time | - |
| Spectral Flatness / Contrast | ⏳ Planned | Measures of timbral characteristics | - |

Why develop this plugin?

In psychoacoustics and machine learning, perceptually inspired representations such as Mel spectrograms and MFCCs are widely used, for example in music genre classification, emotion recognition, or detecting AI-generated audio.

As a frequent user of Python tools like librosa, I was surprised, while learning about DAWs, to find that most of them seem to lack real-time, perceptually grounded visualization tools.

So I decided to build one — a lightweight, JUCE-based plugin that brings these powerful analysis tools directly into the DAW environment, where musicians, sound designers, and researchers can explore them interactively.

How to install?

You can download the latest version of PerceptoMap from the Releases page.

Available Format

  • VST3 (.vst3)

Plugin installation paths

Windows

  1. Download the plugin .zip file from the Releases page
  2. Unzip and copy the .vst3 plugin folder to the default system VST3 directory: C:\Program Files\Common Files\VST3\

Note: If you use a custom VST3 plugin path, copy it there instead.

  3. Launch your DAW and run a plugin rescan if necessary
  4. You should then be able to find the plugin under hqrrr - PerceptoMap

macOS (Intel / Apple Silicon)

  1. Download the plugin .zip file from the Releases:
    • macOS_x64 for Intel
    • macOS_arm for Apple Silicon
  2. Unzip and copy the .vst3 plugin folder to the default system VST3 directory:
    /Library/Audio/Plug-ins/VST3

Note: If you use a custom VST3 plugin path, copy it there instead.

  3. Launch your DAW and rescan plugins if needed
  4. Then, you should be able to find the plugin under hqrrr - PerceptoMap

Linux

  1. Download the plugin .zip file from the Releases
  2. Unzip and copy the .vst3 plugin folder to your VST3 directory (commonly ~/.vst3)
  3. Launch your DAW and rescan plugins if needed
  4. Then, you should be able to find the plugin under hqrrr - PerceptoMap

Build Instructions for Developers

Prerequisites

  • JUCE 8.x (automatically fetched via CMake)
  • C++17 compatible compiler, e.g. Visual Studio 2022 (Windows)
  • CMake 3.22+

Build with CMake

Based on JUCE CMake Plugin Template.

On Windows (Visual Studio 2022)

  1. Open the project root in Visual Studio (choose "Open a local folder").
  2. Visual Studio will automatically detect the CMakeLists.txt.
  3. Select a CMake target configuration (e.g. x64-Release).
  4. In the CMake Targets View: PerceptoMap/PerceptoMap Project/Targets, right-click PerceptoMap_VST3 and click Build.
  5. The plugin binary will be placed in the build output directory: out/build/x64-Release/VST3/PerceptoMap.vst3

On Windows (Terminal)

cd path\to\PerceptoMap
cmake -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build --config Release

If using Ninja

cmake -B build -G Ninja -DCMAKE_BUILD_TYPE=Release
cmake --build build

Folder Structure

PerceptoMap/
├── _pics/               -> Screenshots and images for documentation
├── Source/              -> Main plugin source code
├── CMakeLists.txt       -> Main build configuration (CMake-based)
├── CMakeSettings.json   -> (Optional) Visual Studio CMake config
├── README.md            -> Project documentation
└── LICENSE              -> AGPLv3 license file (required for JUCE open-source usage)

License & Cost

PerceptoMap is proudly open-source and completely free to use, modify, and redistribute under the terms of the GNU AGPLv3 License.

There are no hidden fees, paid versions, or limitations — the plugin is intended to be a community-driven tool for perceptual audio analysis and creative exploration.

I do not ask for donations — what matters more is your feedback, feature ideas, or even better: your involvement in development.

Ways you can contribute:

  • 🐞 Report bugs or issues you encounter
  • 💡 Suggest improvements or new perceptual features
  • 🔧 Submit pull requests to improve code or documentation
  • 📢 Share the plugin with others who may find it useful

Feel free to leave a comment — bug reports, feature ideas, or just thoughts are always welcome.

How to cite

If you use PerceptoMap in academic work, please cite a tagged release.

@software{PerceptoMap_Huang,
  author  = {Qirui Huang},
  title   = {PerceptoMap},
  subtitle= {VST3 spectrogram and psychoacoustic visualizer},
  version = {v0.8},
  date    = {2025},
  url     = {https://github.com/hqrrr/PerceptoMap},
  note    = {GitHub repository},
  doi     = {10.5281/zenodo.16923138}
}

Replace the version and date fields with those of the exact release you used.
