Christos Choutouridis a23a1525a2 Update Readme.md

2026-02-22 02:19:56 +02:00

6.2 KiB

AAC Encoder/Decoder Assignment (Multimedia – AUTh)

Overview

This repository contains a staged implementation of a simplified AAC-like audio encoder/decoder pipeline, developed in the context of the Multimedia course at Aristotle University of Thessaloniki (AUTh).

The project follows a progressive, level-based structure:

Level 1: Core analysis/synthesis pipeline
Level 2: Full transform-domain encoding with quantization
Level 3: Psychoacoustic modeling and perceptual coding enhancements

The goal of this work is to:

Faithfully implement the processing chain specified in the assignment
Validate correctness using structured and reproducible tests
Maintain a clean and reproducible project architecture
Ensure separation between development logic and submission packaging

System Architecture

The implemented pipeline follows a simplified AAC-style structure:

Input WAV
   ↓
SSC (Sequence Segmentation Control)
   ↓
Filterbank (MDCT)
   ↓
[TNS / Psychoacoustic Model]   (Level 3)
   ↓
Quantization & Coding          (Level 2+)
   ↓
Bitstream Structuring
   ↓
-----------------------------------------
   ↓
Inverse Quantization
   ↓
Inverse Filterbank (IMDCT)
   ↓
OLA Reconstruction
   ↓
Output WAV

Each level progressively enables more blocks of this pipeline.

Repository Structure

The repository is organized into source code, material, and report files.

root/
│
├── source/
│   ├── level_1/
│   ├── level_2/
│   ├── level_3/
│   ├── core/
│   └── material/
│
├── report/
├── README.md
└── LICENSE

`source/`

Contains all implementation code.

`level_x/`

Each level directory contains:

level_x.py (main module entry point)
core/ (hard links to the shared implementation)
material/ (hard links to the required helper material)
tests/ (level-specific tests)

Each level is self-contained to satisfy submission requirements.

`core/`

This directory contains the centralized implementation of:

SSC
MDCT / IMDCT filterbank
Quantizer / dequantizer
Psychoacoustic model
TNS
Bitstream handling
Encoder/decoder pipelines

All development happens here.

Each level_x/core/ directory references these files using hard links, ensuring:

No code duplication
No synchronization errors
Clean development workflow

`material/`

Contains helper files provided by the assignment:

Sample audio
Reference data
Required constants or auxiliary files

Development Workflow Design

One of the project requirements was to deliver level_x directories containing all required files, without referencing external directories.

Naively copying files across levels would introduce:

Code redundancy
High maintenance cost
Risk of inconsistencies
Debugging complexity

To avoid this:

All implementation lives in source/core/
Each level_x directory contains hard links to the files in core/ and material/

This ensures:

Single source of truth
Clean modular structure
Instructor-compliant submission format
Safe iterative development
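
The hard-link layout described above can be reproduced with a few lines of standard-library Python. This is a minimal sketch, not the repository's actual tooling: the helper name `link_tree`, the `demo` function, and the file `example.py` are hypothetical and exist only for illustration.

```python
import os
import tempfile


def link_tree(src_dir, dst_dir):
    """Hard-link every file under src_dir into dst_dir, mirroring subfolders.

    Hypothetical helper. Returns the list of newly created link paths.
    """
    created = []
    for root, _dirs, files in os.walk(src_dir):
        rel = os.path.relpath(root, src_dir)
        target_root = os.path.normpath(os.path.join(dst_dir, rel))
        os.makedirs(target_root, exist_ok=True)
        for name in sorted(files):
            src = os.path.join(root, name)
            dst = os.path.join(target_root, name)
            if os.path.exists(dst):
                if os.path.samefile(src, dst):
                    continue  # already a hard link to the same file
                os.remove(dst)  # stale independent copy: replace it
            os.link(src, dst)
            created.append(dst)
    return created


def demo():
    """Build a throwaway core/ and level_1/core/ pair and check they share data."""
    with tempfile.TemporaryDirectory() as tmp:
        core = os.path.join(tmp, "core")
        level_core = os.path.join(tmp, "level_1", "core")
        os.makedirs(core)
        original = os.path.join(core, "example.py")
        with open(original, "w") as fh:
            fh.write("VALUE = 1\n")

        link_tree(core, level_core)
        linked = os.path.join(level_core, "example.py")

        # An in-place edit of the original is visible through the link.
        with open(original, "a") as fh:
            fh.write("VALUE = 2\n")
        with open(linked) as fh:
            content = fh.read()
        return os.path.samefile(original, linked), content
```

Hard links exist only at the filesystem level and Git does not record them, so a fresh clone contains independent copies. A script along these lines would need to be re-run after cloning to restore the shared files.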

Level Descriptions

Level 1 – Core Transform Pipeline

Goal

Implement the baseline transform-domain analysis/synthesis chain.

Implemented Components

Sequence Segmentation Control (SSC)
MDCT analysis filterbank
IMDCT synthesis filterbank
Overlap-Add (OLA) reconstruction
End-to-end encoder/decoder:
- aac_coder_1()
- i_aac_coder_1()
Demo:
- demo_aac_1()

Testing Coverage

SSC unit tests
MDCT / IMDCT correctness tests
Perfect reconstruction validation
OLA consistency tests
Encoder/decoder integration tests

This level ensures transform-domain correctness and signal integrity.
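
The perfect-reconstruction property validated at this level can be illustrated with a small, dependency-free sketch. It assumes a sine window, 50% overlap, and a tiny block size for readability. The function names are hypothetical, and the repository's filterbank may use different windows, frame lengths, and scaling.

```python
import math


def sine_window(n_len):
    """Sine window; satisfies w[n]^2 + w[n + n_len/2]^2 = 1 (Princen-Bradley)."""
    return [math.sin(math.pi * (n + 0.5) / n_len) for n in range(n_len)]


def mdct(frame, window):
    """Windowed MDCT: 2*m input samples -> m coefficients."""
    m = len(frame) // 2
    n0 = 0.5 + m / 2
    return [
        sum(window[n] * frame[n] * math.cos(math.pi / m * (n + n0) * (k + 0.5))
            for n in range(2 * m))
        for k in range(m)
    ]


def imdct(coeffs, window):
    """Windowed IMDCT: m coefficients -> 2*m time-aliased samples."""
    m = len(coeffs)
    n0 = 0.5 + m / 2
    return [
        window[n] * (2.0 / m) * sum(
            coeffs[k] * math.cos(math.pi / m * (n + n0) * (k + 0.5))
            for k in range(m))
        for n in range(2 * m)
    ]


def analysis_synthesis(signal, m):
    """MDCT -> IMDCT -> overlap-add with hop size m; returns the reconstruction."""
    window = sine_window(2 * m)
    tail = m + (-len(signal)) % m          # pad so the length is a multiple of m
    padded = [0.0] * m + list(signal) + [0.0] * tail
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * m + 1, m):
        frame = padded[start:start + 2 * m]
        rec = imdct(mdct(frame, window), window)
        for i, value in enumerate(rec):
            out[start + i] += value        # aliasing cancels between neighbours
    return out[m:m + len(signal)]
```

Each frame on its own is not invertible. The aliasing terms cancel only once overlapping frames are added, which is why the leading and trailing padding is needed for the first and last samples.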

Level 2 – Quantization and Coding

Goal

Extend Level 1 by implementing transform-domain quantization and coding.

Implemented Components

Scalar quantization
Dequantization
Basic bitstream formatting
Integration into encoder/decoder pipeline:
- aac_coder_2()
- i_aac_coder_2()
Demo:
- demo_aac_2()

Validation

SNR-based quality evaluation
Consistency tests between quantizer and inverse quantizer
End-to-end reconstruction tests

This level introduces compression and controlled signal degradation.
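
The quantizer/inverse-quantizer consistency check listed above can be sketched with a uniform scalar quantizer. This is a deliberate simplification for illustration: AAC proper uses a non-uniform power-law rule with scalefactors, and the repository's quantizer may differ. The names `quantize` and `dequantize` are hypothetical.

```python
import math


def quantize(coeffs, step):
    """Map each coefficient to the index of the nearest multiple of `step`."""
    if step <= 0:
        raise ValueError("step must be positive")
    return [math.floor(c / step + 0.5) for c in coeffs]


def dequantize(indices, step):
    """Reconstruct coefficient values from integer indices."""
    return [q * step for q in indices]


def max_roundtrip_error(coeffs, step):
    """Largest absolute error after quantize -> dequantize."""
    restored = dequantize(quantize(coeffs, step), step)
    return max(abs(c - r) for c, r in zip(coeffs, restored))
```

For this uniform rule the round-trip error of every coefficient is bounded by half the step size, which gives a simple invariant for a consistency test.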

Level 3 – Psychoacoustic Model & Perceptual Coding

Goal

Incorporate perceptual modeling to improve compression efficiency.

Implemented Components

Psychoacoustic model
Masking threshold estimation
TNS (Temporal Noise Shaping)
Adaptive quantization
Full encoding/decoding pipeline:
- aac_coder_3()
- i_aac_coder_3()
Demo:
- demo_aac_3()

Validation

Perceptual improvements compared to Level 2
Stability tests
End-to-end evaluation

This level approximates a simplified perceptual AAC-like encoder.
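
One way to connect a masking threshold to adaptive quantization is sketched below. It assumes a uniform quantizer whose noise power per coefficient is roughly step²/12 (the usual high-resolution approximation) and picks a per-band step so that the expected noise energy matches the band's threshold. The function names and the band layout are hypothetical. The repository's psychoacoustic model and quantizer are not reproduced here.

```python
import math


def band_energies(coeffs, band_edges):
    """Sum of squared coefficients in each band [edge_i, edge_{i+1})."""
    return [
        sum(c * c for c in coeffs[lo:hi])
        for lo, hi in zip(band_edges[:-1], band_edges[1:])
    ]


def step_sizes_from_thresholds(thresholds, band_edges):
    """Per-band step so that width * step^2 / 12 equals the band threshold.

    `thresholds` holds the allowed noise energy per band (assumed input).
    """
    steps = []
    for threshold, lo, hi in zip(thresholds, band_edges[:-1], band_edges[1:]):
        width = hi - lo
        steps.append(math.sqrt(12.0 * threshold / width))
    return steps
```

Bands with a high threshold (strong masking) get a coarser step and therefore fewer bits, while bands with a low threshold are quantized more finely.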

How to Run

All commands assume you are inside:

source/

Run Level Demo

Navigate to the desired level:

cd level_x

Run:

python -m level_x <input.wav> <output.wav>

Example:

python -m level_1 material/LicorDeCalandraca.wav material/LicorDeCalandraca_out.wav

The demo prints:

Overall SNR (dB)
Processing information
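
For reference, an overall SNR figure of this kind is commonly computed as the ratio of signal energy to reconstruction-error energy, in decibels. The sketch below assumes two equal-length, time-aligned, single-channel sample sequences. The function name `snr_db` is hypothetical, and the demo's exact definition (for example, its channel handling) may differ.

```python
import math


def snr_db(reference, decoded):
    """Signal-to-noise ratio in dB between a reference and its reconstruction."""
    if len(reference) != len(decoded):
        raise ValueError("sequences must have the same length")
    signal_energy = sum(x * x for x in reference)
    noise_energy = sum((x - y) ** 2 for x, y in zip(reference, decoded))
    if noise_energy == 0.0:
        return math.inf       # bit-exact reconstruction
    if signal_energy == 0.0:
        return -math.inf      # silent reference, non-zero error
    return 10.0 * math.log10(signal_energy / noise_energy)
```

A reconstruction whose error amplitude is one tenth of the signal amplitude yields 20 dB under this definition.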

Running Tests

Tests are written using pytest.

A pytest.ini file is included in source/ to ensure proper module resolution.

From inside source/:

pytest -v

Run specific level tests:

pytest -v level_1/tests
pytest -v level_2/tests
pytest -v level_3/tests

Run a specific test file:

pytest -v level_1/tests/test_SSC.py

Reproducibility

Python version: 3.x
Tests validated with pytest
No external dependencies beyond assignment requirements
Deterministic pipeline execution

Disclaimer

This project was developed solely for educational purposes as part of the Multimedia course at AUTh. It is provided "as is", without any express or implied warranties. The author assumes no responsibility for any misuse, data loss, security incidents, or damages resulting from the use of this software. This implementation should not be used in production environments.

All work, modifications, and results are the sole responsibility of the author.
