# AAC Encoder/Decoder Assignment (Multimedia – AUTh)

## Overview

This repository contains a staged implementation of a simplified AAC-like audio encoder/decoder pipeline, developed in the context of the **Multimedia** course at Aristotle University of Thessaloniki (AUTh).

The project follows a progressive, level-based structure:

- **Level 1:** Core analysis/synthesis pipeline
- **Level 2:** Full transform-domain encoding with quantization
- **Level 3:** Psychoacoustic modeling and perceptual coding enhancements

The goals of this work are to:

- Faithfully implement the processing chain specified in the assignment
- Validate correctness using structured and reproducible tests
- Maintain a clean and reproducible project architecture
- Ensure separation between development logic and submission packaging

---

## System Architecture

The implemented pipeline follows a simplified AAC-style structure:

```
Input WAV
   ↓
SSC (Sequence Segmentation Control)
   ↓
Filterbank (MDCT)
   ↓
[TNS / Psychoacoustic Model]   (Level 3)
   ↓
Quantization & Coding          (Level 2+)
   ↓
Bitstream Structuring
   ↓
-----------------------------------------
   ↓
Inverse Quantization
   ↓
Inverse Filterbank (IMDCT)
   ↓
OLA Reconstruction
   ↓
Output WAV
```

Each level progressively enables more blocks of this pipeline.

---

## Repository Structure

The repository is organized into source code, material, and report files.

```
root/
│
├── source/
│   ├── level_1/
│   ├── level_2/
│   ├── level_3/
│   ├── core/
│   └── material/
│
├── report/
├── README.md
└── LICENSE
```

### `source/`

Contains all implementation code.

#### `level_x/`

Each level directory contains:

- `level_x.py` (main module entry point)
- `core/` (hard links to the shared implementation files)
- `material/` (hard links to the required helper files)
- `tests/` (level-specific tests)

Each level is **self-contained** to satisfy submission requirements.

#### `core/`

This directory contains the centralized implementation of:

- SSC
- MDCT / IMDCT filterbank
- Quantizer / dequantizer
- Psychoacoustic model
- TNS
- Bitstream handling
- Encoder/decoder pipelines

All development happens here.

Each `level_x/core/` directory references these files using **hard links**, ensuring:

- No code duplication
- No synchronization errors
- Clean development workflow
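
The hard-link layout described above could be populated with a small helper along the following lines. This is a minimal sketch, not the script used in this repository: the function name `link_tree` is hypothetical, and it assumes both directories live on the same filesystem, which hard links require.

```python
import os
from pathlib import Path


def link_tree(src_dir, dst_dir):
    """Hard-link every regular file under src_dir into dst_dir; return the count."""
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    count = 0
    for src in sorted(src_dir.rglob("*")):
        if not src.is_file():
            continue  # directories cannot be hard-linked, only files
        dst = dst_dir / src.relative_to(src_dir)
        dst.parent.mkdir(parents=True, exist_ok=True)
        if dst.exists():
            dst.unlink()  # drop a stale copy or an outdated link
        os.link(src, dst)  # both names now refer to the same file on disk
        count += 1
    return count
```

For example, `link_tree("core", "level_1/core")` run from `source/` would mirror the shared files into a level directory. Note that Git does not preserve hard links, so after a fresh clone the files are independent copies and the links have to be recreated.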

#### `material/`

Contains helper files provided by the assignment:

- Sample audio
- Reference data
- Required constants or auxiliary files

---

## Development Workflow Design

One of the project requirements was to deliver `level_x` directories containing all required files, without referencing external directories.

Naively copying files across levels would introduce:

- Code redundancy
- High maintenance cost
- Risk of inconsistencies
- Debugging complexity

To avoid this:

- All implementation lives in `source/core/`
- Each `level_x` directory contains hard links to the files in `core/` and `material/`

This ensures:

- Single source of truth
- Clean modular structure
- Instructor-compliant submission format
- Safe iterative development

---

# Level Descriptions

---

## Level 1 – Core Transform Pipeline

### Goal

Implement the baseline transform-domain analysis/synthesis chain.

### Implemented Components

- Sequence Segmentation Control (SSC)
- MDCT analysis filterbank
- IMDCT synthesis filterbank
- Overlap-Add (OLA) reconstruction
- End-to-end encoder/decoder:
  - `aac_coder_1()`
  - `i_aac_coder_1()`
- Demo:
  - `demo_aac_1()`

### Testing Coverage

- SSC unit tests
- MDCT / IMDCT correctness tests
- Perfect reconstruction validation
- OLA consistency tests
- Encoder/decoder integration tests

This level ensures transform-domain correctness and signal integrity.
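
The perfect-reconstruction property checked at this level can be illustrated with a compact MDCT / IMDCT / overlap-add round trip. The sketch below is an illustration under stated assumptions, not the code in `core/`: it assumes NumPy is available, uses a single sine window with an arbitrary frame length of 256, ignores SSC frame-type switching, and all function names are hypothetical.

```python
import numpy as np


def sine_window(frame_len):
    """Sine window; satisfies w[n]**2 + w[n + frame_len // 2]**2 == 1."""
    n = np.arange(frame_len)
    return np.sin(np.pi / frame_len * (n + 0.5))


def mdct_matrix(frame_len):
    """Return the (frame_len // 2, frame_len) MDCT cosine basis."""
    half = frame_len // 2
    n = np.arange(frame_len)
    k = np.arange(half)
    n0 = (half + 1) / 2  # phase offset that lets the time-domain aliasing cancel
    return np.cos(2 * np.pi / frame_len * np.outer(k + 0.5, n + n0))


def analysis_synthesis(x, frame_len=256):
    """MDCT -> IMDCT -> overlap-add using frames with 50% overlap."""
    x = np.asarray(x, dtype=float)
    half = frame_len // 2
    window = sine_window(frame_len)
    basis = mdct_matrix(frame_len)
    y = np.zeros_like(x)
    for start in range(0, len(x) - frame_len + 1, half):
        frame = x[start:start + frame_len]
        coeffs = basis @ (window * frame)                    # analysis (MDCT)
        synth = (2.0 / half) * window * (basis.T @ coeffs)   # synthesis (IMDCT)
        y[start:start + frame_len] += synth                  # overlap-add
    return y
```

Only samples covered by two overlapping frames are reconstructed exactly, so in this sketch the first and last half-frame of the output still contain aliasing; padding the signal at both ends is one common way to handle that.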

---

## Level 2 – Quantization and Coding

### Goal

Extend Level 1 by implementing transform-domain quantization and coding.

### Implemented Components

- Scalar quantization
- Dequantization
- Basic bitstream formatting
- Integration into encoder/decoder pipeline:
  - `aac_coder_2()`
  - `i_aac_coder_2()`
- Demo:
  - `demo_aac_2()`

### Validation

- SNR-based quality evaluation
- Consistency tests between quantizer and inverse quantizer
- End-to-end reconstruction tests

This level introduces compression and controlled signal degradation.
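
The quantizer / inverse-quantizer consistency check can be pictured with a generic uniform scalar quantizer. This is a hedged sketch, not the quantizer implemented in `core/`: it assumes NumPy and a single fixed step size, whereas the AAC standard uses a non-uniform power-law quantizer with per-band scale factors. The names `quantize` and `dequantize` are hypothetical.

```python
import numpy as np


def quantize(coeffs, step):
    """Map each coefficient to the index of the nearest multiple of step."""
    return np.round(np.asarray(coeffs, dtype=float) / step).astype(np.int64)


def dequantize(indices, step):
    """Reconstruct coefficient values from quantization indices."""
    return np.asarray(indices, dtype=float) * step
```

With this scheme the reconstruction error of every coefficient is bounded by `step / 2`, and re-quantizing a dequantized value returns the same index, which are the two properties a consistency test would typically assert.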

---

## Level 3 – Psychoacoustic Model & Perceptual Coding

### Goal

Incorporate perceptual modeling to improve compression efficiency.

### Implemented Components

- Psychoacoustic model
- Masking threshold estimation
- TNS (Temporal Noise Shaping)
- Adaptive quantization
- Full encoding/decoding pipeline:
  - `aac_coder_3()`
  - `i_aac_coder_3()`
- Demo:
  - `demo_aac_3()`

### Validation

- Comparison of perceptual quality against Level 2
- Stability tests
- End-to-end evaluation

This level approximates a simplified perceptual AAC-like encoder.
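
The core idea behind TNS, linear prediction applied along the frequency axis of the MDCT coefficients, can be sketched as follows. This is a simplified illustration and not the implementation in `core/`: it assumes NumPy, uses an arbitrary predictor order of 4, and omits the coefficient quantization and spectral normalization that a real encoder would include. All function names are hypothetical.

```python
import numpy as np


def lpc_coefficients(spectrum, order=4):
    """Solve the autocorrelation normal equations R a = r for the predictor a."""
    x = np.asarray(spectrum, dtype=float)
    r = np.array([np.dot(x[:len(x) - lag], x[lag:]) for lag in range(order + 1)])
    if r[0] == 0.0:
        return np.zeros(order)  # silent frame: nothing to predict
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R += 1e-9 * r[0] * np.eye(order)  # small diagonal loading for conditioning
    return np.linalg.solve(R, r[1:])


def tns_forward(spectrum, a):
    """FIR filter: e[k] = X[k] - sum_l a[l] * X[k - l], for l = 1..order."""
    x = np.asarray(spectrum, dtype=float)
    residual = x.copy()
    for lag, coeff in enumerate(a, start=1):
        residual[lag:] -= coeff * x[:-lag]
    return residual


def tns_inverse(residual, a):
    """IIR filter that undoes tns_forward: X[k] = e[k] + sum_l a[l] * X[k - l]."""
    e = np.asarray(residual, dtype=float)
    x = np.zeros_like(e)
    for k in range(len(e)):
        past = sum(c * x[k - lag] for lag, c in enumerate(a, start=1) if k - lag >= 0)
        x[k] = e[k] + past
    return x
```

The encoder side would quantize the residual instead of the raw spectrum, and the decoder would run the inverse filter after dequantization, so a stability test can check that the forward and inverse filters round-trip a spectrum.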

---

# How to Run

All commands assume you are inside:

```
source/
```

---

## Run Level Demo

Navigate to the desired level:

```
cd level_x
```

Run:

```bash
python -m level_x <input.wav> <output.wav>
```

Example:

```bash
python -m level_1 material/LicorDeCalandraca.wav material/LicorDeCalandraca_out.wav
```

The demo prints:

- Overall SNR (dB)
- Processing information
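
For reference, an overall SNR figure of this kind is conventionally the ratio of signal energy to error energy, expressed in dB. The sketch below shows that standard definition; it assumes NumPy and time-aligned signals of equal length, and the function name `snr_db` is hypothetical, so the demo's exact computation may differ (for example in how channels or edge samples are handled).

```python
import numpy as np


def snr_db(reference, decoded):
    """Signal-to-noise ratio in dB between a reference and a decoded signal."""
    reference = np.asarray(reference, dtype=float)
    decoded = np.asarray(decoded, dtype=float)
    noise_energy = np.sum((reference - decoded) ** 2)
    if noise_energy == 0.0:
        return float("inf")  # bit-exact reconstruction
    return 10.0 * np.log10(np.sum(reference ** 2) / noise_energy)
```

A decoded signal whose error has one hundredth of the reference energy yields 20 dB under this definition.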

---

# Running Tests

Tests are written using `pytest`.

A `pytest.ini` file is included in `source/` to ensure proper module resolution.

From inside `source/`:

```bash
pytest -v
```

Run specific level tests:

```bash
pytest -v level_1/tests
pytest -v level_2/tests
pytest -v level_3/tests
```

Run a specific test file:

```bash
pytest -v level_1/tests/test_SSC.py
```

---

# Reproducibility

- Python version: 3.x
- Tests validated with `pytest`
- No external dependencies beyond assignment requirements
- Deterministic pipeline execution

---

# Disclaimer

This project was developed solely for educational purposes as part of the Multimedia course at AUTh.
It is provided **"as is"**, without any express or implied warranties.
The author assumes no responsibility for any misuse, data loss, security incidents, or damages resulting from the use of this software.
This implementation should not be used in production environments.

All work, modifications, and results are the sole responsibility of the author.