Update Readme.md
parent b6dcb80a73
commit a23a1525a2
320
Readme.md
@ -1,114 +1,312 @@
|
||||
# AAC Encoder/Decoder Assignment (Multimedia)
|
||||
# AAC Encoder/Decoder Assignment (Multimedia – AUTh)
|
||||
|
||||
## About
|
||||
## Overview
|
||||
|
||||
This repository contains a staged implementation of a simplified AAC-like audio coder/decoder pipeline, developed in the context of the Multimedia course at Aristotle University of Thessaloniki (AUTh).
|
||||
The project is organized into incremental levels, where each level introduces additional functionality and requirements (e.g., segmentation control, filterbanks, and progressively more complete encoding/decoding stages).
|
||||
The purpose of this work is to implement the specified processing chain faithfully to the assignment specification, validate correctness with structured tests, and maintain a clean, reproducible project structure throughout development.
|
||||
This repository contains a staged implementation of a simplified AAC-like audio encoder/decoder pipeline, developed in the context of the **Multimedia** course at Aristotle University of Thessaloniki (AUTh).
|
||||
|
||||
The project follows a progressive, level-based structure:
|
||||
|
||||
- **Level 1:** Core analysis/synthesis pipeline
|
||||
- **Level 2:** Full transform-domain encoding with quantization
|
||||
- **Level 3:** Psychoacoustic modeling and perceptual coding enhancements
|
||||
|
||||
The goal of this work is to:
|
||||
|
||||
- Faithfully implement the processing chain specified in the assignment
|
||||
- Validate correctness using structured and reproducible tests
|
||||
- Maintain a clean and reproducible project architecture
|
||||
- Ensure separation between development logic and submission packaging
|
||||
|
||||
---
|
||||
|
||||
## System Architecture
|
||||
|
||||
The implemented pipeline follows a simplified AAC-style structure:
|
||||
|
||||
```
|
||||
Input WAV
|
||||
↓
|
||||
SSC (Segmentation Control)
|
||||
↓
|
||||
Filterbank (MDCT)
|
||||
↓
|
||||
[TNS / Psychoacoustic Model] (Level 3)
|
||||
↓
|
||||
Quantization & Coding (Level 2+)
|
||||
↓
|
||||
Bitstream Structuring
|
||||
↓
|
||||
-----------------------------------------
|
||||
↓
|
||||
Inverse Quantization
|
||||
↓
|
||||
Inverse Filterbank (IMDCT)
|
||||
↓
|
||||
OLA Reconstruction
|
||||
↓
|
||||
Output WAV
|
||||
```
|
||||
|
||||
Each level progressively enables more blocks of this pipeline.
|
||||
|
||||
---
|
||||
|
||||
## Repository Structure
|
||||
|
||||
The repository is organized into source code, project requirements and report files:
|
||||
The repository is organized into source code, material, and report files.
|
||||
|
||||
- `source`
|
||||
Under the `source` directory there are:
|
||||
- `level_1` Containing the baseline implementation of the required processing chain for Level 1.
|
||||
- `level_2` Containing the baseline implementation of the required processing chain for Level 2.
|
||||
- `level_3` Containing the baseline implementation of the required processing chain for Level 3.
|
||||
```
|
||||
root/
|
||||
│
|
||||
├── source/
|
||||
│ ├── level_1/
|
||||
│ ├── level_2/
|
||||
│ ├── level_3/
|
||||
│ ├── core/
|
||||
│ └── material/
|
||||
│
|
||||
├── report/
|
||||
├── README.md
|
||||
└── LICENSE
|
||||
```
|
||||
|
||||
Each level contains:
|
||||
- a module file (e.g., `level_1/level_1.py`)
|
||||
- a dedicated `core/` directory
|
||||
- a dedicated `material/` directory
|
||||
### `source/`
|
||||
|
||||
- `core`
|
||||
This directory contains the actual implementation, which is referenced in each one of the `level_x/core/` directories.
|
||||
- `material`
|
||||
This directory contains the helper material files provided with the assignment.
|
||||
Contains all implementation code.
|
||||
|
||||
- `report` Directory that contains the TeX files for the report
|
||||
- `root directory files` Files such as `Readme.md`, `LICENSE`, etc.
|
||||
#### `level_x/`
|
||||
|
||||
Each level directory contains:
|
||||
|
||||
### Notes on Repository structure and Development Workflow
|
||||
One of the project requirements was to deliver `level_x` directories containing all the necessary files, without referencing any other external files and libraries.
|
||||
Satisfying this requirement by copying files into every level is error-prone.
|
||||
To avoid that, we centralized the development of the project inside the `core` directory.
|
||||
Each level directory contains references (hard links) to the files of both the `core` and `material` folders.
|
||||
This way we keep the instructor happy while avoiding the nightmare of code redundancy.
|
||||
- `level_x.py` (main module entry point)
|
||||
- `core/` (hard-links to shared implementation)
|
||||
- `material/` (hard-links to required helper material)
|
||||
- `tests/` (level-specific tests)
|
||||
|
||||
Each level is **self-contained** to satisfy submission requirements.
|
||||
|
||||
## Level Descriptions
|
||||
#### `core/`
|
||||
|
||||
### Level 1
|
||||
This directory contains the centralized implementation of:
|
||||
|
||||
**Goal:** Implement the core analysis/synthesis chain for Level 1 as defined in the assignment specification.
|
||||
- SSC
|
||||
- MDCT / IMDCT filterbank
|
||||
- Quantizer / dequantizer
|
||||
- Psychoacoustic model
|
||||
- TNS
|
||||
- Bitstream handling
|
||||
- Encoder/decoder pipelines
|
||||
|
||||
Implemented components (current status):
|
||||
- SSC (Sequence Segmentation Control)
|
||||
- Filterbank (MDCT analysis) and inverse filterbank (IMDCT synthesis)
|
||||
- End-to-end encoder/decoder functions:
|
||||
All development happens here.
|
||||
|
||||
Each `level_x/core/` directory references these files using **hard links**, ensuring:
|
||||
|
||||
- No code duplication
|
||||
- No synchronization errors
|
||||
- Clean development workflow
|
||||
|
||||
#### `material/`
|
||||
|
||||
Contains helper files provided by the assignment:
|
||||
|
||||
- Sample audio
|
||||
- Reference data
|
||||
- Required constants or auxiliary files
|
||||
|
||||
---
|
||||
|
||||
## Development Workflow Design
|
||||
|
||||
One of the project requirements was to deliver `level_x` directories containing all required files, without referencing external directories.
|
||||
|
||||
Naively copying files across levels would introduce:
|
||||
|
||||
- Code redundancy
|
||||
- High maintenance cost
|
||||
- Risk of inconsistencies
|
||||
- Debugging complexity
|
||||
|
||||
To avoid this:
|
||||
|
||||
- All implementation lives in `source/core/`
|
||||
- Each `level_x` directory contains hard-links to `core/` and `material/`
|
||||
|
||||
This ensures:
|
||||
|
||||
- Single source of truth
|
||||
- Clean modular structure
|
||||
- Instructor-compliant submission format
|
||||
- Safe iterative development
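
A minimal sketch of how such hard links could be created, assuming a filesystem that supports them. The helper name `hard_link_tree` and the paths in the usage note are hypothetical; the repository's actual linking procedure is not shown in this README.

```python
import os


def hard_link_tree(src_dir, dst_dir):
    """Hard-link every file under src_dir into the same relative path under dst_dir.

    Returns the list of link paths that were newly created.
    """
    created = []
    for root, _dirs, files in os.walk(src_dir):
        rel = os.path.relpath(root, src_dir)
        target_root = os.path.normpath(os.path.join(dst_dir, rel))
        os.makedirs(target_root, exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            dst = os.path.join(target_root, name)
            if os.path.exists(dst):
                if os.path.samefile(src, dst):
                    continue  # already a link to the same file
                os.remove(dst)  # stale copy: replace it with a link
            os.link(src, dst)  # both names now refer to the same file data
            created.append(dst)
    return created
```

For example, `hard_link_tree("source/core", "source/level_1/core")` would populate one level directory. Some editors save by writing a new file and renaming it, which silently breaks a hard link, so rerunning a helper like this before packaging a submission may be worthwhile.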
|
||||
|
||||
---
|
||||
|
||||
# Level Descriptions
|
||||
|
||||
---
|
||||
|
||||
## Level 1 – Core Transform Pipeline
|
||||
|
||||
### Goal
|
||||
|
||||
Implement the baseline transform-domain analysis/synthesis chain.
|
||||
|
||||
### Implemented Components
|
||||
|
||||
- Sequence Segmentation Control (SSC)
|
||||
- MDCT analysis filterbank
|
||||
- IMDCT synthesis filterbank
|
||||
- Overlap-Add (OLA) reconstruction
|
||||
- End-to-end encoder/decoder:
|
||||
- `aac_coder_1()`
|
||||
- `i_aac_coder_1()`
|
||||
- Demo function:
|
||||
- Demo:
|
||||
- `demo_aac_1()`
|
||||
|
||||
Tests (current status):
|
||||
- Module-level tests for SSC
|
||||
- Module-level tests for filterbank and inverse filterbank (including OLA-based reconstruction checks)
|
||||
- Internal consistency tests for MDCT/IMDCT
|
||||
- Module-level tests for `aac_coder_1` / `i_aac_coder_1`
|
||||
### Testing Coverage
|
||||
|
||||
### Level 2
|
||||
- SSC unit tests
|
||||
- MDCT / IMDCT correctness tests
|
||||
- Perfect reconstruction validation
|
||||
- OLA consistency tests
|
||||
- Encoder/decoder integration tests
|
||||
|
||||
**Goal:** ...
|
||||
This level ensures transform-domain correctness and signal integrity.
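
The perfect-reconstruction property that these tests target can be illustrated with a small direct-form MDCT/IMDCT pair. This is a sketch under stated assumptions, not the repository's implementation: it uses a sine window, a toy frame length, and an O(N²) transform, and all function names are hypothetical.

```python
import math


def sine_window(frame_len):
    # Symmetric sine window; satisfies w[n]^2 + w[n + N]^2 = 1 (Princen-Bradley).
    return [math.sin(math.pi * (n + 0.5) / frame_len) for n in range(frame_len)]


def mdct(frame):
    # 2N input samples -> N coefficients (direct form, no FFT).
    frame_len = len(frame)
    n_half = frame_len // 2
    n0 = (n_half + 1) / 2.0
    return [
        sum(frame[n] * math.cos(math.pi / n_half * (n + n0) * (k + 0.5))
            for n in range(frame_len))
        for k in range(n_half)
    ]


def imdct(coeffs):
    # N coefficients -> 2N time-aliased samples; the 2/N scaling is the one
    # that matches windowing on both the analysis and synthesis side.
    n_half = len(coeffs)
    n0 = (n_half + 1) / 2.0
    return [
        (2.0 / n_half) * sum(coeffs[k] * math.cos(math.pi / n_half * (n + n0) * (k + 0.5))
                             for k in range(n_half))
        for n in range(2 * n_half)
    ]


def analysis_synthesis(signal, n_half):
    # Window, transform, inverse-transform, window again, then overlap-add
    # with a hop of N samples (50% overlap).
    frame_len = 2 * n_half
    window = sine_window(frame_len)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - frame_len + 1, n_half):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        rebuilt = imdct(mdct(frame))
        for n in range(frame_len):
            out[start + n] += rebuilt[n] * window[n]
    return out
```

Only samples covered by two overlapping frames are reconstructed exactly; the first and last `n_half` samples still carry aliasing unless the signal is padded, which is one reason a reconstruction test would typically compare interior samples only.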
|
||||
|
||||
---
|
||||
|
||||
### Level 3
|
||||
## Level 2 – Quantization and Coding
|
||||
|
||||
**Goal:** ...
|
||||
### Goal
|
||||
|
||||
## How to Run
|
||||
Extend Level 1 by implementing transform-domain quantization and coding.
|
||||
|
||||
In order to run the demo functionality you should be inside the `source/level_x` directory.
|
||||
### Implemented Components
|
||||
|
||||
### Run Level 1 Demo
|
||||
- Scalar quantization
|
||||
- Dequantization
|
||||
- Basic bitstream formatting
|
||||
- Integration into encoder/decoder pipeline:
|
||||
- `aac_coder_2()`
|
||||
- `i_aac_coder_2()`
|
||||
- Demo:
|
||||
- `demo_aac_2()`
|
||||
|
||||
Run the Level 1 demo by providing an input WAV file and an output WAV file:
|
||||
### Validation
|
||||
|
||||
- SNR-based quality evaluation
|
||||
- Consistency tests between quantizer and inverse quantizer
|
||||
- End-to-end reconstruction tests
|
||||
|
||||
This level introduces compression and controlled signal degradation.
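
A short sketch of the kind of check the validation items describe: quantize, dequantize, and measure the resulting SNR. It assumes a plain uniform quantizer with a single step size for illustration; standard AAC uses a non-uniform power-law quantizer with per-band scale factors, and the quantizer actually used in `aac_coder_2()` is not described here. The function names are hypothetical.

```python
import math


def quantize(values, step):
    # Uniform scalar quantizer: map each value to the nearest integer index.
    return [int(round(v / step)) for v in values]


def dequantize(indices, step):
    # Inverse mapping: reconstruct each value at the centre of its quantization cell.
    return [i * step for i in indices]


def snr_db(reference, reconstructed):
    # Overall signal-to-noise ratio in dB between two equal-length sequences.
    signal_power = sum(r * r for r in reference)
    noise_power = sum((r - y) ** 2 for r, y in zip(reference, reconstructed))
    if noise_power == 0.0:
        return math.inf
    if signal_power == 0.0:
        return -math.inf
    return 10.0 * math.log10(signal_power / noise_power)
```

With a quantizer of this form the reconstruction error of each value is bounded by half the step size, so a smaller step raises the SNR at the cost of larger indices to encode.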
|
||||
|
||||
---
|
||||
|
||||
## Level 3 – Psychoacoustic Model & Perceptual Coding
|
||||
|
||||
### Goal
|
||||
|
||||
Incorporate perceptual modeling to improve compression efficiency.
|
||||
|
||||
### Implemented Components
|
||||
|
||||
- Psychoacoustic model
|
||||
- Masking threshold estimation
|
||||
- TNS (Temporal Noise Shaping)
|
||||
- Adaptive quantization
|
||||
- Full encoding/decoding pipeline:
|
||||
- `aac_coder_3()`
|
||||
- `i_aac_coder_3()`
|
||||
- Demo:
|
||||
- `demo_aac_3()`
|
||||
|
||||
### Validation
|
||||
|
||||
- Perceptual improvements compared to Level 2
|
||||
- Stability tests
|
||||
- End-to-end evaluation
|
||||
|
||||
This level approximates a simplified perceptual AAC-like encoder.
|
||||
|
||||
---
|
||||
|
||||
# How to Run
|
||||
|
||||
All commands assume you are inside:
|
||||
|
||||
```
|
||||
source/
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Run Level Demo
|
||||
|
||||
Navigate to the desired level:
|
||||
|
||||
```
|
||||
cd level_x
|
||||
```
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
python -m level_1 <input.wav> <output.wav>
|
||||
python -m level_x <input.wav> <output.wav>
|
||||
```
|
||||
|
||||
Example:
|
||||
|
||||
```bash
|
||||
python -m level_1 material/LicorDeCalandraca.wav material/LicorDeCalandraca_out.wav
|
||||
```
|
||||
The demo prints the overall SNR (in dB) between the original and reconstructed audio.
|
||||
|
||||
### How to Run Tests
|
||||
The demo prints:
|
||||
|
||||
Tests are written for `pytest` and are organized per level.
|
||||
In order to run the demo functionality you should be inside the `source/` directory.
|
||||
- Overall SNR (dB)
|
||||
- Processing information
|
||||
|
||||
The repository includes a `pytest.ini` file inside the `source/` directory.
|
||||
This file explicitly sets the Python module search path so that imports such as the following
|
||||
work consistently when running tests from the command line.
|
||||
---
|
||||
|
||||
# Running Tests
|
||||
|
||||
Tests are written using `pytest`.
|
||||
|
||||
A `pytest.ini` file is included in `source/` to ensure proper module resolution.
|
||||
|
||||
From inside `source/`:
|
||||
|
||||
From inside `source/`, run all tests:
|
||||
```bash
|
||||
pytest -v
|
||||
```
|
||||
|
||||
To run only `level_1/tests` or a specific test file:
|
||||
Run specific level tests:
|
||||
|
||||
```bash
|
||||
pytest -v level_1/tests
|
||||
pytest -v level_2/tests
|
||||
pytest -v level_3/tests
|
||||
```
|
||||
|
||||
Run a specific test file:
|
||||
|
||||
```bash
|
||||
pytest -v level_1/tests/test_SSC.py
|
||||
```
|
||||
|
||||
## Disclaimer
|
||||
---
|
||||
|
||||
This project was developed solely for educational purposes.
|
||||
It is provided "as is", without any express or implied warranties.
|
||||
# Reproducibility
|
||||
|
||||
- Python version: 3.x
|
||||
- Tests validated with `pytest`
|
||||
- No external dependencies beyond assignment requirements
|
||||
- Deterministic pipeline execution
|
||||
|
||||
---
|
||||
|
||||
# Disclaimer
|
||||
|
||||
This project was developed solely for educational purposes as part of the Multimedia course at AUTh.
|
||||
It is provided **"as is"**, without any express or implied warranties.
|
||||
The author assumes no responsibility for any misuse, data loss, security incidents, or damages resulting from the use of this software.
|
||||
This implementation should not be used in production environments.
|
||||
|
||||
|
||||