Composite Pattern in PostgreSQL: Building Polymorphism Data Models for WebScale Apps

4 min readJan 8, 2025

In software engineering, the Composite Pattern is a structural design pattern that allows you to compose objects into tree structures to represent part-whole hierarchies. It provides a uniform interface for interacting with both individual objects and compositions of objects, eliminating redundancy in data and operations.

While the classic composite pattern is typically applied within in-memory programming environments dealing with transient data, many modern applications require handling persistent, data-intensive hierarchies. This is especially true for domains like business analytics, intelligence, and authorization models, where hierarchical, polymorphism data structures are fundamental. We’ll explore implementing the Composite Pattern using different implementation strategies in PostgreSQL.

Intent

The primary goal of the Composite Pattern is to compose entities into tree structures that represent part-whole hierarchies, providing a uniform interface across the hierarchy without redundant data or operations. Unlike traditional implementations that focus on in-memory structures, our approach leverages PostgreSQL’s capabilities to manage hierarchical data persistently, ensuring consistency in data-intensive applications.

Motivation

In data-intensive applications often require aggregating metrics from similar entities and enabling users to build complex data models to define operational semantics.

Consider building a data model for a SaaS storage authorization system. The service offers various subscription tiers — Basic, Premium, and Enterprise — each imposing different limits on bucket sizes and the number of files per tenant. Administrators should also be able to impose similar constraints. Additionally, the system needs to efficiently list files irrespective of whether they reside within a directory, with pricing based on the number of files and their sizes. As a transparent company, providing the cost per file is essential for user trust and clarity.

Storage Authorization Model

File: Represents individual files with attributes like id, size, content_type, tenant_id, and parent_id.
Directory: Inherits from File and includes additional attributes like number_of_files, size_limit, and files_limit. It also provides methods to sum_size() and count_files().

Implementing the Composite Pattern with PostgreSQL Table Inheritance

Different database systems offer varying support for table inheritance. For illustration, we’ll use PostgreSQL, which provides robust inheritance features within its SQL-based engine. This allows us to implement the Composite Pattern effectively for persistent data structures.

Setting Up the Schema

First, we’ll create a schema and define the base and derived tables using PostgreSQL’s inheritance:

DROP SCHEMA IF EXISTS "learn" CASCADE;
CREATE SCHEMA IF NOT EXISTS "learn";

CREATE TABLE "learn"."files" (
    "id" uuid NOT NULL DEFAULT gen_random_uuid() PRIMARY KEY,
    "name" text NOT NULL,
    "size" double precision NOT NULL CHECK (size >= 0) DEFAULT 0,
    "content_type" text NOT NULL DEFAULT 'pdf'::text,
    "cost" double precision DEFAULT 0,
    "tenant_id" integer NOT NULL,
    "parent_id" uuid
);

CREATE TABLE "learn"."directories" (
    "id" uuid NOT NULL DEFAULT gen_random_uuid() PRIMARY KEY,
    "number_of_files" integer DEFAULT 0,
    "files_limit" double precision DEFAULT 10,
    "size_limit" double precision DEFAULT 10
) INHERITS ("learn"."files");

CREATE TABLE "learn"."links" (
    "id" uuid NOT NULL DEFAULT gen_random_uuid() PRIMARY KEY,
    "source_directory" uuid REFERENCES "learn"."directories"(id),
    "source_file" uuid REFERENCES "learn"."files"(id),
    "destination_directory" uuid REFERENCES "learn"."directories"(id),
    "destination_file" uuid REFERENCES "learn"."files"(id),
    "size" double precision NOT NULL CHECK (size = 0) DEFAULT 0
) INHERITS ("learn"."files");

ALTER TABLE "learn"."files" 
ADD CONSTRAINT "files_parent_id_fkey" 
FOREIGN KEY (parent_id) REFERENCES "learn"."directories"(id) NOT VALID;

ALTER TABLE "learn"."directories" 
ADD CONSTRAINT "directories_parent_id_fkey" 
FOREIGN KEY (parent_id) REFERENCES "learn"."files"(id) NOT VALID;

Explanation

Since Postgres doesn’t guarantee sequential or identity primary keys to be unique among all tables in the hierarchy, our table hierarchy requires a unique identifier generated by random numbers. Ultimately, the UUID type is a simple approach in Postgres for generating unique identifiers for this constraint. The base table files represent objects, directories, and links. It includes common attributes like id, name, size, content_type, tenant_id, and parent_id. The directories table inherits from files, thereby acquiring all its columns. Additionally, it introduces number_of_files, files_limit, and size_limit specific to directories. Likewise directories, the links table acquires all file columns. It allows you to bookmark a file or directory in any part of the hierarchy.

Inserting Data

INSERT INTO "learn"."directories" (id, name, tenant_id) 
VALUES ('3f5996b8-b8ac-4934-a664-ab94662ba5b9', 'secrets', 1);

INSERT INTO "learn"."directories" (id, name, tenant_id, parent_id) 
VALUES ('ead5f80b-57c5-4230-b02a-7b4dd9d39c02', 'super_secrets', 1, '3f5996b8-b8ac-4934-a664-ab94662ba5b9');

INSERT INTO "learn"."files" (name, tenant_id, size, parent_id) 
VALUES ('a.pdf', 1, 2, '3f5996b8-b8ac-4934-a664-ab94662ba5b9');

INSERT INTO "learn"."files" (name, tenant_id, size, parent_id) 
VALUES ('b.pdf', 1, 3, '3f5996b8-b8ac-4934-a664-ab94662ba5b9');

INSERT INTO "learn"."files" (name, tenant_id, size, parent_id) 
VALUES ('c.pdf', 1, 4, 'ead5f80b-57c5-4230-b02a-7b4dd9d39c02');

Selecting Data

SELECT id, size, parent_id FROM "learn"."files";

Aggregated Columns: Maintaining Metrics

In our storage authorization model, we need to aggregate metrics like the number of files, total size, and cost. These metrics are essential for enforcing subscription limits and providing transparency to users.

Ensuring that these aggregated columns remain consistent requires careful consideration. PostgreSQL offers several approaches to handle this:

Views and materialized views with recursive CTEs
Functions
Computed Fields (Generated Columns)
Triggers
Background Jobs/Scheduled Tasks

References and Further Reading

Composite Pattern in PostgreSQL: Building Polymorphism Data Models for WebScale Apps

Intent

Motivation

Storage Authorization Model

Implementing the Composite Pattern with PostgreSQL Table Inheritance

Setting Up the Schema

Inserting Data

Selecting Data

Aggregated Columns: Maintaining Metrics

References and Further Reading

Sign up to discover human stories that deepen your understanding of the world.

Free

Membership

Written by Carlos Eduardo Sanchez Torres

No responses yet