Composite Pattern in PostgreSQL: Building Polymorphism Data Models for WebScale Apps
In software engineering, the Composite Pattern is a structural design pattern that allows you to compose objects into tree structures to represent part-whole hierarchies. It provides a uniform interface for interacting with both individual objects and compositions of objects, eliminating redundancy in data and operations.
While the classic composite pattern is typically applied within in-memory programming environments dealing with transient data, many modern applications require handling persistent, data-intensive hierarchies. This is especially true for domains like business analytics, intelligence, and authorization models, where hierarchical, polymorphism data structures are fundamental. We’ll explore implementing the Composite Pattern using different implementation strategies in PostgreSQL.
Intent
The primary goal of the Composite Pattern is to compose entities into tree structures that represent part-whole hierarchies, providing a uniform interface across the hierarchy without redundant data or operations. Unlike traditional implementations that focus on in-memory structures, our approach leverages PostgreSQL’s capabilities to manage hierarchical data persistently, ensuring consistency in data-intensive applications.
Motivation
In data-intensive applications often require aggregating metrics from similar entities and enabling users to build complex data models to define operational semantics.
Consider building a data model for a SaaS storage authorization system. The service offers various subscription tiers — Basic, Premium, and Enterprise — each imposing different limits on bucket sizes and the number of files per tenant. Administrators should also be able to impose similar constraints. Additionally, the system needs to efficiently list files irrespective of whether they reside within a directory, with pricing based on the number of files and their sizes. As a transparent company, providing the cost per file is essential for user trust and clarity.
Storage Authorization Model

- File: Represents individual files with attributes like
id
,size
,content_type
,tenant_id
, andparent_id
. - Directory: Inherits from
File
and includes additional attributes likenumber_of_files
,size_limit
, andfiles_limit
. It also provides methods tosum_size()
andcount_files()
.
Implementing the Composite Pattern with PostgreSQL Table Inheritance
Different database systems offer varying support for table inheritance. For illustration, we’ll use PostgreSQL, which provides robust inheritance features within its SQL-based engine. This allows us to implement the Composite Pattern effectively for persistent data structures.
Setting Up the Schema
First, we’ll create a schema and define the base and derived tables using PostgreSQL’s inheritance:
DROP SCHEMA IF EXISTS "learn" CASCADE;
CREATE SCHEMA IF NOT EXISTS "learn";
CREATE TABLE "learn"."files" (
"id" uuid NOT NULL DEFAULT gen_random_uuid() PRIMARY KEY,
"name" text NOT NULL,
"size" double precision NOT NULL CHECK (size >= 0) DEFAULT 0,
"content_type" text NOT NULL DEFAULT 'pdf'::text,
"cost" double precision DEFAULT 0,
"tenant_id" integer NOT NULL,
"parent_id" uuid
);
CREATE TABLE "learn"."directories" (
"id" uuid NOT NULL DEFAULT gen_random_uuid() PRIMARY KEY,
"number_of_files" integer DEFAULT 0,
"files_limit" double precision DEFAULT 10,
"size_limit" double precision DEFAULT 10
) INHERITS ("learn"."files");
CREATE TABLE "learn"."links" (
"id" uuid NOT NULL DEFAULT gen_random_uuid() PRIMARY KEY,
"source_directory" uuid REFERENCES "learn"."directories"(id),
"source_file" uuid REFERENCES "learn"."files"(id),
"destination_directory" uuid REFERENCES "learn"."directories"(id),
"destination_file" uuid REFERENCES "learn"."files"(id),
"size" double precision NOT NULL CHECK (size = 0) DEFAULT 0
) INHERITS ("learn"."files");
ALTER TABLE "learn"."files"
ADD CONSTRAINT "files_parent_id_fkey"
FOREIGN KEY (parent_id) REFERENCES "learn"."directories"(id) NOT VALID;
ALTER TABLE "learn"."directories"
ADD CONSTRAINT "directories_parent_id_fkey"
FOREIGN KEY (parent_id) REFERENCES "learn"."files"(id) NOT VALID;
Explanation
Since Postgres doesn’t guarantee sequential or identity primary keys to be unique among all tables in the hierarchy, our table hierarchy requires a unique identifier generated by random numbers. Ultimately, the UUID type is a simple approach in Postgres for generating unique identifiers for this constraint. The base table files represent objects, directories, and links. It includes common attributes like id
, name
, size
, content_type
, tenant_id
, and parent_id
. The directories table inherits from files
, thereby acquiring all its columns. Additionally, it introduces number_of_files
, files_limit
, and size_limit
specific to directories. Likewise directories, the links table acquires all file columns. It allows you to bookmark a file or directory in any part of the hierarchy.
Inserting Data
INSERT INTO "learn"."directories" (id, name, tenant_id)
VALUES ('3f5996b8-b8ac-4934-a664-ab94662ba5b9', 'secrets', 1);
INSERT INTO "learn"."directories" (id, name, tenant_id, parent_id)
VALUES ('ead5f80b-57c5-4230-b02a-7b4dd9d39c02', 'super_secrets', 1, '3f5996b8-b8ac-4934-a664-ab94662ba5b9');
INSERT INTO "learn"."files" (name, tenant_id, size, parent_id)
VALUES ('a.pdf', 1, 2, '3f5996b8-b8ac-4934-a664-ab94662ba5b9');
INSERT INTO "learn"."files" (name, tenant_id, size, parent_id)
VALUES ('b.pdf', 1, 3, '3f5996b8-b8ac-4934-a664-ab94662ba5b9');
INSERT INTO "learn"."files" (name, tenant_id, size, parent_id)
VALUES ('c.pdf', 1, 4, 'ead5f80b-57c5-4230-b02a-7b4dd9d39c02');
Selecting Data
SELECT id, size, parent_id FROM "learn"."files";
Aggregated Columns: Maintaining Metrics
In our storage authorization model, we need to aggregate metrics like the number of files, total size, and cost. These metrics are essential for enforcing subscription limits and providing transparency to users.
Ensuring that these aggregated columns remain consistent requires careful consideration. PostgreSQL offers several approaches to handle this:
- Views and materialized views with recursive CTEs
- Functions
- Computed Fields (Generated Columns)
- Triggers
- Background Jobs/Scheduled Tasks