Every platform team eventually faces the same architectural question: how do we serve multiple tenants from a single codebase without one tenant's data leaking into another's experience? The answer seems straightforward until you start building. Then you discover that multi-tenancy is not a single problem. It is a stack of overlapping concerns: data isolation, authentication scoping, authorization boundaries, billing separation, performance isolation, and deployment strategy. Get any one of them wrong and you have either a security incident or an engineering bottleneck.

Most teams over-engineer the solution. They provision separate databases per tenant, deploy separate application instances, and build elaborate routing layers to direct traffic to the correct silo. This approach works. It also triples your infrastructure cost, makes deployments take four times longer, and creates a maintenance burden that scales linearly with your tenant count. When you have five tenants, it is manageable. When you have fifty, it is a full-time job. When you have five hundred, it is a team.

There is a pragmatic middle path: shared infrastructure with database-level isolation. One database. One deployment. One codebase. But with row-level security (RLS) policies that make it physically impossible for one tenant to access another tenant's data. This is the approach we use in production, and it delivers the isolation guarantees of separate databases without the operational overhead of separate deployments.

The Architecture: Shared Database, Isolated Rows

The core principle is simple. Every table that contains tenant-specific data includes a tenant_id column. Every query against that table is automatically filtered by the current tenant's ID. The filtering happens at the database level, not the application level. This means that even if your application code has a bug that forgets to include a WHERE tenant_id = ? clause, the database itself will enforce the filter.

In Supabase (which wraps PostgreSQL), this is implemented through RLS policies. Here is what a typical tenant-isolated table looks like:

-- Create the table with tenant_id as a required column
CREATE TABLE projects (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  tenant_id UUID NOT NULL REFERENCES tenants(id),
  name TEXT NOT NULL,
  status TEXT NOT NULL DEFAULT 'active',
  created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
  updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);

-- Enable RLS on the table
ALTER TABLE projects ENABLE ROW LEVEL SECURITY;

-- Policy: users can only see rows belonging to their tenant
CREATE POLICY "tenant_isolation_select" ON projects
  FOR SELECT
  USING (
    tenant_id = (
      SELECT tenant_id FROM user_profiles
      WHERE user_id = auth.uid()
    )
  );

-- Policy: users can only insert rows for their own tenant
CREATE POLICY "tenant_isolation_insert" ON projects
  FOR INSERT
  WITH CHECK (
    tenant_id = (
      SELECT tenant_id FROM user_profiles
      WHERE user_id = auth.uid()
    )
  );

-- Policy: users can only update their own tenant's rows
CREATE POLICY "tenant_isolation_update" ON projects
  FOR UPDATE
  USING (
    tenant_id = (
      SELECT tenant_id FROM user_profiles
      WHERE user_id = auth.uid()
    )
  )
  WITH CHECK (
    tenant_id = (
      SELECT tenant_id FROM user_profiles
      WHERE user_id = auth.uid()
    )
  );

-- Policy: users can only delete their own tenant's rows
CREATE POLICY "tenant_isolation_delete" ON projects
  FOR DELETE
  USING (
    tenant_id = (
      SELECT tenant_id FROM user_profiles
      WHERE user_id = auth.uid()
    )
  );

-- Index for performance
CREATE INDEX idx_projects_tenant_id ON projects(tenant_id);

The auth.uid() function returns the currently authenticated user's ID from the JWT. The subquery resolves that user ID to a tenant ID through the user_profiles table. Every SELECT, INSERT, UPDATE, and DELETE operation is filtered through this resolution, and there is no way to bypass it from the client side.

RLS policies execute at the database level before any row is returned to the application. This means your application code does not need to remember to filter by tenant_id. The database enforces isolation regardless of what your application does. This is the critical difference between application-level and database-level multi-tenancy: the database is the enforcement boundary, not your code.

Middleware-Based Tenant Resolution

While RLS handles data isolation at the database level, your application still needs to know which tenant the current request belongs to. This is where middleware comes in. In a Next.js application, middleware runs before every request and can inspect headers, cookies, and URL patterns to resolve the current tenant.

There are three common strategies for tenant resolution, each with different tradeoffs:

Strategy 1: Subdomain Resolution

// middleware.ts
import { NextRequest, NextResponse } from 'next/server';

export function middleware(request: NextRequest) {
  // Strip any port before parsing (e.g. "acme.example.com:3000")
  const hostname = (request.headers.get('host') || '').split(':')[0];
  const parts = hostname.split('.');
  const subdomain = parts[0];

  // Skip the bare domain, localhost, and reserved subdomains
  if (parts.length < 3 || subdomain === 'www' || subdomain === 'app') {
    return NextResponse.next();
  }

  // Pass the tenant slug downstream; the data layer maps it
  // to a tenant record
  const response = NextResponse.next();
  response.headers.set('x-tenant-id', subdomain);
  return response;
}

Subdomain resolution is clean and user-friendly. Each tenant gets their own URL: acme.yourplatform.com. The downside is DNS and SSL complexity. You need wildcard DNS records and wildcard SSL certificates, which adds infrastructure requirements.
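The host parsing itself is worth isolating into a pure helper that can be unit-tested on its own. A minimal sketch, assuming a two-level base domain (yourplatform.com) and the same reserved subdomain list as the middleware; both are illustrative assumptions:

```typescript
// Extract the tenant slug from a Host header value, or return
// null when the request targets the bare domain, localhost, or
// a reserved subdomain. The reserved list and the two-level
// base domain are assumptions for this sketch.
const RESERVED_SUBDOMAINS = new Set(['www', 'app']);

function tenantSlugFromHost(host: string): string | null {
  // Strip any port (e.g. "acme.yourplatform.com:3000")
  const hostname = host.split(':')[0];
  const parts = hostname.split('.');

  // "yourplatform.com" or "localhost" carries no tenant slug
  if (parts.length < 3) return null;

  const subdomain = parts[0];
  return RESERVED_SUBDOMAINS.has(subdomain) ? null : subdomain;
}

console.log(tenantSlugFromHost('acme.yourplatform.com:3000')); // "acme"
console.log(tenantSlugFromHost('www.yourplatform.com'));       // null
```

Keeping the parsing in a pure function means the edge cases (ports, bare domains, reserved names) get covered by unit tests rather than discovered in production.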

Strategy 2: JWT Claim Resolution

// middleware.ts
import { createMiddlewareClient } from '@supabase/auth-helpers-nextjs';
import { NextRequest, NextResponse } from 'next/server';

export async function middleware(request: NextRequest) {
  const response = NextResponse.next();
  const supabase = createMiddlewareClient({ req: request, res: response });

  const { data: { session } } = await supabase.auth.getSession();

  if (!session) {
    return NextResponse.redirect(new URL('/login', request.url));
  }

  // Tenant ID is embedded in the JWT during authentication
  const tenantId = session.user.app_metadata?.tenant_id;

  if (!tenantId) {
    return NextResponse.redirect(new URL('/onboarding', request.url));
  }

  response.headers.set('x-tenant-id', tenantId);
  return response;
}

JWT claim resolution is the most secure approach because the tenant ID is cryptographically signed inside the authentication token. It cannot be spoofed by modifying a URL or header. The downside is that changing a user's tenant assignment requires issuing a new JWT, so the change only takes effect after a token refresh or re-authentication.
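For context on what the middleware is reading, here is a sketch that decodes a token's payload segment to show where the claim is carried. This is decoding only, not verification; signature checks are the auth library's job, and the demo token below is fabricated purely for illustration:

```typescript
// Decode (NOT verify) a JWT payload to inspect its claims.
// Never use raw decoding for authorization decisions; rely on
// the auth library's verified session instead.
function decodeJwtPayload(token: string): any {
  const payloadSegment = token.split('.')[1];
  const json = Buffer.from(payloadSegment, 'base64url').toString('utf8');
  return JSON.parse(json);
}

// A throwaway, unsigned token demonstrating the claim shape:
// tenant_id lives under app_metadata, as the middleware reads it.
const demoPayload = {
  sub: 'user-123',
  app_metadata: { tenant_id: 'tenant-abc' },
};
const demoToken = [
  Buffer.from(JSON.stringify({ alg: 'none' })).toString('base64url'),
  Buffer.from(JSON.stringify(demoPayload)).toString('base64url'),
  '', // empty signature segment
].join('.');

console.log(decodeJwtPayload(demoToken).app_metadata.tenant_id); // "tenant-abc"
```

Because app_metadata can only be written server-side (unlike user_metadata), the tenant claim travels with the token and cannot be altered by the client.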

Strategy 3: Header Resolution (API Clients)

// For server-to-server API calls
export function resolveTenant(request: NextRequest): string {
  const tenantId = request.headers.get('x-tenant-id');
  const apiKey = request.headers.get('x-api-key');

  if (!tenantId || !apiKey) {
    throw new Error('Missing tenant identification');
  }

  // IMPORTANT: verify the API key actually belongs to the
  // claimed tenant before trusting this value. Returning the
  // header unverified would allow tenant ID spoofing.
  return tenantId;
}

Header resolution is appropriate for API clients and server-to-server communication where subdomain routing does not apply. The critical requirement is that the tenant ID header must be verified against the authentication credential. Never trust a tenant ID header without verifying that the authenticated caller actually belongs to that tenant.
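A sketch of what that verification can look like, with an in-memory map standing in for a real key store (the map, key format, and function name are illustrative assumptions; production code would look keys up in the database, hashed):

```typescript
// Maps each API key to the tenant that owns it. In production
// this is a database lookup against hashed keys; a Map keeps
// the sketch self-contained.
const apiKeyTenants = new Map<string, string>([
  ['key_live_abc123', 'tenant-a'],
]);

// Honor the claimed x-tenant-id header ONLY if the presented
// API key actually belongs to that tenant.
function resolveVerifiedTenant(
  claimedTenantId: string | null,
  apiKey: string | null
): string {
  if (!claimedTenantId || !apiKey) {
    throw new Error('Missing tenant identification');
  }
  if (apiKeyTenants.get(apiKey) !== claimedTenantId) {
    // Covers both unknown keys and keys owned by another tenant
    throw new Error('API key does not match the claimed tenant');
  }
  return claimedTenantId;
}
```

The key design point: the header is a hint, the credential is the source of truth, and the two must agree before any query runs.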

Tenant-Scoped API Routes

With RLS handling data isolation and middleware resolving the current tenant, your API routes become remarkably clean. They do not need to include tenant filtering logic because the database handles it automatically. Here is what a typical tenant-scoped API route looks like:

// app/api/projects/route.ts
import { createRouteHandlerClient } from '@supabase/auth-helpers-nextjs';
import { cookies } from 'next/headers';
import { NextResponse } from 'next/server';

export async function GET() {
  const supabase = createRouteHandlerClient({ cookies });

  // No WHERE clause for tenant_id needed.
  // RLS policies automatically filter to the
  // current user's tenant.
  const { data, error } = await supabase
    .from('projects')
    .select('id, name, status, created_at')
    .order('created_at', { ascending: false });

  if (error) {
    return NextResponse.json(
      { error: 'Failed to fetch projects' },
      { status: 500 }
    );
  }

  return NextResponse.json({ projects: data });
}

export async function POST(request: Request) {
  const supabase = createRouteHandlerClient({ cookies });
  const body = await request.json();

  // The INSERT policy's WITH CHECK clause validates the
  // tenant_id but does not set it, so we resolve it from the
  // caller's profile. RLS on user_profiles limits this query
  // to the caller's own row, which makes .single() safe.
  const { data: profile } = await supabase
    .from('user_profiles')
    .select('tenant_id')
    .single();

  const { data, error } = await supabase
    .from('projects')
    .insert({
      tenant_id: profile!.tenant_id,
      name: body.name,
      status: 'active'
    })
    .select()
    .single();

  if (error) {
    return NextResponse.json(
      { error: 'Failed to create project' },
      { status: 500 }
    );
  }

  return NextResponse.json({ project: data }, { status: 201 });
}

The GET handler contains no tenant filtering whatsoever. The SELECT query asks for every row in the projects table, but the RLS policy silently restricts the result to rows belonging to the current user's tenant. This is the power of database-level isolation: your application code cannot accidentally leak data because the database will not return it.

Admin vs Client Auth Separation

Multi-tenant platforms typically have two classes of users: tenant users who operate within their own tenant's boundary, and platform administrators who need cross-tenant visibility. These two classes require fundamentally different authorization models.

Tenant users authenticate through the standard Supabase auth flow and receive a JWT with their tenant ID embedded. All their database queries pass through RLS policies. They can never see data outside their tenant, and there is no application-level code path that allows them to do so.

Platform administrators need a different approach. They must be able to query across tenants for support operations, analytics, billing, and platform management. This is where the Supabase service role comes in:

// lib/supabase-admin.ts
import { createClient } from '@supabase/supabase-js';

// CRITICAL: This client bypasses ALL RLS policies.
// It must ONLY be used in server-side admin routes.
// Never import this in client-facing code.
export const adminClient = createClient(
  process.env.SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_ROLE_KEY!,
  {
    auth: {
      autoRefreshToken: false,
      persistSession: false
    }
  }
);

// app/api/admin/tenants/route.ts
import { NextResponse } from 'next/server';
import { adminClient } from '@/lib/supabase-admin';
import { verifyAdminRole } from '@/lib/auth';

export async function GET(request: Request) {
  // Verify the caller is a platform admin
  const admin = await verifyAdminRole(request);
  if (!admin) {
    return NextResponse.json(
      { error: 'Unauthorized' },
      { status: 403 }
    );
  }

  // Service role bypasses RLS, returns ALL tenants
  const { data, error } = await adminClient
    .from('tenants')
    .select(`
      id,
      name,
      plan,
      projects:projects(count)
    `)
    .order('created_at', { ascending: false });

  if (error) {
    return NextResponse.json(
      { error: 'Failed to fetch tenants' },
      { status: 500 }
    );
  }

  return NextResponse.json({ tenants: data });
}

The service role client is the most dangerous object in your codebase. It bypasses every RLS policy on every table. If it leaks into a client-facing route, your entire multi-tenancy model collapses. This is why we enforce a governance rule in CI that verifies service role imports only exist in /admin/ or /server/ directories. The rule is checked on every commit, and the pipeline fails if a violation is detected.

The service role key should be treated with the same security posture as a database root password. It grants unrestricted access to all data across all tenants. Store it only in environment variables, never in client bundles, and audit every file that imports it.
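The CI governance rule mentioned above does not need much machinery. A minimal sketch of the check, assuming files are scanned as path/source pairs (the allowed directories follow this article's convention; the marker string and function name are ours):

```typescript
// Flag any file that imports the service-role client from
// outside the allowed directories. The defining module itself
// (lib/supabase-admin.ts) is exempt.
const ALLOWED_DIRS = ['/admin/', '/server/'];
const IMPORT_MARKER = 'supabase-admin';

function findServiceRoleViolations(files: Map<string, string>): string[] {
  const violations: string[] = [];
  for (const [path, source] of files) {
    if (path.includes('supabase-admin')) continue; // the module itself
    if (!source.includes(IMPORT_MARKER)) continue; // no import present
    if (!ALLOWED_DIRS.some((dir) => path.includes(dir))) {
      violations.push(path);
    }
  }
  return violations;
}
```

Wire this into CI by globbing the source tree into the map and failing the build whenever the returned list is non-empty.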

Migration Strategy for Tenant-Aware Schemas

Adding multi-tenancy to an existing schema requires careful migration planning. The general approach follows three phases: add the column, backfill existing data, then enable RLS.

-- Migration 001: Add tenant_id to existing tables
ALTER TABLE projects
  ADD COLUMN tenant_id UUID REFERENCES tenants(id);

-- Migration 002: Backfill tenant_id for existing rows
-- Map existing users to their tenants first
UPDATE projects p
SET tenant_id = (
  SELECT up.tenant_id
  FROM user_profiles up
  WHERE up.user_id = p.created_by
)
WHERE p.tenant_id IS NULL;

-- Verify nothing was left unmapped before Migration 003:
-- SELECT count(*) FROM projects WHERE tenant_id IS NULL;

-- Migration 003: Make tenant_id NOT NULL after backfill
ALTER TABLE projects
  ALTER COLUMN tenant_id SET NOT NULL;

-- Migration 004: Add index and enable RLS
CREATE INDEX idx_projects_tenant_id
  ON projects(tenant_id);

ALTER TABLE projects ENABLE ROW LEVEL SECURITY;

-- Migration 005: Create RLS policies
CREATE POLICY "tenant_select" ON projects
  FOR SELECT USING (
    tenant_id = (
      SELECT tenant_id FROM user_profiles
      WHERE user_id = auth.uid()
    )
  );

The critical ordering here is: add column, backfill, enforce NOT NULL, create index, enable RLS, then create policies. If you enable RLS before creating policies, all queries will return zero rows (PostgreSQL's default behavior when RLS is enabled with no policies is to deny all access). If you enforce NOT NULL before backfilling, existing rows will violate the constraint. The migration sequence matters.

Performance Considerations

The tenant_id index is not optional. Without it, every RLS policy evaluation requires a sequential scan of the table to find rows matching the current tenant. With the index, PostgreSQL can efficiently filter to the correct tenant before applying any additional query predicates.

For high-traffic tables, consider a composite index that includes tenant_id as the leading column alongside your most common query predicates:

-- Composite index for common query patterns
CREATE INDEX idx_projects_tenant_status
  ON projects(tenant_id, status);

CREATE INDEX idx_projects_tenant_created
  ON projects(tenant_id, created_at DESC);

-- For EXPLAIN analysis, verify the RLS filter
-- uses the index:
EXPLAIN ANALYZE
SELECT * FROM projects
WHERE status = 'active'
ORDER BY created_at DESC
LIMIT 20;

Run EXPLAIN ANALYZE on your most common queries to verify that the RLS policy's tenant filter uses the index rather than a sequential scan. Note that superusers and the table owner bypass RLS (unless FORCE ROW LEVEL SECURITY is set), so run the EXPLAIN as a role that is actually subject to the policies or the tenant filter will not appear in the plan at all. In our production system, adding the composite index reduced p95 query latency from 120ms to 8ms on a table with 200,000 rows across 40 tenants.

Testing Multi-Tenant Isolation

The most important test in a multi-tenant system is the cross-tenant isolation test. This test verifies that data created by one tenant is invisible to another tenant, even when the application code does not explicitly filter by tenant. Here is how we structure these tests:

describe('Multi-tenant isolation', () => {
  let tenantA: TestTenant;
  let tenantB: TestTenant;

  beforeAll(async () => {
    // Create two test tenants with separate users
    tenantA = await createTestTenant('Tenant A');
    tenantB = await createTestTenant('Tenant B');
  });

  it('prevents cross-tenant data access on SELECT', async () => {
    // Tenant A creates a project
    const { data: project } = await tenantA.client
      .from('projects')
      .insert({ name: 'Secret Project', tenant_id: tenantA.id })
      .select()
      .single();

    expect(project).toBeTruthy();

    // Tenant B queries all projects
    const { data: projects } = await tenantB.client
      .from('projects')
      .select('*');

    // Tenant B should NOT see Tenant A's project
    const leaked = projects?.find(p => p.id === project!.id);
    expect(leaked).toBeUndefined();
  });

  it('prevents cross-tenant data modification', async () => {
    // Tenant A creates a project
    const { data: project } = await tenantA.client
      .from('projects')
      .insert({ name: 'Protected Project', tenant_id: tenantA.id })
      .select()
      .single();

    // Tenant B attempts to update Tenant A's project
    const { data: updated } = await tenantB.client
      .from('projects')
      .update({ name: 'Hijacked' })
      .eq('id', project!.id)
      .select();

    // The update matches zero rows because RLS filters
    // Tenant A's row out of Tenant B's view
    expect(updated).toHaveLength(0);

    // Verify the original is unchanged
    const { data: original } = await tenantA.client
      .from('projects')
      .select('name')
      .eq('id', project!.id)
      .single();

    expect(original!.name).toBe('Protected Project');
  });

  it('prevents cross-tenant deletion', async () => {
    // Tenant A creates a project
    const { data: project } = await tenantA.client
      .from('projects')
      .insert({ name: 'Undeletable', tenant_id: tenantA.id })
      .select()
      .single();

    // Tenant B attempts to delete Tenant A's project
    await tenantB.client
      .from('projects')
      .delete()
      .eq('id', project!.id);

    // Verify the project still exists for Tenant A
    const { data: stillExists } = await tenantA.client
      .from('projects')
      .select('id')
      .eq('id', project!.id)
      .single();

    expect(stillExists).toBeTruthy();
  });

  it('prevents tenant_id spoofing on INSERT', async () => {
    // Tenant B attempts to insert a row with Tenant A's ID
    const { error } = await tenantB.client
      .from('projects')
      .insert({
        name: 'Spoofed Project',
        tenant_id: tenantA.id  // Wrong tenant!
      });

    // RLS WITH CHECK should reject this
    expect(error).toBeTruthy();
  });
});

These four tests cover the complete CRUD surface. Every operation that could potentially leak data across tenants is explicitly verified. The tenant_id spoofing test (the final one) is particularly important: it verifies that a user cannot insert data into another tenant's namespace by manually setting the tenant_id field. The RLS WITH CHECK clause on the INSERT policy catches this.

If you have one test in your multi-tenant system, make it the cross-tenant SELECT isolation test. If a tenant can see another tenant's data, nothing else matters. Not your features, not your performance, not your UI. Data isolation is the foundation. Everything else is built on top of it.

Why This Approach Scales Without Slowing You Down

The shared infrastructure model with RLS has a counterintuitive property: it gets more efficient as you add tenants, not less. Each new tenant is a row in the tenants table and a set of RLS-filtered rows in your data tables. There is no new database to provision, no new deployment to configure, no new DNS record to create. The marginal cost of adding a tenant is effectively zero.
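To make "effectively zero" concrete: provisioning in this model amounts to constructing two rows, one in tenants and one in user_profiles linking the owner to it. A sketch of the payload shapes, following this article's schema (the function name is ours):

```typescript
import { randomUUID } from 'node:crypto';

// Build the two rows that constitute a new tenant: the tenant
// record itself and the owner's profile pointing at it. The
// actual inserts go through your database client.
function newTenantRows(name: string, ownerUserId: string) {
  const tenantId = randomUUID();
  return {
    tenant: { id: tenantId, name },
    ownerProfile: { user_id: ownerUserId, tenant_id: tenantId },
  };
}

const rows = newTenantRows('Acme Corp', 'user-123');
console.log(rows.tenant.id === rows.ownerProfile.tenant_id); // true
```

Compare that to provisioning a database, a deployment, and a DNS record per tenant, and the operational argument makes itself.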

Deployments remain simple because there is one codebase and one deployment target. A bug fix ships to all tenants simultaneously. A new feature is available to all tenants the moment it deploys. There is no fleet of instances to coordinate, no blue-green deployment across dozens of isolated environments. You deploy once and everyone gets the update.

The development experience is also dramatically better. Engineers work in a single codebase with a single database. They run one set of migrations, one test suite, one CI pipeline. The cognitive overhead of multi-tenancy is concentrated in the RLS policies and the middleware, not scattered across every query in the application. A new engineer can understand the isolation model by reading five SQL files and one middleware function, not by reverse-engineering a fleet management system.

There are limits to this approach. If tenants have vastly different performance profiles (one tenant generates 100x the traffic of others), you may need to consider connection pooling strategies or read replicas. If tenants require data residency in specific geographic regions, you will need a more sophisticated routing layer. And if tenants need completely independent upgrade cycles, the shared deployment model will not work.

But for the majority of multi-tenant platforms, where tenants share the same features, the same schema, and the same deployment cadence, the shared infrastructure model with RLS delivers the strongest possible isolation guarantee with the lowest possible operational burden. It is not the architecture that looks impressive on a whiteboard. It is the architecture that ships fast and sleeps well.

The best multi-tenancy architecture is the one where you forget it exists during daily development. RLS policies make isolation invisible to your application code, middleware makes tenant resolution automatic, and your API routes look identical to single-tenant routes. The complexity is concentrated in the infrastructure layer where it belongs, not distributed across every feature you build.