Introducing Yellowbrick Test Containers

Aug 17, 2025 · 21 min read · yellowbrick tdd testcontainers

Testing with Confidence: Introducing the Yellowbrick Test Container for Spring Boot

In the world of modern software development, integration testing has become crucial for building reliable applications. When your application depends on a specialized database like Yellowbrick, ensuring your tests run against a real database instance becomes even more important. Today, we're excited to introduce the Yellowbrick Test Container - a powerful testing tool that brings the full capabilities of Yellowbrick Database directly into your Spring Boot test suite.

Test-Driven Development with CI/CD Integration

The Yellowbrick Test Container is most useful as part of a Test-Driven Development (TDD) workflow that runs inside your CI/CD pipeline. The following diagram illustrates how the container builds confidence at each stage, from local development through production deployment:

The Complete TDD Workflow

Development Environment (Local TDD Cycle)

The development process begins with the classic TDD cycle (Red, Green, Refactor), with one crucial difference: instead of testing against mock databases or PostgreSQL substitutes, developers write failing tests against a real Yellowbrick instance running locally via the test container.

// Red: Write a failing test for Yellowbrick-specific functionality
@Test
void shouldDistributeDataCorrectly() {
    // This test initially fails - no table exists yet
    jdbcTemplate.execute("""
        CREATE TABLE user_analytics (
            user_id INTEGER,
            event_count INTEGER,
            last_activity TIMESTAMP
        ) DISTRIBUTE ON (user_id)
    """);

    // Test will fail until implementation is complete
    var result = yellowbrick.executeQuery(
        "SELECT distribution_key FROM sys.table WHERE name = 'user_analytics'"
    );
    assertThat(result.getStdout()).contains("user_id");
}

Green Phase: Developers implement the minimal code to make the test pass, knowing their solution works with actual Yellowbrick distribution strategies, JSON handling, and system tables.

Refactor Phase: Code improvements are validated against the real database, ensuring optimizations don't break Yellowbrick-specific functionality.

CI/CD Pipeline Integration

When developers push code, the CI/CD pipeline automatically:

Builds and Compiles: Standard Maven/Gradle build process
Runs Unit Tests: Fast, isolated tests for business logic
Executes Integration Tests: The critical phase where Yellowbrick Test Container spins up a real database instance
Validates Production Readiness: Security scans and quality checks
Packages and Deploys: Creates artifacts ready for production

The Test Container Advantage in CI

In the CI environment, the test container provides the same benefits as local development:

# GitHub Actions example
name: CI Pipeline
on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v3
    - name: Set up JDK 17
      uses: actions/setup-java@v3
      with:
        distribution: 'temurin'  # setup-java v2 and later require a distribution
        java-version: '17'
    - name: Run Integration Tests
      run: mvn test
      env:
        # Skip Testcontainers' startup checks; the test container handles Yellowbrick setup
        TESTCONTAINERS_CHECKS_DISABLE: true

The container automatically:

Pulls the Yellowbrick Community Edition image
Starts a real database cluster
Executes bootstrap scripts for schema setup
Runs all integration tests against actual Yellowbrick features
Cleans up resources after test completion

Production Confidence

By the time code reaches production, teams have high confidence because:

Distribution Strategies have been tested against real Yellowbrick data distribution
JSON Operations have been validated with actual Yellowbrick JSON support
System Table Queries have been verified against real sys.* tables
SQL Compatibility has been proven with the actual Yellowbrick SQL engine
Performance Characteristics are understood from container testing

Key Benefits of This Workflow

🔄 Continuous Validation: Every code change is validated against real Yellowbrick functionality, not approximations.

⚡ Fast Feedback: Developers get immediate feedback about Yellowbrick compatibility without needing to deploy to staging environments.

🛡️ Risk Reduction: Production deployments have significantly lower risk because database-specific features have been thoroughly tested.

📈 Velocity Increase: Teams can move faster knowing their tests provide accurate validation of production behavior.

🎯 Feature Confidence: New features using advanced Yellowbrick capabilities (analytics functions, distribution strategies, JSON operations) are tested from day one.

This integrated approach transforms database testing from a deployment-time concern into a development-time advantage, enabling teams to build robust applications with confidence in their Yellowbrick integration.

Why Test Containers Matter

The Problem with Traditional Testing Approaches

Traditionally, developers have faced several challenges when testing database-dependent applications:

In-Memory Database Limitations: While H2 or similar in-memory databases are fast, they don't replicate the exact behavior, SQL dialect, or performance characteristics of your production database.
Shared Test Environments: Using a shared test database often leads to test interference, inconsistent state, and the dreaded "works on my machine" syndrome.
Complex Setup: Installing and maintaining local database instances for each developer and CI environment is time-consuming and error-prone.
Version Mismatches: Keeping test environments synchronized with production database versions becomes a maintenance nightmare.

The Test Container Solution

Test containers solve these problems by providing:

Real Database Instances: Your tests run against the actual database you use in production
Isolation: Each test suite gets a fresh, clean database instance
Reproducibility: Consistent behavior across development machines and CI environments
Zero Configuration: No need to install or manage database instances locally
Version Control: Pin exact database versions in your test configuration

Why Not Just Use PostgreSQL Test Container?

While Yellowbrick is PostgreSQL-compatible and uses the PostgreSQL wire protocol, it's important to understand that compatibility doesn't mean identical. Using a PostgreSQL test container as a substitute for Yellowbrick testing is inadvisable for several critical reasons:

Additional Yellowbrick Features Not in PostgreSQL:

Distribution Strategies: Yellowbrick's DISTRIBUTE ON clause for table distribution across nodes
Columnar Storage: Advanced columnar storage optimizations and compression
System Tables: Yellowbrick-specific system tables like sys.cluster, sys.schema, sys.table
Workload Management: Query routing and resource management features
Advanced Analytics: Specialized functions for time-series and analytical workloads

Missing PostgreSQL Features in Yellowbrick:

Certain Extensions: Some PostgreSQL extensions may not be available
Advanced Indexing: Some PostgreSQL index types may not be supported
Stored Procedures: Differences in stored procedure implementations
Replication Features: PostgreSQL-specific replication and streaming features

Behavioral Differences:

Query Optimization: Different query planners and execution strategies
Data Types: Subtle differences in data type handling and precision
Concurrency: Different locking and transaction isolation behaviors
Performance Characteristics: Vastly different performance profiles for analytical vs. transactional workloads

Testing against PostgreSQL when your production system uses Yellowbrick creates a false sense of security and can lead to production issues that weren't caught during testing.

Introducing the Yellowbrick Test Container

The Yellowbrick Test Container extends the popular Testcontainers framework to support Yellowbrick Database Community Edition. This means you can now run comprehensive integration tests against a real Yellowbrick instance without any manual setup.

Key Benefits

🎯 True Compatibility: While Yellowbrick is PostgreSQL-compatible, it has unique features and limitations that PostgreSQL test containers cannot replicate.

🚀 Production-Like Testing: Test against the same database engine, SQL dialect, and features you'll use in production.

🔧 Zero Configuration: Start testing immediately without installing Yellowbrick locally.

🏗️ Spring Boot Integration: Seamless integration with Spring Boot's testing framework and dependency injection.

📊 Yellowbrick-Specific Features: Test distribution strategies, system tables, and other Yellowbrick-specific functionality.

⚡ Automated Lifecycle: Container automatically starts before tests and cleans up afterward.

Setting Up the Yellowbrick Test Container

Important System Requirements and Limitations

⚠️ Platform Compatibility Notice

At this time, the Yellowbrick Test Container has specific system requirements and limitations:

Supported Platforms:

AMD64 (x86_64) architecture only - ARM64/Apple Silicon (M1/M2/M3) is not currently supported
Linux and macOS hosts with AMD64 processors
Windows with WSL2 and AMD64 processors

Docker Requirements:

Docker Desktop v4.38.0 or newer, OR
Docker Engine v26.1.3 or newer
Minimum 12GB RAM allocated to Docker (configure in Docker Desktop settings)
Minimum 6 vCPU cores allocated to Docker

Database Version:

Uses Yellowbrick Community Edition (yellowbrick-ce) container image
May have feature limitations compared to Yellowbrick Enterprise

Resource Requirements: The Yellowbrick database is resource-intensive and requires substantial system resources:

Host System: Recommended 16GB+ total RAM, 8+ CPU cores
Docker Allocation: Must allocate at least 12GB RAM and 6 vCPU to Docker
Disk Space: Several GB for container image and database storage

Checking Your Docker Configuration:

# Check Docker version
docker --version

# Check Docker system info and resource limits
docker system info

# Verify available resources
docker run --rm alpine:latest sh -c 'echo "CPUs: $(nproc), RAM: $(free -h)"'

If your system doesn't meet these requirements, the container may fail to start or experience performance issues during tests.
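The same check can be made from Java before a long test run starts. The sketch below is only an illustration: ResourcePreflight and meetsMinimums are hypothetical names, not part of the Yellowbrick Test Container, and the values it reads describe the machine the JVM runs on. On Docker Desktop the Docker allocation can be smaller than the host, so treat a passing result as necessary rather than sufficient.

```java
import java.lang.management.ManagementFactory;

/** Hypothetical preflight helper; not part of the Yellowbrick Test Container API. */
final class ResourcePreflight {
    // Minimums quoted in the requirements above: 6 vCPU and 12GB RAM for Docker.
    static final int MIN_CPUS = 6;
    static final long MIN_RAM_BYTES = 12L * 1024 * 1024 * 1024;

    /** Returns true when both the CPU count and the RAM size meet the documented minimums. */
    static boolean meetsMinimums(int cpus, long ramBytes) {
        return cpus >= MIN_CPUS && ramBytes >= MIN_RAM_BYTES;
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();
        // com.sun.management.OperatingSystemMXBean is a JDK-specific extension;
        // getTotalMemorySize() requires JDK 14 or newer.
        var os = (com.sun.management.OperatingSystemMXBean)
                ManagementFactory.getOperatingSystemMXBean();
        long ramBytes = os.getTotalMemorySize();

        System.out.printf("CPUs: %d, RAM: %d MB%n", cpus, ramBytes / (1024 * 1024));
        if (!meetsMinimums(cpus, ramBytes)) {
            System.out.println("Host is below the documented minimums; the container may fail to start.");
        }
    }
}
```

A check like this could run in a JUnit assumption so that under-provisioned machines skip the suite instead of timing out.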

Prerequisites

Since the Yellowbrick Test Container is not yet available in Maven Central, you'll need to build it locally first.

Step 1: Clone and Install the Dependency

# Clone the repository (URL to be determined)
git clone [TBD_URL]/yellowbrick-testcontainer.git
cd yellowbrick-testcontainer

# Install to your local Maven repository
mvn install

Step 2: Add Dependency to Your Project

Add the following dependency to your pom.xml:

<dependency>
    <groupId>com.yellowbrick</groupId>
    <artifactId>yellowbrick-testcontainer</artifactId>
    <version>1.0.0-SNAPSHOT</version>
    <scope>test</scope>
</dependency>

You'll also need these supporting dependencies:

<dependencies>
    <!-- Spring Boot Test Starter -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-test</artifactId>
        <scope>test</scope>
    </dependency>

    <!-- Testcontainers JUnit 5 Support -->
    <dependency>
        <groupId>org.testcontainers</groupId>
        <artifactId>junit-jupiter</artifactId>
        <scope>test</scope>
    </dependency>

    <!-- PostgreSQL Driver (Yellowbrick uses PostgreSQL protocol) -->
    <dependency>
        <groupId>org.postgresql</groupId>
        <artifactId>postgresql</artifactId>
        <scope>test</scope>
    </dependency>

    <!-- HikariCP Connection Pool -->
    <dependency>
        <groupId>com.zaxxer</groupId>
        <artifactId>HikariCP</artifactId>
    </dependency>

    <!-- Spring JDBC -->
    <dependency>
        <groupId>org.springframework</groupId>
        <artifactId>spring-jdbc</artifactId>
    </dependency>
</dependencies>

Complete Example: Testing with Yellowbrick

Let's walk through a comprehensive example that demonstrates the power of the Yellowbrick Test Container. This example shows how to test a Spring Boot application with real database operations.

Test Configuration

@SpringBootTest(
    classes = YellowbrickRepositoryTest.TestConfig.class,
    webEnvironment = SpringBootTest.WebEnvironment.NONE,
    properties = {"spring.profiles.active=test"}
)
@Testcontainers
@Timeout(value = 15, unit = TimeUnit.MINUTES) // Yellowbrick needs time to start
@DirtiesContext(classMode = DirtiesContext.ClassMode.AFTER_CLASS)
class YellowbrickRepositoryTest {

    @Container
    static YellowbrickContainer yellowbrick = YellowbrickContainer.create()
            .withLogConsumer(outputFrame ->
                System.out.print("[YELLOWBRICK] " + outputFrame.getUtf8String()));

    @SpringBootConfiguration
    static class TestConfig {
        @Bean
        @Primary
        public DataSource dataSource() {
            HikariDataSource dataSource = new HikariDataSource();
            dataSource.setJdbcUrl(yellowbrick.getJdbcUrl());
            dataSource.setUsername(yellowbrick.getUsername());
            dataSource.setPassword(yellowbrick.getPassword());
            dataSource.setDriverClassName("org.postgresql.Driver");

            // Conservative settings for test environment
            dataSource.setMaximumPoolSize(5);
            dataSource.setConnectionTimeout(60000);
            dataSource.setValidationTimeout(10000);

            return dataSource;
        }

        @Bean
        public JdbcTemplate jdbcTemplate(DataSource dataSource) {
            return new JdbcTemplate(dataSource);
        }
    }

    @DynamicPropertySource
    static void configureProperties(DynamicPropertyRegistry registry) {
        registry.add("spring.datasource.url", yellowbrick::getJdbcUrl);
        registry.add("spring.datasource.username", yellowbrick::getUsername);
        registry.add("spring.datasource.password", yellowbrick::getPassword);
        registry.add("spring.datasource.driver-class-name", yellowbrick::getDriverClassName);
    }
}

Test Setup and Data Preparation

@Autowired
private JdbcTemplate jdbcTemplate; // Injected from the JdbcTemplate bean in TestConfig

@BeforeEach
void setUp() {
    // Wait for Yellowbrick to be fully ready
    yellowbrick.waitUntilYellowbrickReady(Duration.ofMinutes(5));

    // Set up test data
    setupTestData();
}

private void setupTestData() {
    // Drop existing table if present
    jdbcTemplate.execute("DROP TABLE IF EXISTS test_users CASCADE");

    // Create table with Yellowbrick distribution strategy
    jdbcTemplate.execute("""
        CREATE TABLE test_users (
            id INTEGER,
            name VARCHAR(255),
            email VARCHAR(255),
            age INTEGER
        ) DISTRIBUTE ON (id)
    """);

    // Insert sample data
    jdbcTemplate.update(
        "INSERT INTO test_users (id, name, email, age) VALUES (?, ?, ?, ?)",
        1, "John Doe", "john@example.com", 30
    );
    jdbcTemplate.update(
        "INSERT INTO test_users (id, name, email, age) VALUES (?, ?, ?, ?)",
        2, "Jane Smith", "jane@example.com", 25
    );
}

Test Cases

Basic Connectivity Test

@Test
void shouldConnectToYellowbrick() {
    String result = jdbcTemplate.queryForObject("SELECT current_database()", String.class);
    assertThat(result).isNotNull();
    System.out.println("Connected to database: " + result);
}

Yellowbrick-Specific Feature Test

@Test
void shouldQueryYellowbrickVersion() {
    String version = jdbcTemplate.queryForObject("SELECT version()", String.class);
    assertThat(version).containsIgnoringCase("yellowbrick");
    System.out.println("Yellowbrick version: " + version);
}

Data Operations Test

@Test
void shouldFindAllUsers() {
    List<Map<String, Object>> users = jdbcTemplate.queryForList(
        "SELECT * FROM test_users ORDER BY name"
    );

    assertThat(users).hasSize(2);
    assertThat(users.get(0).get("name")).isEqualTo("Jane Smith");
    assertThat(users.get(0).get("email")).isEqualTo("jane@example.com");
    assertThat(users.get(0).get("age")).isEqualTo(25);
}

@Test
void shouldInsertNewUser() {
    int rowsAffected = jdbcTemplate.update(
        "INSERT INTO test_users (id, name, email, age) VALUES (?, ?, ?, ?)",
        3, "Alice Johnson", "alice@example.com", 28
    );

    assertThat(rowsAffected).isEqualTo(1);

    Map<String, Object> insertedUser = jdbcTemplate.queryForMap(
        "SELECT * FROM test_users WHERE id = ?", 3
    );
    assertThat(insertedUser.get("name")).isEqualTo("Alice Johnson");
}

System Tables and Metadata Test

@Test
void shouldExecuteYellowbrickSpecificQueries() {
    // This test would fail with PostgreSQL test container
    // as sys.schema is Yellowbrick-specific
    List<Map<String, Object>> schemas = jdbcTemplate.queryForList(
        "SELECT name FROM sys.schema WHERE name NOT LIKE 'sys%'"
    );

    assertThat(schemas).isNotEmpty();
    System.out.println("Available schemas: " + schemas);
}

Testing Yellowbrick Distribution Strategy

@Test
void shouldTestDistributionStrategy() {
    // Create table with Yellowbrick-specific DISTRIBUTE ON clause
    // This would fail or be ignored in PostgreSQL
    jdbcTemplate.execute("""
        CREATE TABLE distributed_test (
            id INTEGER,
            data VARCHAR(255)
        ) DISTRIBUTE ON (id)
    """);

    // Verify the distribution strategy was applied
    var result = yellowbrick.executeQuery(
        "SELECT distribution_key FROM sys.table WHERE name = 'distributed_test'"
    );

    assertThat(result.getExitCode()).isEqualTo(0);
    assertThat(result.getStdout()).contains("id");
}

Direct ybsql Command Test

@Test
void shouldRunQueryThroughYbsql() throws Exception {
    // executeQuery() runs the statement with the ybsql client inside the container
    var result = yellowbrick.executeQuery("SELECT COUNT(*) FROM test_users");

    assertThat(result.getExitCode()).isEqualTo(0);
    System.out.println("ybsql output: " + result.getStdout());
}

Advanced Configuration Options

The Yellowbrick Test Container provides extensive configuration options:

@Container
static YellowbrickContainer yellowbrick = YellowbrickContainer.create()
    .withDatabaseName("testdb")           // Custom database name
    .withUsername("testuser")             // Custom username
    .withPassword("testpass")             // Custom password
    .withBootstrapData("yellowbrick/sql/init.sql")  // Initialize with SQL script
    .withMemory(16)                       // Set memory limit (GB)
    .withCpuCount(8)                      // Set CPU count
    .withDebugMode(true)                  // Enable debug logging
    .withStartupTimeout(20);              // Custom startup timeout (minutes)

Bootstrap Data Initialization

One of the most powerful features of the Yellowbrick Test Container is the ability to automatically initialize your database with custom SQL scripts using the withBootstrapData() method. This allows you to set up your database schema, enable features, and insert test data before your tests run.

How Bootstrap Initialization Works

The Yellowbrick container includes a sophisticated bootstrap mechanism that executes custom scripts during container startup without requiring image rebuilds. Here's how it works:

Bootstrap Execution Order:

Shell Scripts First: All *.sh scripts are executed in alphabetical order
SQL Scripts Second: All *.sql files are executed in alphabetical order

Default Bootstrap Location: The container automatically looks for bootstrap files in /mnt/bootstrap/ inside the container.

Volume Mount Process: When you use withBootstrapData(), the Testcontainers framework:

Copies your classpath resources to the container
Mounts them to /mnt/bootstrap/
The container's entrypoint automatically discovers and executes them

Naming Strategy for Execution Order: Since scripts execute alphabetically, use numeric prefixes to control execution order:

/mnt/bootstrap/
├── 01_setup_environment.sh    # Executed first
├── 02_configure_cluster.sh    # Executed second
├── 10_create_database.sql     # Executed after all .sh files
├── 20_create_schema.sql       # Executed after 10_create_database.sql
├── 30_create_tables.sql       # Executed after 20_create_schema.sql
└── 90_insert_data.sql         # Executed last
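One consequence of this ordering is easy to miss: the numeric prefix only orders files within the same extension, so a shell script named 50_tune.sh still runs before 10_create_database.sql. The sketch below reproduces the two-pass ordering described above in plain Java. BootstrapOrder and executionOrder are hypothetical names used for illustration, and plain string sorting is an assumption about how the container's entrypoint compares names.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of the documented bootstrap ordering; not the container's own code. */
final class BootstrapOrder {

    /** Returns all *.sh names sorted alphabetically, followed by all *.sql names sorted alphabetically. */
    static List<String> executionOrder(List<String> fileNames) {
        List<String> ordered = new ArrayList<>();
        fileNames.stream().filter(name -> name.endsWith(".sh")).sorted().forEach(ordered::add);
        fileNames.stream().filter(name -> name.endsWith(".sql")).sorted().forEach(ordered::add);
        return ordered;
    }

    public static void main(String[] args) {
        List<String> files = List.of(
                "90_insert_data.sql",
                "10_create_database.sql",
                "50_tune.sh",
                "01_setup_environment.sh");
        // Prints the .sh files first, then the .sql files, each group in name order.
        executionOrder(files).forEach(System.out::println);
    }
}
```

If a shell step must run after a SQL step, the numeric prefix alone will not achieve that under this ordering; the SQL would need to be invoked from the shell script itself.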

Setting Up Bootstrap Scripts

Create your initialization scripts in your test resources directory with proper naming for execution order:

Example Directory Structure:

src/test/resources/yellowbrick/bootstrap/
├── 01_setup_environment.sh
├── 10_create_database.sql
├── 20_create_schema.sql
├── 30_create_tables.sql
└── 40_insert_data.sql

1. Environment Setup Script (01_setup_environment.sh):

#!/bin/bash
echo "Setting up Yellowbrick environment..."

# Set environment variables for subsequent SQL scripts
export PGPASSWORD="$YBPASSWORD"

# Log current cluster status
echo "Checking cluster status..."
ybsql -c "SELECT state FROM sys.cluster;"

# Signal completion in the container log
echo "Yellowbrick environment setup complete"

2. Database Creation (10_create_database.sql):

-- Create UTF8 database for proper character encoding support
CREATE DATABASE test_analytics
WITH ENCODING 'UTF8';

3. Schema Setup (20_create_schema.sql):

-- Connect to the new database
\connect test_analytics;

-- Create schema for analytics data
CREATE SCHEMA IF NOT EXISTS analytics;
CREATE SCHEMA IF NOT EXISTS reporting;

-- Set search path to include our schemas
SET search_path TO analytics, reporting, public;

4. Table Creation (30_create_tables.sql):

-- Create events table; JSON payloads are stored as text, rows are distributed by user
CREATE TABLE analytics.user_events (
    id VARCHAR(36) DEFAULT SYS.GEN_RANDOM_UUID(),
    user_id INTEGER NOT NULL,
    event_type VARCHAR(50) NOT NULL,
    event_data VARCHAR(8000), -- Using VARCHAR for JSON-like data
    timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    session_id VARCHAR(100)
) DISTRIBUTE ON (user_id);

-- Create table for aggregated metrics
CREATE TABLE analytics.daily_metrics (
    metric_date DATE NOT NULL,
    metric_name VARCHAR(100) NOT NULL,
    metric_value NUMERIC(15,2),
    metadata VARCHAR(8000), -- Using VARCHAR for JSON-like data
    PRIMARY KEY (metric_date, metric_name)
) DISTRIBUTE ON (metric_date);

-- Create reporting summary table
CREATE TABLE reporting.user_summary (
    user_id INTEGER PRIMARY KEY,
    total_events INTEGER DEFAULT 0,
    first_seen TIMESTAMP,
    last_seen TIMESTAMP,
    user_tier VARCHAR(20) DEFAULT 'standard'
) DISTRIBUTE ON (user_id);

5. Data Insertion (40_insert_data.sql):

-- Insert sample test data
INSERT INTO analytics.user_events (user_id, event_type, event_data, session_id) VALUES
(1, 'login', '{"source": "web", "browser": "chrome"}', 'session_001'),
(2, 'purchase', '{"amount": 99.99, "currency": "USD", "items": [{"id": 123, "name": "Product A"}]}', 'session_002'),
(1, 'logout', '{"duration_minutes": 45}', 'session_001'),
(3, 'signup', '{"referral": "google", "plan": "premium"}', 'session_003');

-- Insert sample metrics data
INSERT INTO analytics.daily_metrics (metric_date, metric_name, metric_value, metadata) VALUES
('2024-01-01', 'daily_active_users', 1250, '{"calculation_method": "unique_logins"}'),
('2024-01-01', 'total_revenue', 15000.50, '{"currency": "USD", "includes_tax": true}'),
('2024-01-02', 'daily_active_users', 1380, '{"calculation_method": "unique_logins"}'),
('2024-01-02', 'total_revenue', 18500.75, '{"currency": "USD", "includes_tax": true}');

-- Populate reporting summary
INSERT INTO reporting.user_summary (user_id, total_events, first_seen, last_seen, user_tier)
SELECT
    user_id,
    COUNT(*) as total_events,
    MIN(timestamp) as first_seen,
    MAX(timestamp) as last_seen,
    CASE
        WHEN COUNT(*) > 5 THEN 'premium'
        WHEN COUNT(*) > 2 THEN 'standard'
        ELSE 'basic'
    END as user_tier
FROM analytics.user_events
GROUP BY user_id;

-- Grant necessary permissions
GRANT ALL PRIVILEGES ON SCHEMA analytics TO ybdadmin;
GRANT ALL PRIVILEGES ON SCHEMA reporting TO ybdadmin;
GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA analytics TO ybdadmin;
GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA reporting TO ybdadmin;
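Before writing assertions against reporting.user_summary, it helps to work out what the CASE expression yields for the four sample events. The plain-Java sketch below replays the same thresholds over the user_id values inserted above. UserTierExample, tierFor, and tiers are hypothetical names for illustration only and do not touch the database.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical worked example mirroring the CASE expression in 40_insert_data.sql. */
final class UserTierExample {

    /** Same thresholds as the SQL: more than 5 events is premium, more than 2 is standard, else basic. */
    static String tierFor(long eventCount) {
        if (eventCount > 5) {
            return "premium";
        }
        if (eventCount > 2) {
            return "standard";
        }
        return "basic";
    }

    /** Counts events per user id (the GROUP BY) and maps each count to a tier. */
    static Map<Integer, String> tiers(List<Integer> eventUserIds) {
        Map<Integer, Long> counts = new TreeMap<>();
        for (int userId : eventUserIds) {
            counts.merge(userId, 1L, Long::sum);
        }
        Map<Integer, String> result = new TreeMap<>();
        counts.forEach((userId, count) -> result.put(userId, tierFor(count)));
        return result;
    }

    public static void main(String[] args) {
        // user_id column of the four sample rows: login, purchase, logout, signup
        System.out.println(tiers(List.of(1, 2, 1, 3))); // prints {1=basic, 2=basic, 3=basic}
    }
}
```

With this sample data every user lands in the basic tier (user 1 has two events, users 2 and 3 have one each), so a test expecting 'standard' or 'premium' rows would need more events per user.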

Configuring the Container with Bootstrap Data

@Container
static YellowbrickContainer yellowbrick = YellowbrickContainer.create()
    .withBootstrapData("yellowbrick/bootstrap/")    // Point to your bootstrap directory
    .withDatabaseName("test_analytics")             // Match the database created in scripts
    .withLogConsumer(outputFrame ->
        System.out.print("[YELLOWBRICK] " + outputFrame.getUtf8String()));

Alternative: Single File Bootstrap

@Container
static YellowbrickContainer yellowbrick = YellowbrickContainer.create()
    .withBootstrapData("yellowbrick/sql/init.sql")  // Point to single SQL script
    .withDatabaseName("test_analytics");

Bootstrap Execution Process

When the container starts, the following happens:

Mount Phase: Testcontainers copies your classpath resources to /mnt/bootstrap/ in the container
Discovery Phase: The container entrypoint scans /mnt/bootstrap/ for executable files
Shell Execution Phase: All *.sh files are executed in alphabetical order with environment variables available
SQL Execution Phase: All *.sql files are executed in alphabetical order using ybsql
Completion: Container is marked as ready for your tests

Environment Variables Available in Scripts:

YBUSER: Database username (default: ybdadmin)
YBPASSWORD: Database password (default: ybdadmin)
YBDATABASE: Database name (default: yellowbrick)
PGPASSWORD: Set to YBPASSWORD for PostgreSQL tools compatibility
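To make the defaulting rules concrete, here is a small plain-Java sketch of how the four variables resolve when some are unset. BootstrapEnv and effective are hypothetical names, and treating an empty value the same as an unset one is an assumption made for this illustration rather than documented container behavior.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration of the documented defaults; not the container's own code. */
final class BootstrapEnv {

    /** Returns the value for name, or the fallback when it is missing or empty (assumption). */
    static String resolve(Map<String, String> env, String name, String fallback) {
        String value = env.get(name);
        return (value == null || value.isEmpty()) ? fallback : value;
    }

    /** Builds the effective variables a bootstrap script would see, using the defaults listed above. */
    static Map<String, String> effective(Map<String, String> env) {
        Map<String, String> result = new LinkedHashMap<>();
        result.put("YBUSER", resolve(env, "YBUSER", "ybdadmin"));
        result.put("YBPASSWORD", resolve(env, "YBPASSWORD", "ybdadmin"));
        result.put("YBDATABASE", resolve(env, "YBDATABASE", "yellowbrick"));
        // PGPASSWORD mirrors YBPASSWORD so PostgreSQL-style tools can authenticate.
        result.put("PGPASSWORD", result.get("YBPASSWORD"));
        return result;
    }

    public static void main(String[] args) {
        // Only the password is overridden; user and database fall back to their defaults.
        System.out.println(effective(Map.of("YBPASSWORD", "s3cret")));
    }
}
```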

Advanced Bootstrap Scenarios

Conditional Execution in Shell Scripts:

#!/bin/bash
# 01_conditional_setup.sh

# Check if specific table exists before creating
TABLE_EXISTS=$(ybsql -t -c "SELECT COUNT(*) FROM information_schema.tables WHERE table_name='user_events';" | tr -d ' ')
if [[ "$TABLE_EXISTS" == "0" ]]; then
    echo "Creating user_events table..."
    ybsql -f /mnt/bootstrap/30_create_tables.sql
else
    echo "user_events table already exists, skipping creation"
fi

Error Handling in Bootstrap Scripts:

#!/bin/bash
# 02_error_handling.sh

set -e  # Exit on any error

echo "Starting database configuration..."

# Function to handle errors
handle_error() {
    echo "Error occurred in bootstrap script: $1"
    echo "Bootstrap failed at step: $2"
    exit 1
}

# Test database connectivity
ybsql -c "SELECT 1;" || handle_error "Database connectivity test failed" "connectivity_check"

echo "Database bootstrap completed successfully"

Testing with Bootstrap Data

Now your tests can immediately work with the pre-configured database. The Yellowbrick Test Container provides two distinct approaches for executing queries and validating results:

1. JDBC Template Approach - Structured Data Access

The jdbcTemplate uses standard JDBC connections and returns structured Java objects (Lists, Maps, etc.). This approach is ideal for data validation and integration with Spring applications:

@Test
void shouldQueryBootstrapJsonData() {
    // jdbcTemplate returns structured data as Java collections
    List<Map<String, Object>> events = jdbcTemplate.queryForList("""
        SELECT user_id, event_type, event_data, timestamp
        FROM analytics.user_events
        WHERE event_data LIKE '%"source": "web"%'
        ORDER BY timestamp
    """);

    // Data is returned as Java objects for easy assertion
    assertThat(events).hasSize(1);
    assertThat(events.get(0).get("event_type")).isEqualTo("login");

    // Access JSON-like data as String for further processing
    String eventData = (String) events.get(0).get("event_data");
    assertThat(eventData).contains("\"browser\": \"chrome\"");
}

@Test
void shouldQueryAggregatedMetrics() {
    // jdbcTemplate handles complex data types like BigDecimal automatically
    List<Map<String, Object>> metrics = jdbcTemplate.queryForList("""
        SELECT metric_date,
               SUM(CASE WHEN metric_name = 'total_revenue' THEN metric_value ELSE 0 END) as revenue,
               MAX(CASE WHEN metric_name = 'daily_active_users' THEN metric_value ELSE 0 END) as dau
        FROM analytics.daily_metrics
        GROUP BY metric_date
        ORDER BY metric_date
    """);

    assertThat(metrics).hasSize(2);

    // Compare numerically so a different scale (15000.5 vs 15000.50) does not fail the test
    assertThat((BigDecimal) metrics.get(0).get("revenue"))
        .isEqualByComparingTo(new BigDecimal("15000.50"));
    assertThat(((Number) metrics.get(0).get("dau")).intValue()).isEqualTo(1250);
}

2. ybsql Command Approach - Text-Based Results

The yellowbrick.executeQuery() method executes commands directly using Yellowbrick's native ybsql client and returns raw text output. This approach is useful for testing Yellowbrick-specific features and system-level operations:

@Test
void shouldTestYellowbrickDistributionWithBootstrapData() throws Exception {
    // executeQuery() uses ybsql and returns raw text output
    var result = yellowbrick.executeQuery("""
        SELECT t.name, t.distribution_key
        FROM sys.table t, sys.schema s
        WHERE s.schema_id = t.schema_id
        AND s.name = 'analytics'
        ORDER BY t.name
    """);

    // Check command execution success
    assertThat(result.getExitCode()).isEqualTo(0);

    // Parse text output for validation
    String output = result.getStdout();
    assertThat(output).contains("user_events");
    assertThat(output).contains("user_id");
    assertThat(output).contains("daily_metrics");
    assertThat(output).contains("metric_date");
}

@Test
void shouldTestClusterStatus() throws Exception {
    // Use ybsql for Yellowbrick-specific system queries
    var result = yellowbrick.executeQuery("SELECT state FROM sys.cluster;");

    assertThat(result.getExitCode()).isEqualTo(0);
    assertThat(result.getStdout()).contains("RUNNING");

    // ybsql output includes formatting and headers, unlike JDBC
    System.out.println("Cluster status output:");
    System.out.println(result.getStdout());
}

@Test
void shouldExecuteYellowbrickCommands() throws Exception {
    // Execute administrative commands that may not be available via JDBC
    var result = yellowbrick.executeQuery("SHOW TABLES;");

    assertThat(result.getExitCode()).isEqualTo(0);

    // Text output requires string-based validation
    String tables = result.getStdout();
    assertThat(tables).containsIgnoringCase("user_events");
    assertThat(tables).containsIgnoringCase("daily_metrics");
}

Key Differences Between the Two Approaches:

Aspect	jdbcTemplate	yellowbrick.executeQuery()
Connection Method	JDBC driver (PostgreSQL protocol)	Native ybsql command
Return Type	Java objects (List, Map, BigDecimal, etc.)	Raw text string
Data Processing	Automatic type conversion	Manual string parsing required
Use Case	Application integration testing	System-level and admin operations
Yellowbrick Features	Limited to JDBC-compatible features	Full access to Yellowbrick-specific commands
Error Handling	JDBC exceptions	Exit codes + stderr text
Performance	Connection pooling available	Command execution overhead
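When substring checks on the text output become too loose, the output can be parsed into rows first. The sketch below assumes ybsql prints results in the psql-style aligned layout (a header line, a dashed separator, pipe-delimited rows, and a trailing "(N rows)" footer); that layout is an assumption to verify against your own container output. AlignedOutputParser is a hypothetical helper, not part of the test container.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical parser for psql-style aligned output; the format is an assumption about ybsql. */
final class AlignedOutputParser {

    /** Parses header, separator, data rows, and footer into one map per data row. */
    static List<Map<String, String>> parse(String stdout) {
        List<String> lines = new ArrayList<>();
        for (String line : stdout.split("\\R")) {
            if (!line.isBlank()) {
                lines.add(line);
            }
        }
        List<Map<String, String>> rows = new ArrayList<>();
        if (lines.size() < 2) {
            return rows; // No header plus separator, so nothing to parse
        }
        String[] headers = splitCells(lines.get(0));
        // Index 1 is the dashed separator line, so data starts at index 2.
        for (int i = 2; i < lines.size(); i++) {
            String line = lines.get(i);
            if (line.trim().matches("\\(\\d+ rows?\\)")) {
                break; // Footer such as "(2 rows)"
            }
            String[] cells = splitCells(line);
            Map<String, String> row = new LinkedHashMap<>();
            for (int c = 0; c < headers.length; c++) {
                row.put(headers[c], c < cells.length ? cells[c] : "");
            }
            rows.add(row);
        }
        return rows;
    }

    /** Splits on the column delimiter and trims padding; values containing '|' are not supported. */
    private static String[] splitCells(String line) {
        String[] parts = line.split("\\|", -1);
        for (int i = 0; i < parts.length; i++) {
            parts[i] = parts[i].trim();
        }
        return parts;
    }
}
```

A test could then assert on a specific cell, for example that the row named user_events has distribution_key equal to user_id, instead of checking that both strings appear somewhere in the output.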

When to Use Each Approach:

Use jdbcTemplate for:

Testing application data access logic
Validating business data and calculations
Integration with Spring Data/JPA repositories
Complex data type handling (JSON, NUMERIC, TIMESTAMP)
Performance-sensitive operations with connection pooling

Use yellowbrick.executeQuery() for:

Testing Yellowbrick-specific features (distribution, system tables)
Administrative operations and cluster management
Commands not available through JDBC
Debugging and system diagnostics
Testing native Yellowbrick SQL extensions
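
Because yellowbrick.executeQuery() returns raw text, an assertion such as contains("user_id") can also match a column header or an unrelated row. The following is a minimal sketch of stricter parsing. It assumes ybsql prints psql-style aligned tables (a header line, a dashed separator, cells separated by "|", and a "(N rows)" footer); that layout is an assumption, and YbsqlOutputParser is a hypothetical helper, not part of the test container.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: turns psql-style aligned text output into rows keyed by column name. */
final class YbsqlOutputParser {

    private YbsqlOutputParser() {}

    static List<Map<String, String>> parse(String stdout) {
        List<Map<String, String>> rows = new ArrayList<>();
        String[] header = null;

        for (String line : stdout.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.matches("[-+]+")) {
                continue; // skip blank lines and the dashed separator (assumed format)
            }
            if (trimmed.matches("\\(\\d+ rows?\\)")) {
                break; // "(N rows)" footer ends the result set (assumed format)
            }
            String[] cells = splitCells(line);
            if (header == null) {
                header = cells; // first content line holds the column names
                continue;
            }
            Map<String, String> row = new LinkedHashMap<>();
            for (int i = 0; i < header.length; i++) {
                row.put(header[i], i < cells.length ? cells[i] : "");
            }
            rows.add(row);
        }
        return rows;
    }

    private static String[] splitCells(String line) {
        String[] parts = line.split("\\|", -1);
        for (int i = 0; i < parts.length; i++) {
            parts[i] = parts[i].trim();
        }
        return parts;
    }
}
```

With rows parsed this way, a test can assert on a named column of a specific row instead of searching the whole output. Cell values that contain "|" or consist only of dashes would need more careful handling than this sketch provides.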

Bootstrap Data Best Practices

1. Use Numeric Prefixes for Execution Order:

01_setup_environment.sh    # First shell script
02_configure_cluster.sh    # Second shell script
10_create_database.sql     # First SQL script
20_create_schema.sql       # Second SQL script
30_create_tables.sql       # Third SQL script
90_insert_data.sql         # Last SQL script

2. Handle Encoding in Database Creation: Always specify UTF8 encoding for proper internationalization support:

CREATE DATABASE test_db WITH ENCODING 'UTF8' LC_COLLATE='en_US.UTF-8' LC_CTYPE='en_US.UTF-8';

3. Use Shell Scripts for Environment Setup: Handle complex logic, conditionals, and environment configuration in .sh scripts:

#!/bin/bash
# Check Yellowbrick readiness before proceeding
while ! ybsql -c "SELECT state FROM sys.cluster;" | grep -q "RUNNING"; do
    echo "Waiting for Yellowbrick cluster to be ready..."
    sleep 5
done

4. Enable Required Features in Separate Scripts: Include any features your application needs:

-- 15_enable_features.sql
-- Note: Yellowbrick may not support all PostgreSQL extensions
-- Focus on Yellowbrick-native functionality instead

5. Use Yellowbrick Distribution Strategies: Always specify distribution strategies for optimal performance:

CREATE TABLE my_table (...) DISTRIBUTE ON (partition_column);

6. Include Realistic Sample Data: Provide test data that covers your use cases:

-- Use meaningful test data that reflects real-world scenarios
INSERT INTO analytics.user_events (user_id, event_type, event_data) VALUES
(1, 'page_view', '{"page": "/dashboard", "referrer": "direct"}'),
(1, 'feature_click', '{"feature": "export_data", "location": "header"}');

7. Set Proper Permissions: Ensure your test user has necessary permissions:

GRANT ALL PRIVILEGES ON SCHEMA my_schema TO ybdadmin;
GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA my_schema TO ybdadmin;

8. Make Scripts Idempotent: Use IF NOT EXISTS and similar constructs to allow re-running:

CREATE SCHEMA IF NOT EXISTS analytics;
CREATE TABLE IF NOT EXISTS analytics.events (...);

Multiple Bootstrap Files Support

You can bootstrap with directory structures containing both shell and SQL scripts:

@Container
static YellowbrickContainer yellowbrick = YellowbrickContainer.create()
    .withBootstrapData("yellowbrick/bootstrap/")  // Point to directory with mixed script types
    .withDatabaseName("test_analytics");

Recommended Directory Structure:

src/test/resources/yellowbrick/bootstrap/
├── 01_setup_environment.sh      # Environment and cluster validation
├── 02_configure_yellowbrick.sh  # Yellowbrick-specific configuration
├── 10_create_database.sql       # Database creation
├── 20_create_schema.sql         # Schema and extension setup
├── 30_create_tables.sql         # Table definitions with distribution
├── 40_create_views.sql          # Views and computed tables
├── 50_create_functions.sql      # Custom functions (if supported)
└── 90_insert_data.sql           # Sample data insertion

Execution Flow:

01_setup_environment.sh → 02_configure_yellowbrick.sh (all .sh files alphabetically)
10_create_database.sql → 20_create_schema.sql → 30_create_tables.sql → 40_create_views.sql → 50_create_functions.sql → 90_insert_data.sql (all .sql files alphabetically)
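
The ordering rule above can be reproduced in a few lines, which is handy for checking a directory listing before handing it to the container. This is a sketch of the rule as described here, not the container's actual implementation: BootstrapOrder is a hypothetical name, and "alphabetically" is assumed to mean plain String ordering.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper: predicts bootstrap execution order from a list of file names. */
final class BootstrapOrder {

    private BootstrapOrder() {}

    static List<String> executionOrder(List<String> fileNames) {
        List<String> shellScripts = new ArrayList<>();
        List<String> sqlScripts = new ArrayList<>();
        for (String name : fileNames) {
            if (name.endsWith(".sh")) {
                shellScripts.add(name);
            } else if (name.endsWith(".sql")) {
                sqlScripts.add(name);
            }
            // Any other file type is ignored in this sketch (assumption).
        }
        // Assumption: "alphabetically" means natural String order.
        Collections.sort(shellScripts);
        Collections.sort(sqlScripts);

        List<String> ordered = new ArrayList<>(shellScripts); // all .sh files first
        ordered.addAll(sqlScripts);                           // then all .sql files
        return ordered;
    }
}
```

Under plain String ordering, a file named 10_a.sql sorts before 2_b.sql, which is why the recommended structure uses zero-padded, fixed-width prefixes such as 01_ and 02_.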

The bootstrap data feature ensures your tests start with a fully configured Yellowbrick environment, including UTF8 support, JSON capabilities, and realistic test data that matches your production schema. The combination of shell scripts for environment setup and SQL scripts for database structure provides maximum flexibility for complex initialization scenarios.

Best Practices

Performance Optimization

Use Static Containers: Share container instances across test methods using static to avoid restart overhead.
Set Appropriate Timeouts: Yellowbrick requires extended startup time, so configure generous timeouts.
Resource Allocation: Allocate sufficient memory and CPU for optimal performance.
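
Generous timeouts are easier to live with when every wait is also bounded. The sketch below shows one way to poll a readiness condition from Java with an explicit deadline. Await and until are hypothetical names, not part of Testcontainers or the Yellowbrick container; container startup itself is governed by withStartupTimeout.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

/** Hypothetical helper: polls a condition until it holds or a deadline passes. */
final class Await {

    private Await() {}

    /** Returns true as soon as the condition holds, or false once the timeout has elapsed. */
    static boolean until(BooleanSupplier condition, Duration timeout, Duration pollInterval)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // give up instead of waiting forever
            }
            Thread.sleep(pollInterval.toMillis());
        }
    }
}
```

In a test, the condition could wrap the sys.cluster check shown earlier (catching its checked exception) so that a cluster that never reports RUNNING fails the test after a known interval.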

Test Isolation

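Clean Data Between Tests: Use manual cleanup (for example, truncating tables before each test) to ensure test isolation. @DirtiesContext rebuilds the Spring application context but does not by itself remove rows already written to a shared static container.
Transaction Rollback: Consider using @Transactional with rollback for faster cleanup. This covers work done through jdbcTemplate inside the test transaction; statements run through yellowbrick.executeQuery() execute in a separate ybsql session and are not rolled back.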

CI/CD Integration

Docker Requirements: Ensure your CI environment supports Docker and privileged containers.
Resource Limits: Configure appropriate memory and CPU limits for CI environments.
Parallel Execution: Be cautious with parallel test execution due to resource requirements.

Troubleshooting Common Issues

Container Startup Issues

// Add comprehensive logging
@Container
static YellowbrickContainer yellowbrick = YellowbrickContainer.create()
    .withLogConsumer(outputFrame ->
        System.out.print("[YELLOWBRICK] " + outputFrame.getUtf8String()))
    .withStartupTimeout(Duration.ofMinutes(20));

Conclusion

The Yellowbrick Test Container represents a significant step forward in integration testing for applications using Yellowbrick Database. By providing a real database instance in your test environment, you can:

Test with Confidence: Verify your application works with actual Yellowbrick features and behavior
Catch Issues Early: Identify database-specific problems before they reach production
Simplify Development: Eliminate the need for complex local database setups
Improve CI/CD: Create reliable, reproducible test pipelines

Whether you're building analytics applications, data warehouses, or any system that relies on Yellowbrick's powerful capabilities, the Yellowbrick Test Container provides the foundation for robust, reliable testing.

Ready to get started? Clone the repository, install the dependency, and begin testing with confidence today!

The Yellowbrick Test Container is actively developed by the Yellowbrick Spring AI Team. For questions, issues, or contributions, please refer to the project repository.

Posts in this Series