
Dialect Architecture: Separation to Unity

ZhaoYongChun
Maintainers
Hint

This article is generated by AI translation.

As dbVisitor expanded from RDBMS to NoSQL, the dialect system's abstractions became fragmented. This post explains a deep architectural refactoring that unifies dialect metadata and command building into one cohesive design. It changes no functionality; the improvement is purely structural.

Background: Pain Points of the Old Architecture

Before the refactoring, the dialect layer of dbVisitor followed a separation-of-responsibilities principle and consisted of two parallel interface systems:

  1. SqlDialect: Defines the static characteristics and metadata of the database, for example the left and right escape characters, keyword sets, how pagination statements are assembled, and formatting rules for table and column names. It is usually a stateless singleton.
  2. SqlCommandBuilder (and its subclasses, such as MongoCommandBuilder): Dynamically constructs query commands. It holds the context of the query (which columns to SELECT, what the WHERE conditions are) and finally generates a BoundSql. It is a stateful object.

Problems

Although this separation follows the single responsibility principle, it caused clear problems when extending and maintaining the code:

  • Fragmented Abstractions: When adapting to a new database (such as TiDB), implementing a TiDBDialect is easy, but if its SQL syntax is unusual, we may also need to modify the generic SqlCommandBuilder or even derive a new Builder from it. For non-SQL data sources like MongoDB, the situation is worse: we need a dedicated MongoCommandBuilder, and the upper-layer code (such as LambdaTemplate) must hardcode the logic that decides which Builder to instantiate.
  • Cumbersome API Usage: Users or upper-layer frameworks must explicitly perform "pairing" when building queries.
    • MySQL scenario: new SqlCommandBuilder(new MySqlDialect())
    • Mongo scenario: new MongoCommandBuilder(new MongoDialect())
  • Redundant Intermediate Classes: To support NoSQL, we introduced classes like MongoBuilderDialect whose only job was to glue a Dialect and a Builder together, which increased the complexity of the codebase.
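
To make the pairing problem concrete, here is a minimal sketch of the old shape. All types in it (OldDialect, OldMySqlDialect, OldMongoDialect, OldSqlBuilder) are hypothetical, heavily simplified stand-ins invented for illustration; they are not the real dbVisitor classes, and the generated strings are only placeholders.

```java
// Hypothetical stand-ins for the pre-refactoring design; not the real dbVisitor API.
interface OldDialect {
    String quote(String name); // metadata only: how to escape an identifier
}

final class OldMySqlDialect implements OldDialect {
    @Override
    public String quote(String name) { return "`" + name + "`"; }
}

final class OldMongoDialect implements OldDialect {
    @Override
    public String quote(String name) { return name; } // this stand-in does no quoting
}

// The builder holds the query state and receives the dialect separately.
final class OldSqlBuilder {
    private final OldDialect dialect;
    private String table;

    OldSqlBuilder(OldDialect dialect) { this.dialect = dialect; }

    OldSqlBuilder from(String table) { this.table = table; return this; }

    String buildSelect() { return "SELECT * FROM " + dialect.quote(table); }
}

final class OldPairingDemo {
    public static void main(String[] args) {
        // Correct pairing, repeated by hand at every call site.
        System.out.println(new OldSqlBuilder(new OldMySqlDialect()).from("user").buildSelect());
        // Wrong pairing still compiles: a SQL builder driven by Mongo metadata.
        System.out.println(new OldSqlBuilder(new OldMongoDialect()).from("user").buildSelect());
    }
}
```

Nothing in the types ties a builder to the dialect it is meant for, so the caller carries that knowledge.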

Evolution: Dialect as a Factory

The core concept of this refactoring is that the dialect object itself should be the factory of the builder.

If SqlDialect defines "what" the database is (metadata), then the CommandBuilder instance it produces answers "how" (constructing queries).

Core Changes

  1. Introducing Factory Method: We introduced the newBuilder() method in SqlDialect (and its sub-interfaces/abstract classes). Any dialect implementation must have the ability to create a builder capable of understanding that dialect.

  2. Prototype Pattern: We gave the Dialect implementation class a "dual identity":

    • As a metadata object (singleton): Such as MySqlDialect.DEFAULT, stateless, providing common information such as keyword definitions.
    • As a builder object (prototype): When MySqlDialect.DEFAULT.newBuilder() is called, it returns a new MySqlDialect instance (or a specialized inner class instance). This new instance holds the query state (table, where, columns...).
    // Simplified schematic of MySqlDialect after refactoring
    public class MySqlDialect extends AbstractSqlDialect {
        // Metadata definitions...

        @Override
        public SqlCommandBuilder newBuilder() {
            // Return a new instance for building SQL
            return new MySqlDialect();
        }
    }
  3. Reorganization and Simplification of Inheritance Hierarchy: We completely removed the independent SqlCommandBuilder class file and moved its logic down into the abstract base classes. The new hierarchy is as follows:

    • AbstractBuilderDialect: The top-level base class, defining common Builder behaviors.
    • AbstractSqlDialect: (Core) Inherits from the former, implementing the generation logic of standard JDBC SQL (SELECT/UPDATE/INSERT...). All standard SQL databases (MySQL, PG, Oracle, etc.) inherit from this base class.
    • MongoDialect: Directly inherits from AbstractBuilderDialect and internally implements the construction logic for MongoDB BSON. The MongoCommandBuilder and MongoBuilderDialect classes from the old version are removed entirely.
    • AbstractElasticDialect: Provides DSL construction support for Elasticsearch.
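
The changes above can be sketched end to end in a few classes. This is an illustrative sketch under stated assumptions, not the dbVisitor source: every type name (SketchBuilder, SketchBuilderDialect, SketchSqlDialect, SketchMySqlDialect, SketchMongoDialect) and method other than newBuilder() and buildSelect() is hypothetical, and the generated strings are placeholders rather than real BoundSql or BSON output.

```java
// Hypothetical, simplified types that mirror the described hierarchy; not the real dbVisitor source.
interface SketchBuilder {
    SketchBuilder table(String name);
    String buildSelect();
}

// Top-level base: every dialect is also a builder and can create fresh builders.
abstract class SketchBuilderDialect implements SketchBuilder {
    protected String table; // per-query state, only set on instances returned by newBuilder()

    public abstract SketchBuilderDialect newBuilder();

    @Override
    public SketchBuilder table(String name) { this.table = name; return this; }
}

// Shared SQL generation; concrete SQL dialects only supply metadata.
abstract class SketchSqlDialect extends SketchBuilderDialect {
    protected abstract String leftEscape();
    protected abstract String rightEscape();

    @Override
    public String buildSelect() {
        return "SELECT * FROM " + leftEscape() + table + rightEscape();
    }
}

final class SketchMySqlDialect extends SketchSqlDialect {
    static final SketchMySqlDialect DEFAULT = new SketchMySqlDialect(); // stateless metadata singleton

    @Override
    protected String leftEscape() { return "`"; }

    @Override
    protected String rightEscape() { return "`"; }

    @Override
    public SketchMySqlDialect newBuilder() { return new SketchMySqlDialect(); } // fresh, stateful copy
}

// A non-SQL dialect skips the SQL base class and builds its own command text.
final class SketchMongoDialect extends SketchBuilderDialect {
    static final SketchMongoDialect DEFAULT = new SketchMongoDialect();

    @Override
    public SketchMongoDialect newBuilder() { return new SketchMongoDialect(); }

    @Override
    public String buildSelect() { return "db." + table + ".find({})"; }
}

final class DialectFactoryDemo {
    public static void main(String[] args) {
        SketchBuilderDialect[] dialects = { SketchMySqlDialect.DEFAULT, SketchMongoDialect.DEFAULT };
        for (SketchBuilderDialect dialect : dialects) {
            // Same call sequence for every data source; no instanceof, no pairing.
            System.out.println(dialect.newBuilder().table("user").buildSelect());
        }
    }
}
```

Because newBuilder() always returns a new instance, the shared DEFAULT singletons never accumulate query state in this sketch.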

Advantages After Transformation

1. Minimalist and Safe API

For upper-layer callers (such as LambdaTemplate), obtaining a builder becomes uniform and simple. There is no longer any need for instanceof checks, nor for passing a Dialect during construction:

// Old way (pseudocode): logic is scattered and redundant
CommandBuilder builder;
if (dialect instanceof MongoDialect) {
    builder = new MongoCommandBuilder();
} else {
    builder = new SqlCommandBuilder();
}
// The dialect must be associated explicitly, and even passed again at build time, so a mismatch is possible
BoundSql sql = builder.buildSelect(dialect, true);

// New way: unified polymorphism, self-contained
CommandBuilder builder = dialect.newBuilder();
// The builder is itself a Dialect, so no dialect argument is needed and a mismatch cannot occur
BoundSql sql = builder.buildSelect(true);

2. Improved Cohesion

All database-specific logic, from "what are the escape characters" to "how to generate INSERT statements", now lives in the same class (or its parent class). For example, MongoDialect is now a self-contained unit that knows both the Mongo keywords and how to generate Mongo queries.

3. Eliminated Risk of Dialect Mismatch

In the old version, SqlCommandBuilder required a SqlDialect object to be passed in when generating SQL. This was a latent flaw in the API design: in theory, you could create a MongoCommandBuilder but pass it a MySqlDialect, which would lead to runtime errors or nonsensical queries. After the refactoring, the builder is produced directly by the dialect, and methods like buildSelect no longer accept a Dialect parameter. The builder carries its own metadata, so there is no parameter through which the wrong dialect could be passed, and a mismatch is ruled out at compile time.

4. Reduced Code Volume and Improved Maintainability

Through this refactoring, we deleted multiple redundant Builder classes and adapter classes. Test cases have also become more generic: we can write one set of tests against dialect.newBuilder() and run it with different Dialect implementations, verifying only the generated BoundSql string.
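
The "one test suite, many dialects" idea might look roughly like the sketch below. The TestableBuilder and TestableDialect interfaces, the TemplateDialect stand-in, and the DialectContractCheck helper are all assumptions made up for this example; a real suite would run against the actual dbVisitor dialects with a test framework and compare BoundSql output.

```java
// Hypothetical minimal contract; the real dbVisitor builder exposes a richer API.
interface TestableBuilder {
    TestableBuilder table(String name);
    String buildSelect();
}

interface TestableDialect {
    TestableBuilder newBuilder();
}

// Stand-in dialect whose output is simply prefix + table + suffix.
final class TemplateDialect implements TestableDialect {
    private final String prefix;
    private final String suffix;

    TemplateDialect(String prefix, String suffix) {
        this.prefix = prefix;
        this.suffix = suffix;
    }

    @Override
    public TestableBuilder newBuilder() {
        return new TestableBuilder() {
            private String table; // query state lives in the builder, not in the dialect

            @Override
            public TestableBuilder table(String name) { this.table = name; return this; }

            @Override
            public String buildSelect() { return prefix + table + suffix; }
        };
    }
}

final class DialectContractCheck {
    // One shared scenario for every dialect; only the expected text differs.
    static void verify(TestableDialect dialect, String expected) {
        String actual = dialect.newBuilder().table("user").buildSelect();
        if (!actual.equals(expected)) {
            throw new AssertionError("expected [" + expected + "] but got [" + actual + "]");
        }
    }

    public static void main(String[] args) {
        verify(new TemplateDialect("SELECT * FROM `", "`"), "SELECT * FROM `user`");
        verify(new TemplateDialect("db.", ".find({})"), "db.user.find({})");
        System.out.println("all dialects passed");
    }
}
```

The scenario itself never names a concrete builder class, which is what lets the same check run unchanged for each dialect.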

Upgrade Guide

For ordinary users of dbVisitor, this refactoring is completely transparent, and the API remains backward compatible. Advanced users who develop custom Dialects and previously relied on the SqlCommandBuilder class should extend AbstractSqlDialect instead and override the newBuilder() method.