Java Agents, SQL, and the MCP Revolution

I admit it. I was jealous of the Python ecosystem for a solid two years. While I was busy debugging generic type erasure in my enterprise Java apps, the Python folks were having a party with LLMs, agents, and one-liner library imports that seemed to do magic. It felt like if you wanted to build AI, you had to leave the JVM behind.

That feeling is gone.

The gap hasn’t just closed; in some ways, the enterprise stability of Java is actually making it the better choice for what comes next: Agentic AI. I’ve been spending my weekends messing around with LangChain4j and the new Model Context Protocol (MCP), and honestly, the architecture feels a lot more robust—sorry, I mean stable—than the spaghetti code scripts I see floating around in Jupyter notebooks.

Here’s the specific problem I was trying to solve: I wanted an AI agent that could talk to my Oracle Database without me having to hard-code every single SQL query or expose my entire schema insecurely. I wanted the LLM to understand the data structure dynamically.

This is where the Model Context Protocol (MCP) and Oracle’s SQLcl come in. If you haven’t looked at MCP yet, you need to. It’s basically a universal standard for connecting AI models to data sources. Instead of writing a custom integration for every database, you just point the agent at an MCP server.

The Architecture: Why This Matters

The old way of doing “Chat with your Data” was pretty brittle. You’d dump your schema into the system prompt and hope the context window didn’t overflow. Or you’d use a RAG pipeline that just retrieved static chunks of text.

With an MCP architecture, the setup looks like this:

  • The Brain: A Java application running LangChain4j.
  • The Protocol: MCP (Model Context Protocol).
  • The Tool: Oracle SQLcl acting as an MCP Server.
  • The Storage: Oracle Database (specifically the modern versions like 23ai or 26ai with vector support).

The beauty here is that SQLcl creates the bridge. It exposes the database capabilities as “tools” that the LLM can call. The LLM doesn’t just guess SQL; it asks the MCP server “What tables do I have?” or “Run this query for me.”
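Under the hood, that conversation is plain JSON-RPC over stdio. A sketch of the discovery round trip, heavily abridged (real responses also carry an inputSchema per tool, and SQLcl's actual tool names may differ from the ones I've made up here):

```json
{ "jsonrpc": "2.0", "id": 1, "method": "tools/list" }

{ "jsonrpc": "2.0", "id": 1, "result": { "tools": [
    { "name": "run-sql", "description": "Execute a SQL statement against the connected schema" },
    { "name": "describe-table", "description": "Show the columns and types of a table" }
] } }
```

The LLM never sees the wire format; the MCP client turns that tool catalog into function-calling definitions for the model.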

Setting the Stage: The Database Schema


Before we touch the Java code, let’s look at the database side. Since we are in 2026, we aren’t just doing simple SELECT *. We are likely mixing relational data with vector embeddings for semantic search.

I set up a scenario for a customer support agent. I needed a table that held product info but also allowed for "vibe-based" searching (semantic similarity). Here is the SQL I ran to get the schema ready. Note the VECTOR data type: it was introduced in Oracle Database 23ai, so this DDL won't parse on older releases. If you're still on one of those, you really should upgrade.

-- Create a table for products with a vector column for semantic search
CREATE TABLE product_catalog (
    product_id NUMBER GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
    product_name VARCHAR2(100) NOT NULL,
    category VARCHAR2(50),
    description VARCHAR2(1000),
    price NUMBER(10, 2),
    stock_quantity NUMBER,
    -- The embedding vector (assuming 1024 dimensions for this model)
    embedding VECTOR(1024, FLOAT32)
);

-- Insert some dummy data (embeddings would be generated by an embedding model)
INSERT INTO product_catalog (product_name, category, description, price, stock_quantity)
VALUES ('Neural Noise Cancelling Headphones', 'Electronics', 'High-fidelity audio with AI-driven noise cancellation.', 299.99, 50);

INSERT INTO product_catalog (product_name, category, description, price, stock_quantity)
VALUES ('Ergonomic Mechanical Keyboard', 'Accessories', 'Split layout keyboard with tactile brown switches.', 149.50, 120);

COMMIT;

-- Create an In-Memory Neighbor Graph (HNSW) vector index for fast similarity search
CREATE VECTOR INDEX product_embedding_idx 
ON product_catalog (embedding)
ORGANIZATION INMEMORY NEIGHBOR GRAPH
DISTANCE COSINE
WITH TARGET ACCURACY 95;

That VECTOR INDEX is critical. Without it, every similarity search is a full scan of the embedding column when the agent wants to find "headphones that are good for coding." With the index, the optimizer can do an approximate nearest-neighbor lookup instead.
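To see the index earn its keep, you can ask for an approximate top-K explicitly. A minimal sketch; `:query_vec` is a hypothetical bind variable you'd populate with the embedding of the search phrase:

```sql
-- Approximate top-5 similarity search. The APPROX keyword tells the
-- optimizer it may use the HNSW vector index instead of an exact scan.
SELECT product_name, price
FROM product_catalog
ORDER BY VECTOR_DISTANCE(embedding, :query_vec, COSINE)
FETCH APPROX FIRST 5 ROWS ONLY;
```

Drop APPROX and you get an exact (and much slower) ranking, which is occasionally what you want for small tables or correctness checks.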

The Java Implementation

Now for the fun part. We aren’t writing a raw JDBC connector here. We are using LangChain4j to build an agent that connects to the SQLcl MCP server.

I spent a few hours fighting with dependencies (standard Java experience, right?), but once I got the MCP client working, it was surprisingly clean. The idea is to register the MCP client as a tool provider for the AI service.
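For reference, the dependency wrangling boils down to three artifacts. Coordinates are the ones I used; pin `langchain4j.version` to whatever current release you're on, since the MCP module has been moving fast:

```xml
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j</artifactId>
    <version>${langchain4j.version}</version>
</dependency>
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-mcp</artifactId>
    <version>${langchain4j.version}</version>
</dependency>
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-open-ai</artifactId>
    <version>${langchain4j.version}</version>
</dependency>
```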

Here is a stripped-down version of the code that actually worked for me:

package com.example.ai.agent;

import dev.langchain4j.mcp.McpToolProvider;
import dev.langchain4j.mcp.client.DefaultMcpClient;
import dev.langchain4j.mcp.client.McpClient;
import dev.langchain4j.mcp.client.transport.McpTransport;
import dev.langchain4j.mcp.client.transport.stdio.StdioMcpTransport;
import dev.langchain4j.memory.chat.MessageWindowChatMemory;
import dev.langchain4j.model.openai.OpenAiChatModel;
import dev.langchain4j.service.AiServices;

import java.time.Duration;
import java.util.List;

public class DatabaseAgent {

    interface DataAnalyst {
        String answer(String query);
    }

    public static void main(String[] args) {

        // 1. Configure the MCP transport to talk to Oracle SQLcl.
        // Stdio transport, because SQLcl runs locally as a subprocess here.
        McpTransport transport = new StdioMcpTransport.Builder()
                .command(List.of("sql", "-mcp")) // Launch SQLcl in MCP server mode
                .logEvents(true)                 // Log the JSON-RPC traffic (handy for debugging)
                .build();

        McpClient mcpClient = new DefaultMcpClient.Builder()
                .transport(transport)
                .toolExecutionTimeout(Duration.ofSeconds(30))
                .build();

        // 2. Initialize the Chat Model (e.g., GPT-4o or a local model)
        var model = OpenAiChatModel.builder()
                .apiKey(System.getenv("OPENAI_API_KEY"))
                .modelName("gpt-4o")
                .build();

        // 3. Build the AI Service with the MCP tools attached.
        // McpToolProvider pulls the tool list from SQLcl at runtime.
        DataAnalyst analyst = AiServices.builder(DataAnalyst.class)
                .chatLanguageModel(model)
                .chatMemory(MessageWindowChatMemory.withMaxMessages(10))
                .toolProvider(McpToolProvider.builder()
                        .mcpClients(List.of(mcpClient))
                        .build())
                .build();

        // 4. Ask a question that requires database interaction
        String question = "Find the top 3 most expensive electronics and check their stock levels.";

        System.out.println("User: " + question);
        String response = analyst.answer(question);
        System.out.println("Agent: " + response);
    }
}

What’s happening in that code block is actually pretty wild if you think about it. I didn’t define a “get_stock_level” tool. I didn’t write a JDBC query. I just pointed the McpClient at the sql command-line tool running in MCP mode.

When the application runs, LangChain4j asks SQLcl: “Hey, what can you do?” SQLcl replies, “I can list tables, describe schemas, and run SQL queries.” The LLM then figures out, “Okay, to answer the user’s question, I first need to see the table structure, then write a query.”

When the Agent Writes SQL (and why I trust it)

A lot of developers get nervous about letting an AI generate SQL. Valid concern. You don’t want an agent accidentally running DROP TABLE because it hallucinated a cleanup command.


Using SQLcl as the MCP server adds a layer of safety. You can configure the database user that SQLcl connects with to have read-only permissions or specific grants.
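Concretely, I give the agent its own least-privilege account and connect SQLcl as that user. A sketch (user name and password are placeholders):

```sql
-- A dedicated, least-privilege user for the agent's SQLcl connection
CREATE USER agent_reader IDENTIFIED BY "change_me_please";
GRANT CREATE SESSION TO agent_reader;
GRANT SELECT ON product_catalog TO agent_reader;
-- No INSERT/UPDATE/DELETE/DROP grants: a hallucinated "cleanup"
-- statement simply fails with ORA-01031: insufficient privileges
```

The safety net lives in the database, not in the prompt, which is exactly where I want it.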

Here is an example of the interaction log I saw when I ran the code above. The agent didn’t just guess; it explored.

  1. Tool Call: list_tables()
  2. Tool Output: PRODUCT_CATALOG, ORDERS, CUSTOMERS
  3. Tool Call: describe_table("PRODUCT_CATALOG")
  4. Tool Output: Columns: PRODUCT_NAME, CATEGORY, PRICE, STOCK_QUANTITY
  5. Tool Call: run_query("SELECT product_name, price, stock_quantity FROM product_catalog WHERE category = 'Electronics' ORDER BY price DESC FETCH FIRST 3 ROWS ONLY")

It figured out the Oracle dialect (using FETCH FIRST instead of LIMIT) because the MCP server context likely hinted at the environment.

Hybrid Search: The “Killer App” for Java Agents

The real power comes when you combine traditional SQL filters with that vector column we created earlier. I modified my agent prompt to handle semantic queries, and it started generating hybrid SQL queries automatically.

For a query like “Find me cheap accessories that feel premium,” the agent generated something like this:

SELECT product_name, price, description
FROM product_catalog
WHERE category = 'Accessories' 
  AND price < 50
ORDER BY VECTOR_DISTANCE(embedding, :user_query_vector, COSINE)
FETCH FIRST 5 ROWS ONLY;

Note: the agent can fill in :user_query_vector if you wire up an embedding model provider in LangChain4j to embed the user's phrase and pass the resulting vector into the query as a bind value.

Why I'm Sticking with Java for This

I’ve tried building this same flow in Python frameworks. It works, sure. But the dependency management is a nightmare, and deploying it to production always feels like rolling the dice.

With Java, LangChain4j, and Oracle's MCP server, I have type safety, a mature build system (Maven/Gradle), and a database that handles the heavy lifting of vector search natively. I don't need a separate vector database like Pinecone or Weaviate. It's all just... there. In the database I already know how to manage.

If you're a Java developer sitting on the sidelines thinking AI is a Python-only club, wake up. The tools are here, and frankly, they might be better than what the other side is using.