The Model Context Protocol (MCP) is an open standard that enables AI agents to interact with external data sources and tools. Squirrels provides a built-in MCP server that allows AI agents to discover and query datasets in your project.

Overview

When you run the Squirrels API server, it automatically starts an MCP server. This server exposes your datasets and parameters as tools and resources that an AI agent (like Claude or ChatGPT) can use to explore and analyze your data.

Resources

Resources are data entities that the AI agent can read. The Squirrels MCP server exposes the following resource:

sqrl://data-catalog

Provides the details of all datasets and parameters that the current user has access to. Reading this resource at the start of a conversation helps the AI agent understand the structure of the project.
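For illustration, a minimal `resources/read` request for this resource might look like the following. The JSON-RPC 2.0 shape comes from the MCP specification; the transport and request `id` are incidental details chosen for the sketch:

```python
import json

# A JSON-RPC 2.0 "resources/read" request, as defined by the MCP spec,
# asking the Squirrels MCP server for the data catalog resource.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "resources/read",
    "params": {"uri": "sqrl://data-catalog"},
}

print(json.dumps(request, indent=2))
```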

Tools

Tools are functions that the AI agent can call. Each tool name is prefixed with get_ and suffixed with _from_{project_name}. The response structure of each tool matches the response structure of the corresponding REST API. When running a Squirrels API server, you can find the Swagger documentation at the path /project/{project_name}/v{project_version}/docs.
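As a sketch of the naming convention, assuming a hypothetical project named `sales` at version 1:

```python
# Sketch of the tool naming convention described above, using a
# hypothetical project name "sales" and version 1.
project_name = "sales"
project_version = 1

bases = ("get_data_catalog", "get_dataset_parameters", "get_dataset_results")
tool_names = [f"{base}_from_{project_name}" for base in bases]
# e.g. the first entry is "get_data_catalog_from_sales"

# The Swagger docs for the corresponding REST APIs live at:
docs_path = f"/project/{project_name}/v{project_version}/docs"
print(tool_names, docs_path)
```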

get_data_catalog

Used to retrieve the data catalog. This tool provides the same information as the sqrl://data-catalog resource but as a tool call.

get_dataset_parameters

Used to get updates for dataset parameters when a selection is made on a parameter with trigger_refresh: true. This is used for cascading parameters.

get_dataset_results

Used to retrieve the results of a dataset based on parameter selections.
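A hedged sketch of what a `tools/call` request for this tool might look like. The JSON-RPC envelope follows the MCP specification, but the dataset name and the argument keys shown here (`dataset`, `parameters`, `offset`, `limit`) are illustrative assumptions, not taken from the Squirrels API reference:

```python
import json

# Hypothetical MCP "tools/call" request for a project named "sales" and a
# dataset named "monthly_revenue". Argument names are assumptions made
# for illustration only.
request = {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {
        "name": "get_dataset_results_from_sales",
        "arguments": {
            "dataset": "monthly_revenue",
            "parameters": {"region": "emea"},
            "offset": 0,
            "limit": 10,
        },
    },
}
print(json.dumps(request, indent=2))
```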

Dataset results behavior

The get_dataset_results tool returns a result containing two main fields: content and structuredContent.

Content vs Structured content

  • content: This field contains a text representation of the dataset result. It always respects the offset and limit arguments. This is what the large language model (LLM) typically “sees” (as input tokens) and uses for its response.
  • structuredContent: This field contains the raw JSON data of the result, which can be used for further processing (via code execution, for instance). By default, it also respects offset and limit; however, this behavior can be modified by feature flags.
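As a sketch of how a client application might consume such a tool result, preferring the raw JSON over the text preview (the field names follow the MCP tool-result shape; the fallback parsing is an assumption, not Squirrels behavior):

```python
def extract_rows(tool_result: dict):
    """Prefer the raw JSON in structuredContent; otherwise fall back to
    the text representation in content. A client-side sketch only."""
    structured = tool_result.get("structuredContent")
    if structured is not None:
        return structured  # raw JSON data, suitable for code execution
    # Otherwise, only the text preview the LLM sees is available.
    return [
        block["text"]
        for block in tool_result.get("content", [])
        if block.get("type") == "text"
    ]

# Example with a stub result:
result = {
    "content": [{"type": "text", "text": "col_a | col_b\n1 | 2"}],
    "structuredContent": {"rows": [{"col_a": 1, "col_b": 2}]},
}
print(extract_rows(result))
```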

Feature flag: mcp-full-dataset-v1

When the mcp-full-dataset-v1 feature flag is sent in the request headers (e.g., by a client that supports it), the structuredContent field will contain the full dataset result, ignoring the offset and limit arguments. The content field will still be paginated based on offset and limit, allowing the LLM to see a preview of the data while the client application can access the full dataset for further processing (like chart rendering or code execution).
Even when using this feature flag, AI agents can still paginate the structuredContent result using the OFFSET and LIMIT clauses in the sql_query argument.
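For illustration, an agent could page through the full result by embedding OFFSET and LIMIT clauses in the sql_query argument. The table name used below is a placeholder assumption; the actual queryable name depends on the project:

```python
# Sketch: paginating via the sql_query argument using LIMIT/OFFSET.
# "dataset" is a placeholder table name, not a Squirrels convention.
page_size = 500

def page_query(page: int) -> str:
    offset = page * page_size
    return f"SELECT * FROM dataset LIMIT {page_size} OFFSET {offset}"

print(page_query(0))
print(page_query(2))
```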

Environment variables

SQRL_DATASETS__MAX_ROWS_FOR_AI

This environment variable controls the maximum number of rows that the MCP server will return for AI tools.
  • Default value: 100
  • Purpose: Prevents the LLM from being overwhelmed by large datasets and protects against excessive token usage.
  • Enforcement: If a tool call specifies a limit greater than this value, an error is returned. The default limit for the tool is also derived from this value (capped at 10).
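The enforcement rules above can be sketched as follows. This is a simplified model of the behavior described, not the actual Squirrels implementation:

```python
MAX_ROWS_FOR_AI = 100  # default of SQRL_DATASETS__MAX_ROWS_FOR_AI

def resolve_limit(requested_limit=None):
    """Simplified model of the limit rules described above."""
    if requested_limit is None:
        # The default limit derives from the env var, capped at 10.
        return min(MAX_ROWS_FOR_AI, 10)
    if requested_limit > MAX_ROWS_FOR_AI:
        raise ValueError(
            f"limit {requested_limit} exceeds SQRL_DATASETS__MAX_ROWS_FOR_AI"
            f" ({MAX_ROWS_FOR_AI})"
        )
    return requested_limit

print(resolve_limit(), resolve_limit(50))
```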

SQRL_DATASETS__MAX_ROWS_OUTPUT

This environment variable controls the maximum number of rows that the MCP server can return in structuredContent for the get_dataset_results tool when the mcp-full-dataset-v1 feature flag is enabled.
  • Default value: 100,000
  • Purpose: Prevents excessive server memory usage and ensures results are reasonable to send over HTTP, even when the full dataset is requested by the client.
  • Enforcement: This limit is applied to the structuredContent field. If the result exceeds this limit, an error is returned.
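Similarly, a simplified model of this check (again, not the actual implementation):

```python
MAX_ROWS_OUTPUT = 100_000  # default of SQRL_DATASETS__MAX_ROWS_OUTPUT

def check_structured_rows(rows: list) -> list:
    """Error out if the full structuredContent result exceeds the cap."""
    if len(rows) > MAX_ROWS_OUTPUT:
        raise ValueError(
            f"result has {len(rows)} rows, exceeding"
            f" SQRL_DATASETS__MAX_ROWS_OUTPUT ({MAX_ROWS_OUTPUT})"
        )
    return rows

print(check_structured_rows([{"x": 1}]))
```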
See the Environment variables page for more details on these environment variables.