Skip to content

API Reference

Tools

MCP BigQuery provides eleven comprehensive tools for SQL validation, analysis, and schema discovery.

bq_validate_sql

Validates BigQuery SQL syntax without executing the query.

Input Schema:

{
  "type": "object",
  "properties": {
    "sql": {
      "type": "string",
      "description": "The SQL query to validate"
    },
    "params": {
      "type": "object",
      "description": "Optional query parameters",
      "additionalProperties": true
    }
  },
  "required": ["sql"]
}

Parameters:

Parameter Type Required Description
sql string Yes The SQL query to validate
params object No Query parameters as key-value pairs

Response (Success):

{
  "isValid": true
}

Response (Error):

{
  "isValid": false,
  "error": {
    "code": "INVALID_SQL",
    "message": "Error description",
    "location": {
      "line": 1,
      "column": 15
    }
  }
}

bq_dry_run_sql

Performs a dry-run analysis to estimate costs and preview schema.

Input Schema:

{
  "type": "object",
  "properties": {
    "sql": {
      "type": "string",
      "description": "The SQL query to analyze"
    },
    "params": {
      "type": "object",
      "description": "Optional query parameters",
      "additionalProperties": true
    },
    "pricePerTiB": {
      "type": "number",
      "description": "Price per TiB for cost estimation",
      "default": 5.0
    }
  },
  "required": ["sql"]
}

Parameters:

Parameter Type Required Default Description
sql string Yes - The SQL query to analyze
params object No {} Query parameters
pricePerTiB number No 5.0 USD price per TiB

Response (Success):

{
  "totalBytesProcessed": 1073741824,
  "usdEstimate": 0.005,
  "referencedTables": [
    {
      "project": "project-id",
      "dataset": "dataset_name",
      "table": "table_name"
    }
  ],
  "schemaPreview": [
    {
      "name": "column_name",
      "type": "STRING",
      "mode": "NULLABLE",
      "description": "Column description"
    }
  ]
}

Response (Error):

{
  "error": {
    "code": "INVALID_SQL",
    "message": "Error description",
    "details": []
  }
}

bq_analyze_query_structure

Analyzes SQL query structure and complexity.

Input Schema:

{
  "type": "object",
  "properties": {
    "sql": {
      "type": "string",
      "description": "The SQL query to analyze"
    },
    "params": {
      "type": "object",
      "description": "Optional query parameters"
    }
  },
  "required": ["sql"]
}

Response:

{
  "query_type": "SELECT",
  "has_joins": true,
  "has_subqueries": false,
  "has_cte": true,
  "has_aggregations": true,
  "has_window_functions": false,
  "table_count": 2,
  "complexity_score": 25,
  "join_types": ["LEFT", "INNER"],
  "functions_used": ["COUNT", "SUM"]
}

bq_extract_dependencies

Extracts table and column dependencies from SQL.

Response:

{
  "tables": [
    {
      "project": "my-project",
      "dataset": "my_dataset",
      "table": "my_table",
      "full_name": "my-project.my_dataset.my_table"
    }
  ],
  "columns": ["id", "name", "created_at"],
  "dependency_graph": {
    "my_dataset.my_table": ["id", "name", "created_at"]
  },
  "table_count": 1,
  "column_count": 3
}

bq_validate_query_syntax

Enhanced syntax validation with detailed error reporting.

Response:

{
  "is_valid": true,
  "issues": [
    {
      "type": "performance",
      "message": "SELECT * may impact performance",
      "severity": "warning"
    }
  ],
  "suggestions": [
    "Specify exact columns needed instead of using SELECT *"
  ],
  "bigquery_specific": {
    "uses_legacy_sql": false,
    "has_array_syntax": false,
    "has_struct_syntax": true
  }
}

bq_list_datasets

Lists all datasets in the BigQuery project.

Input Schema:

{
  "type": "object",
  "properties": {
    "project_id": {
      "type": "string",
      "description": "GCP project ID (uses default if not provided)"
    },
    "max_results": {
      "type": "integer",
      "description": "Maximum number of datasets to return"
    }
  }
}

Response:

{
  "project": "my-project",
  "dataset_count": 2,
  "datasets": [
    {
      "dataset_id": "analytics",
      "project": "my-project",
      "location": "US",
      "created": "2024-01-01T00:00:00",
      "modified": "2024-06-01T00:00:00",
      "description": "Analytics data",
      "labels": {"env": "production"}
    }
  ]
}

bq_list_tables

Lists all tables in a BigQuery dataset.

Input Schema:

{
  "type": "object",
  "properties": {
    "dataset_id": {
      "type": "string",
      "description": "The dataset ID"
    },
    "project_id": {
      "type": "string",
      "description": "GCP project ID"
    },
    "max_results": {
      "type": "integer"
    },
    "table_type_filter": {
      "type": "array",
      "items": {"type": "string"},
      "description": "Filter by table types (TABLE, VIEW, EXTERNAL, MATERIALIZED_VIEW)"
    }
  },
  "required": ["dataset_id"]
}

Response:

{
  "dataset_id": "analytics",
  "project": "my-project",
  "table_count": 3,
  "tables": [
    {
      "table_id": "users",
      "table_type": "TABLE",
      "num_bytes": 1048576,
      "num_rows": 10000,
      "partitioning": {
        "type": "DAY",
        "field": "created_at"
      },
      "clustering_fields": ["user_id"]
    }
  ]
}

bq_describe_table

Gets detailed table schema and metadata.

Input Schema:

{
  "type": "object",
  "properties": {
    "table_id": {"type": "string"},
    "dataset_id": {"type": "string"},
    "project_id": {"type": "string"},
    "format_output": {
      "type": "boolean",
      "description": "Format schema as table"
    }
  },
  "required": ["table_id", "dataset_id"]
}

Response:

{
  "table_id": "users",
  "dataset_id": "analytics",
  "table_type": "TABLE",
  "schema": [
    {
      "name": "user_id",
      "type": "INTEGER",
      "mode": "REQUIRED",
      "description": "Unique identifier"
    },
    {
      "name": "address",
      "type": "RECORD",
      "mode": "NULLABLE",
      "fields": [
        {
          "name": "street",
          "type": "STRING",
          "mode": "NULLABLE"
        }
      ]
    }
  ],
  "statistics": {
    "num_bytes": 1048576,
    "num_rows": 10000
  }
}

bq_get_table_info

Gets comprehensive table information.

Response:

{
  "table_id": "users",
  "dataset_id": "analytics",
  "full_table_id": "my-project.analytics.users",
  "table_type": "TABLE",
  "created": "2024-01-15T00:00:00",
  "statistics": {
    "num_bytes": 1048576,
    "num_rows": 10000,
    "num_active_logical_bytes": 786432
  },
  "time_travel": {
    "max_time_travel_hours": 168
  },
  "partitioning": {
    "type": "DAY",
    "time_partitioning": {
      "field": "created_at",
      "require_partition_filter": false
    }
  }
}

bq_query_info_schema

Queries INFORMATION_SCHEMA views for metadata.

Input Schema:

{
  "type": "object",
  "properties": {
    "query_type": {
      "type": "string",
      "enum": ["tables", "columns", "table_storage", "partitions", "views", "routines", "custom"]
    },
    "dataset_id": {"type": "string"},
    "project_id": {"type": "string"},
    "table_filter": {"type": "string"},
    "custom_query": {"type": "string"},
    "limit": {"type": "integer", "default": 100}
  },
  "required": ["query_type", "dataset_id"]
}

Response:

{
  "query_type": "columns",
  "dataset_id": "analytics",
  "query": "SELECT ... FROM INFORMATION_SCHEMA.COLUMNS ...",
  "schema": [
    {
      "name": "column_name",
      "type": "STRING",
      "mode": "NULLABLE"
    }
  ],
  "metadata": {
    "total_bytes_processed": 1024,
    "estimated_cost_usd": 0.000005
  }
}

bq_analyze_query_performance

Analyzes query performance and provides optimization suggestions.

Input Schema:

{
  "type": "object",
  "properties": {
    "sql": {
      "type": "string",
      "description": "The SQL query to analyze"
    },
    "project_id": {"type": "string"}
  },
  "required": ["sql"]
}

Response:

{
  "query_analysis": {
    "bytes_processed": 107374182400,
    "gigabytes_processed": 100.0,
    "estimated_cost_usd": 0.5,
    "table_count": 1
  },
  "performance_score": 65,
  "performance_rating": "GOOD",
  "optimization_suggestions": [
    {
      "type": "SELECT_STAR",
      "severity": "MEDIUM",
      "message": "Query uses SELECT *",
      "recommendation": "Select only needed columns"
    }
  ]
}

Response Schemas

Error Object

All error responses follow this structure:

interface Error {
  code: "INVALID_SQL" | "AUTHENTICATION_ERROR" | "PERMISSION_DENIED";
  message: string;
  location?: {
    line: number;
    column: number;
  };
  details?: Array<{
    reason: string;
    location: string;
    message: string;
  }>;
}

Table Reference

interface TableReference {
  project: string;
  dataset: string;
  table: string;
}

Schema Field

interface SchemaField {
  name: string;
  type: "STRING" | "INT64" | "FLOAT64" | "BOOL" | "TIMESTAMP" | "DATE" | "TIME" | "DATETIME" | "GEOGRAPHY" | "NUMERIC" | "BIGNUMERIC" | "BYTES" | "STRUCT" | "ARRAY" | "JSON";
  mode: "NULLABLE" | "REQUIRED" | "REPEATED";
  description?: string;
  fields?: SchemaField[];  // For STRUCT types
}

Error Codes

Code Description Common Causes
INVALID_SQL SQL syntax or semantic error Typos, invalid references, wrong syntax
AUTHENTICATION_ERROR Failed to authenticate with BigQuery Missing credentials, expired tokens
PERMISSION_DENIED Insufficient permissions Missing IAM roles, dataset access

Parameter Types

Due to BigQuery dry-run limitations, all parameters are treated as STRING type. Use explicit casting in your queries:

-- Correct way to use numeric parameters
SELECT * FROM orders 
WHERE amount > CAST(@min_amount AS FLOAT64)
  AND created_date = CAST(@date AS DATE)

Rate Limits

MCP BigQuery inherits BigQuery's API quotas:

Quota Limit Scope
API requests 100/second Per project
Concurrent API requests 300 Per project
Query complexity 50 MB uncompressed Per query

Authentication

The server uses Google Cloud Application Default Credentials (ADC). See Authentication for setup details.

Environment Variables

Variable Description Default
BQ_PROJECT Default GCP project ID From ADC
BQ_LOCATION Default dataset location None
SAFE_PRICE_PER_TIB Default price per TiB 5.0
GOOGLE_APPLICATION_CREDENTIALS Path to service account key None

Usage Examples

Basic Validation

# Python example using MCP client
result = client.call_tool(
    "bq_validate_sql",
    {
        "sql": "SELECT * FROM dataset.table"
    }
)

if result["isValid"]:
    print("Query is valid")
else:
    print(f"Error: {result['error']['message']}")

Cost Estimation

result = client.call_tool(
    "bq_dry_run_sql",
    {
        "sql": "SELECT * FROM large_table",
        "pricePerTiB": 6.0  # Custom pricing
    }
)

print(f"Estimated cost: ${result['usdEstimate']:.2f}")
print(f"Bytes to scan: {result['totalBytesProcessed']:,}")

Parameter Validation

result = client.call_tool(
    "bq_validate_sql",
    {
        "sql": "SELECT * FROM users WHERE id = @user_id",
        "params": {"user_id": "12345"}
    }
)

Limitations

  1. Parameters are STRING only - Cast to appropriate types in SQL
  2. No query execution - All operations are dry-run only
  3. No data returned - Only metadata and cost estimates
  4. Cache disabled - Always returns fresh estimates
  5. No UDF creation - Can validate UDF usage but not create them

Next Steps