API Reference¶

Tools¶

MCP BigQuery provides eleven comprehensive tools for SQL validation, analysis, and schema discovery.

bq_validate_sql¶

Validates BigQuery SQL syntax without executing the query.

Input Schema:

{
  "type": "object",
  "properties": {
    "sql": {
      "type": "string",
      "description": "The SQL query to validate"
    },
    "params": {
      "type": "object",
      "description": "Optional query parameters",
      "additionalProperties": true
    }
  },
  "required": ["sql"]
}

Parameters:

Parameter	Type	Required	Description
`sql`	string	Yes	The SQL query to validate
`params`	object	No	Query parameters as key-value pairs

Response (Success):

{
  "isValid": true
}

Response (Error):

{
  "isValid": false,
  "error": {
    "code": "INVALID_SQL",
    "message": "Error description",
    "location": {
      "line": 1,
      "column": 15
    }
  }
}

bq_dry_run_sql¶

Performs a dry-run analysis to estimate costs and preview schema.

Input Schema:

{
  "type": "object",
  "properties": {
    "sql": {
      "type": "string",
      "description": "The SQL query to analyze"
    },
    "params": {
      "type": "object",
      "description": "Optional query parameters",
      "additionalProperties": true
    },
    "pricePerTiB": {
      "type": "number",
      "description": "Price per TiB for cost estimation",
      "default": 5.0
    }
  },
  "required": ["sql"]
}

Parameters:

Parameter	Type	Required	Default	Description
`sql`	string	Yes	-	The SQL query to analyze
`params`	object	No	{}	Query parameters
`pricePerTiB`	number	No	5.0	USD price per TiB

Response (Success):

{
  "totalBytesProcessed": 1073741824,
  "usdEstimate": 0.005,
  "referencedTables": [
    {
      "project": "project-id",
      "dataset": "dataset_name",
      "table": "table_name"
    }
  ],
  "schemaPreview": [
    {
      "name": "column_name",
      "type": "STRING",
      "mode": "NULLABLE",
      "description": "Column description"
    }
  ]
}

Response (Error):

{
  "error": {
    "code": "INVALID_SQL",
    "message": "Error description",
    "details": []
  }
}

bq_analyze_query_structure¶

Analyzes SQL query structure and complexity.

Input Schema:

{
  "type": "object",
  "properties": {
    "sql": {
      "type": "string",
      "description": "The SQL query to analyze"
    },
    "params": {
      "type": "object",
      "description": "Optional query parameters"
    }
  },
  "required": ["sql"]
}

Response:

{
  "query_type": "SELECT",
  "has_joins": true,
  "has_subqueries": false,
  "has_cte": true,
  "has_aggregations": true,
  "has_window_functions": false,
  "table_count": 2,
  "complexity_score": 25,
  "join_types": ["LEFT", "INNER"],
  "functions_used": ["COUNT", "SUM"]
}

bq_extract_dependencies¶

Extracts table and column dependencies from SQL.

Response:

{
  "tables": [
    {
      "project": "my-project",
      "dataset": "my_dataset",
      "table": "my_table",
      "full_name": "my-project.my_dataset.my_table"
    }
  ],
  "columns": ["id", "name", "created_at"],
  "dependency_graph": {
    "my_dataset.my_table": ["id", "name", "created_at"]
  },
  "table_count": 1,
  "column_count": 3
}

bq_validate_query_syntax¶

Enhanced syntax validation with detailed error reporting.

Response:

{
  "is_valid": true,
  "issues": [
    {
      "type": "performance",
      "message": "SELECT * may impact performance",
      "severity": "warning"
    }
  ],
  "suggestions": [
    "Specify exact columns needed instead of using SELECT *"
  ],
  "bigquery_specific": {
    "uses_legacy_sql": false,
    "has_array_syntax": false,
    "has_struct_syntax": true
  }
}

bq_list_datasets¶

Lists all datasets in the BigQuery project.

Input Schema:

{
  "type": "object",
  "properties": {
    "project_id": {
      "type": "string",
      "description": "GCP project ID (uses default if not provided)"
    },
    "max_results": {
      "type": "integer",
      "description": "Maximum number of datasets to return"
    }
  }
}

Response:

{
  "project": "my-project",
  "dataset_count": 2,
  "datasets": [
    {
      "dataset_id": "analytics",
      "project": "my-project",
      "location": "US",
      "created": "2024-01-01T00:00:00",
      "modified": "2024-06-01T00:00:00",
      "description": "Analytics data",
      "labels": {"env": "production"}
    }
  ]
}

bq_list_tables¶

Lists all tables in a BigQuery dataset.

Input Schema:

id=__codelineno-12-1 name=__codelineno-12-1 href=#__codelineno-12-1>{ "type": "object", "properties": { "dataset_id": { "type": "string", "description": "The dataset ID" }, "project_id": { "type": "string", "description": "GCP project ID" }, "max_results": { "type": "integer" }, "table_type_filter": { "type": "array", "items": {"type": "string"}, "description": "Filter by table types (TABLE, VIEW, EXTERNAL, MATERIALIZED_VIEW)" } }, "required": ["dataset_id"] class=p>}

Response:

{
  "dataset_id": "analytics",
  "project": "my-project",
  "table_count": 3,
  "tables": [
    {
      "table_id": "users",
      "table_type": "TABLE",
      "num_bytes": 1048576,
      "num_rows": 10000,
      "partitioning": {
        "type": "DAY",
        "field": "created_at"
      },
      "clustering_fields": ["user_id"]
    }
  ]
}

bq_describe_table¶

Gets detailed table schema and metadata.

Input Schema:

{
  "type": "object",
  "properties": {
    "table_id": {"type": "string"},
    "dataset_id": {"type": "string"},
    "project_id": {"type": "string"},
    "format_output": {
      "type": "boolean",
      "description": "Format schema as table"
    }
  },
  "required": ["table_id", "dataset_id"]
}

Response:

{
  "table_id": "users",
  "dataset_id": "analytics",
  "table_type": "TABLE",
  "schema": [
    {
      "name": "user_id",
      "type": "INTEGER",
      "mode": "REQUIRED",
      "description": "Unique identifier"
    },
    {
      "name": "address",
      "type": "RECORD",
      "mode": "NULLABLE",
      "fields": [
        {
          "name": "street",
          "type": "STRING",
          "mode": "NULLABLE"
        }
      ]
    }
  ],
  "statistics": {
    "num_bytes": 1048576,
    "num_rows": 10000
  }
}

bq_get_table_info¶

Gets comprehensive table information.

Response:

{
  "table_id": "users",
  "dataset_id": "analytics",
  "full_table_id": "my-project.analytics.users",
  "table_type": "TABLE",
  "created": "2024-01-15T00:00:00",
  "statistics": {
    "num_bytes": 1048576,
    "num_rows": 10000,
    "num_active_logical_bytes": 786432
  },
  "time_travel": {
    "max_time_travel_hours": 168
  },
  "partitioning": {
    "type": "DAY",
    "time_partitioning": {
      "field": "created_at",
      "require_partition_filter": false
    }
  }
}

bq_query_info_schema¶

Queries INFORMATION_SCHEMA views for metadata.

Input Schema:

{
  "type": "object",
  "properties": {
    "query_type": {
      "type": "string",
      "enum": ["tables", "columns", "table_storage", "partitions", "views", "routines", "custom"]
    },
    "dataset_id": {"type": "string"},
    "project_id": {"type": "string"},
    "table_filter": {"type": "string"},
    "custom_query": {"type": "string"},
    "limit": {"type": "integer", "default": 100}
  },
  "required": ["query_type", "dataset_id"]
}

Response:

{
  "query_type": "columns",
  "dataset_id": "analytics",
  "query": "SELECT ... FROM INFORMATION_SCHEMA.COLUMNS ...",
  "schema": [
    {
      "name": "column_name",
      "type": "STRING",
      "mode": "NULLABLE"
    }
  ],
  "metadata": {
    "total_bytes_processed": 1024,
    "estimated_cost_usd": 0.000005
  }
}

bq_analyze_query_performance¶

Analyzes query performance and provides optimization suggestions.

Input Schema:

{
  "type": "object",
  "properties": {
    "sql": {
      "type": "string",
      "description": "The SQL query to analyze"
    },
    "project_id": {"type": "string"}
  },
  "required": ["sql"]
}

Response:

{
  "query_analysis": {
    "bytes_processed": 107374182400,
    "gigabytes_processed": 100.0,
    "estimated_cost_usd": 0.5,
    "table_count": 1
  },
  "performance_score": 65,
  "performance_rating": "GOOD",
  "optimization_suggestions": [
    {
      "type": "SELECT_STAR",
      "severity": "MEDIUM",
      "message": "Query uses SELECT *",
      "recommendation": "Select only needed columns"
    }
  ]
}

Response Schemas¶

Error Object¶

All error responses follow this structure:

interface Error {
  code: "INVALID_SQL" | "AUTHENTICATION_ERROR" | "PERMISSION_DENIED";
  message: string;
  location?: {
    line: number;
    column: number;
  };
  details?: Array<{
    reason: string;
    location: string;
    message: string;
  }>;
}

Table Reference¶

interface TableReference {
  project: string;
  dataset: string;
  table: string;
}

Schema Field¶

interface SchemaField {
  name: string;
  type: "STRING" | "INT64" | "FLOAT64" | "BOOL" | "TIMESTAMP" | "DATE" | "TIME" | "DATETIME" | "GEOGRAPHY" | "NUMERIC" | "BIGNUMERIC" | "BYTES" | "STRUCT" | "ARRAY" | "JSON";
  mode: "NULLABLE" | "REQUIRED" | "REPEATED";
  description?: string;
  fields?: SchemaField[];  // For STRUCT types
}

Error Codes¶

Code	Description	Common Causes
`INVALID_SQL`	SQL syntax or semantic error	Typos, invalid references, wrong syntax
`AUTHENTICATION_ERROR`	Failed to authenticate with BigQuery	Missing credentials, expired tokens
`PERMISSION_DENIED`	Insufficient permissions	Missing IAM roles, dataset access

Parameter Types¶

Due to BigQuery dry-run limitations, all parameters are treated as STRING type. Use explicit casting in your queries:

-- Correct way to use numeric parameters
SELECT * FROM orders 
WHERE amount > CAST(@min_amount AS FLOAT64)
  AND created_date = CAST(@date AS DATE)

Rate Limits¶

MCP BigQuery inherits BigQuery's API quotas:

Quota	Limit	Scope
API requests	100/second	Per project
Concurrent API requests	300	Per project
Query complexity	50 MB uncompressed	Per query

Authentication¶

The server uses Google Cloud Application Default Credentials (ADC). See Authentication for setup details.

Environment Variables¶

Variable	Description	Default
`BQ_PROJECT`	Default GCP project ID	From ADC
`BQ_LOCATION`	Default dataset location	None
`SAFE_PRICE_PER_TIB`	Default price per TiB	5.0
`GOOGLE_APPLICATION_CREDENTIALS`	Path to service account key	None

Usage Examples¶

Basic Validation¶

# Python example using MCP client
result = client.call_tool(
    "bq_validate_sql",
    {
        "sql": "SELECT * FROM dataset.table"
    }
)

if result["isValid"]:
    print("Query is valid")
else:
    print(f"Error: {result['error']['message']}")

Cost Estimation¶

result = client.call_tool(
    "bq_dry_run_sql",
    {
        "sql": "SELECT * FROM large_table",
        "pricePerTiB": 6.0  # Custom pricing
    }
)

print(f"Estimated cost: ${result['usdEstimate']:.2f}")
print(f"Bytes to scan: {result['totalBytesProcessed']:,}")

Parameter Validation¶

result = client.call_tool(
    "bq_validate_sql",
    {
        "sql": "SELECT * FROM users WHERE id = @user_id",
        "params": {"user_id": "12345"}
    }
)

Limitations¶

Parameters are STRING only - Cast to appropriate types in SQL
No query execution - All operations are dry-run only
No data returned - Only metadata and cost estimates
Cache disabled - Always returns fresh estimates
No UDF creation - Can validate UDF usage but not create them

Next Steps¶

See Usage Guide for detailed examples
Check Examples for real-world scenarios
Read Development for contributing