Development Guide

Guide for contributors and developers working on MCP BigQuery.

Development Setup

Prerequisites

  • Python 3.10+
  • Git
  • Google Cloud SDK with BigQuery API enabled

Clone and Install

# Clone repository
git clone https://github.com/caron14/mcp-bigquery.git
cd mcp-bigquery

# Install with development dependencies
pip install -e ".[dev]"

# Or using uv
uv pip install -e ".[dev]"

Environment Setup

# Set up Google Cloud authentication
gcloud auth application-default login

# Configure project
export BQ_PROJECT="your-test-project"
export BQ_LOCATION="US"

# Install pre-commit hooks
pre-commit install

# Run development server
python -m mcp_bigquery

Pre-commit Setup

This project uses pre-commit hooks to ensure code quality:

# Install pre-commit hooks (one-time setup)
pre-commit install

# Run all hooks manually
pre-commit run --all-files

# Update hook versions
pre-commit autoupdate

Configured hooks:

  • isort: Sorts Python imports
  • black: Formats Python code (line length: 100)
  • flake8: Checks Python code style
  • ruff: Fast Python linter
  • mypy: Type checking for Python

Project Structure

mcp-bigquery/
├── src/mcp_bigquery/
│   ├── __init__.py          # Version + exports
│   ├── __main__.py          # CLI entry point (logging flags added in v0.4.2)
│   ├── server.py            # MCP server implementation
│   ├── config.py            # Environment/config resolution
│   ├── logging_config.py    # Central log formatting + level helpers
│   ├── cache.py             # Simple BigQuery client cache
│   ├── clients/
│   │   ├── __init__.py
│   │   └── factory.py       # Shared BigQuery client creation
│   ├── schema_explorer/
│   │   ├── __init__.py
│   │   ├── datasets.py      # Dataset listing flows
│   │   ├── tables.py        # Table metadata aggregation
│   │   └── describe.py      # Schema inspection + shared formatters
│   ├── sql_analyzer.py      # SQL analysis helpers
│   ├── validators.py        # Input validation utilities
│   ├── exceptions.py        # Custom exception types
│   └── constants.py         # Shared constants/env defaults
├── tests/
│   ├── conftest.py
│   └── test_core.py
├── docs/
└── pyproject.toml

See also Module Responsibility Map for per-file responsibilities captured during the v0.4.2 refactor.

Testing

Run All Tests

# Run all tests
pytest tests/

# Run with coverage
pytest --cov=mcp_bigquery tests/

# Run specific test file
pytest tests/test_core.py -v

Test Categories

  1. Unit Tests - No BigQuery credentials required
    pytest tests/test_core.py
    

Writing Tests

# Example unit test
import pytest
from mcp_bigquery.server import validate_sql

@pytest.mark.asyncio
async def test_validate_simple_query():
    result = await validate_sql("SELECT 1")
    assert result["isValid"] is True
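
A follow-up test for the failure path can mirror the same pattern. The snippet below assumes that validate_sql reports an invalid query with isValid set to False and an error payload shaped like the standard format documented under Error Handling; adjust the assertions to the actual return shape.

# Example unit test for an invalid query (assumed return shape)
import pytest
from mcp_bigquery.server import validate_sql

@pytest.mark.asyncio
async def test_validate_invalid_query():
    result = await validate_sql("SELECT FROM")  # missing select list
    assert result["isValid"] is False
    assert "error" in result  # code/message/location per the standard error format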

Code Style

Formatting

# Format with black
black src/ tests/

# Check with ruff
ruff check src/ tests/

# Type checking with mypy
mypy src/

Style Guidelines

  1. Follow PEP 8
  2. Use type hints for all functions
  3. Add docstrings to public functions
  4. Keep functions small and focused
  5. Use descriptive variable names

Making Changes

1. Create Feature Branch

git checkout -b feature/your-feature-name

2. Make Changes

Follow the existing code patterns:

async def your_new_function(params: dict) -> dict:
    """
    Brief description of function.

    Args:
        params: Dictionary with 'sql' and optional 'params'

    Returns:
        Dictionary with result or error
    """
    try:
        # Implementation
        return {"success": True}
    except Exception as e:
        return {"error": {"code": "ERROR_CODE", "message": str(e)}}

3. Test Your Changes

# Run tests
pytest tests/

# Test manually
python -m mcp_bigquery

4. Update Documentation

Update relevant documentation:

  • Add new features to README.md
  • Update usage and development docs as needed

5. Submit Pull Request

# Commit changes
git add .
git commit -m "feat: add new feature"

# Push to GitHub
git push origin feature/your-feature-name

Building and Publishing

Build Package

# Clean previous builds
rm -rf dist/ build/ *.egg-info

# Build distribution
python -m build

# Check package contents
tar -tzf dist/mcp-bigquery-*.tar.gz | head -20

Test Package Locally

# Install from local build
pip install dist/mcp-bigquery-*.whl

# Test installation
mcp-bigquery --version

Publish to PyPI

# Test on TestPyPI first
python -m twine upload --repository testpypi dist/*

# Publish to PyPI
python -m twine upload dist/*

Logging and Debugging

CLI Controls (v0.4.2)

python -m mcp_bigquery now delegates to logging_config so log levels are consistent across tools. Logs default to WARNING and stream to stderr.

mcp-bigquery --verbose          # INFO
mcp-bigquery -vv                # DEBUG
mcp-bigquery --quiet            # ERROR
mcp-bigquery --json-logs        # Structured JSON logs

These switches stack with the LOG_LEVEL environment variable or the config.log_level default resolved in mcp_bigquery.config.

Programmatic Setup

from mcp_bigquery.logging_config import setup_logging, resolve_log_level
from mcp_bigquery.config import get_config

config = get_config()
level = resolve_log_level(default_level=config.log_level, verbose=1, quiet=0)
setup_logging(level=level, format_json=True)

Common Issues

  1. Import errors

    # Ensure package is installed in editable mode
    pip install -e .
    

  2. Authentication errors

    # Check credentials
    gcloud auth application-default print-access-token
    

  3. Test failures

    # Run single test with verbose output
    pytest tests/test_core.py::test_name -vvs
    

Architecture Notes

MCP Server Implementation

The server follows MCP protocol standards:

  1. Tool Registration - Eight tools registered in handle_list_tools() (see the sketch after this list)
  2. Tool Execution - Requests handled in handle_call_tool()
  3. Error Handling - Consistent error format across all tools
  4. Async Support - All operations are async for performance
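
To make the registration and dispatch steps concrete, here is a minimal sketch using the MCP Python SDK's decorator-based API. The tool definition and handler body are illustrative only; the actual schemas and dispatch logic live in server.py.

# Minimal sketch of MCP tool registration and dispatch (illustrative, not the
# server's actual definitions)
import json

import mcp.types as types
from mcp.server import Server

from mcp_bigquery.server import validate_sql  # the tool shown in the test example

server = Server("mcp-bigquery")

@server.list_tools()
async def handle_list_tools() -> list[types.Tool]:
    # One entry per tool; the real server registers eight.
    return [
        types.Tool(
            name="validate_sql",
            description="Dry-run a query and report whether it is valid.",
            inputSchema={
                "type": "object",
                "properties": {"sql": {"type": "string"}},
                "required": ["sql"],
            },
        )
    ]

@server.call_tool()
async def handle_call_tool(name: str, arguments: dict) -> list[types.TextContent]:
    if name == "validate_sql":
        result = await validate_sql(arguments["sql"])
        return [types.TextContent(type="text", text=json.dumps(result))]
    raise ValueError(f"Unknown tool: {name}")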

Core Modules

Client Factory (clients/factory.py)

  • Single place for constructing BigQuery clients with retry handling and ADC validation.
  • Respects BQ_PROJECT and BQ_LOCATION via config.get_config().
  • Client creation is accessed through mcp_bigquery.clients, which exposes compatibility wrappers around the shared factory; a caching sketch follows this list.
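
A minimal caching sketch of this pattern, assuming config exposes project and location attributes; the helper name get_bigquery_client and those attribute names are assumptions, not the module's actual API.

# Illustrative client factory with a one-entry cache (names are assumptions)
from functools import lru_cache

from google.cloud import bigquery

from mcp_bigquery.config import get_config

@lru_cache(maxsize=1)
def get_bigquery_client() -> bigquery.Client:
    """Build one shared client from the resolved BQ_PROJECT/BQ_LOCATION."""
    config = get_config()  # assumed to expose .project and .location
    return bigquery.Client(project=config.project, location=config.location)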

Logging (logging_config.py)

  • Provides setup_logging() and resolve_log_level() used by the CLI and server during startup.
  • Routes logs to stderr by default, supports JSON formatting, and exposes a decorator for measuring performance of client creation; an illustrative timing decorator is sketched below.
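
An illustrative timing decorator in the same spirit; the real decorator's name and signature in logging_config.py may differ.

# Hypothetical timing decorator (illustrative only)
import functools
import logging
import time

logger = logging.getLogger("mcp_bigquery")

def log_duration(func):
    """Log how long the wrapped call took, at DEBUG level."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            logger.debug("%s took %.3fs", func.__name__, time.perf_counter() - start)
    return wrapper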

SQL Analyzer (sql_analyzer.py)

  • SQLAnalyzer provides lightweight dependency extraction and syntax heuristics.
  • Designed for quick regex-based checks and dependency graphs; a standalone illustration of the approach follows.
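
The sketch below shows the kind of regex scan this implies; it is not SQLAnalyzer's actual method, just an illustration of pulling identifiers that follow FROM and JOIN keywords.

# Illustrative regex-based table reference extraction
import re

TABLE_REF = re.compile(r"\b(?:FROM|JOIN)\s+`?([\w.-]+)`?", re.IGNORECASE)

def extract_table_refs(sql: str) -> list[str]:
    """Return table-like identifiers that follow FROM/JOIN keywords."""
    return sorted(set(TABLE_REF.findall(sql)))

extract_table_refs(
    "SELECT a.x FROM `proj.ds.orders` a JOIN proj.ds.customers c ON a.id = c.id"
)
# ['proj.ds.customers', 'proj.ds.orders']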

Schema Explorer Package (schema_explorer/) - updated v0.4.2

  • datasets.py, tables.py, and describe.py split responsibilities for dataset listing, table aggregation, and schema formatting.
  • describe.py now owns shared serializers (timestamps, partitions, nested schema trees); a nested-schema sketch follows this list.
  • Modules rely on the client factory plus validators/exceptions and never import each other, preserving clean boundaries.
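
A sketch of the nested-schema serialization idea, assuming the usual google.cloud.bigquery.SchemaField attributes; the real formatter in describe.py may use different names and output keys.

# Illustrative nested schema serializer
from google.cloud import bigquery

def schema_to_tree(fields: list[bigquery.SchemaField]) -> list[dict]:
    """Convert SchemaField objects (including nested RECORDs) into plain dicts."""
    tree = []
    for field in fields:
        node = {
            "name": field.name,
            "type": field.field_type,
            "mode": field.mode,
            "description": field.description,
        }
        if field.fields:  # RECORD/STRUCT columns carry nested sub-fields
            node["fields"] = schema_to_tree(list(field.fields))
        tree.append(node)
    return tree

# Usage: schema_to_tree(client.get_table("dataset.table").schema)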

Error Handling

Standard error format:

{
    "error": {
        "code": "INVALID_SQL",
        "message": "Human-readable error",
        "location": {"line": 1, "column": 10},
        "details": []  # Optional
    }
}
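
A small helper can keep that envelope consistent across tools. The function below is purely illustrative and is not the actual API of exceptions.py.

# Hypothetical helper for building the standard error payload
from typing import Any, Optional

def error_response(
    code: str,
    message: str,
    location: Optional[dict] = None,
    details: Optional[list] = None,
) -> dict:
    """Wrap an error in the standard {"error": {...}} envelope."""
    error: dict[str, Any] = {"code": code, "message": message}
    if location is not None:
        error["location"] = location
    if details:
        error["details"] = details
    return {"error": error}

# error_response("INVALID_SQL", "Syntax error near FROM", {"line": 1, "column": 10})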

Contributing Guidelines

  1. Open an issue first - Discuss major changes before implementing
  2. Follow existing patterns - Maintain consistency with current code
  3. Add tests - All new features need test coverage
  4. Update docs - Keep documentation in sync with code
  5. One feature per PR - Keep pull requests focused

Release Process

  1. Update version in pyproject.toml and src/mcp_bigquery/__init__.py
  2. Update CHANGELOG in README.md
  3. Create and push git tag
  4. Build and publish to PyPI
  5. Create GitHub release

Getting Help

  • Open an issue for bugs or feature requests: https://github.com/caron14/mcp-bigquery/issues
  • See the docs/ directory for usage and API documentation

License

MIT License - See LICENSE file for details