Development Guide¶
Guide for contributors and developers working on MCP BigQuery.
Development Setup¶
Prerequisites¶
- Python 3.10+
- Git
- Google Cloud SDK with BigQuery API enabled
Clone and Install¶
# Clone repository
git clone https://github.com/caron14/mcp-bigquery.git
cd mcp-bigquery
# Install with development dependencies
pip install -e ".[dev]"
# Or using uv
uv pip install -e ".[dev]"
Environment Setup¶
# Set up Google Cloud authentication
gcloud auth application-default login
# Configure project
export BQ_PROJECT="your-test-project"
export BQ_LOCATION="US"
# Install pre-commit hooks
pre-commit install
# Run development server
python -m mcp_bigquery
Pre-commit Setup¶
This project uses pre-commit hooks to ensure code quality:
# Install pre-commit hooks (one-time setup)
pre-commit install
# Run all hooks manually
pre-commit run --all-files
# Update hook versions
pre-commit autoupdate
Configured hooks:
- isort: Sorts Python imports
- black: Formats Python code (line length: 100)
- flake8: Checks Python code style
- ruff: Fast Python linter
- mypy: Type checking for Python
Project Structure¶
mcp-bigquery/
├── src/mcp_bigquery/
│   ├── __init__.py            # Version + exports
│   ├── __main__.py            # CLI entry point (logging flags added in v0.4.2)
│   ├── server.py              # MCP server implementation
│   ├── config.py              # Environment/config resolution
│   ├── logging_config.py      # Central log formatting + level helpers
│   ├── cache.py               # Simple BigQuery client cache
│   ├── clients/
│   │   ├── __init__.py
│   │   └── factory.py         # Shared BigQuery client creation
│   ├── schema_explorer/
│   │   ├── __init__.py
│   │   ├── datasets.py        # Dataset listing flows
│   │   ├── tables.py          # Table metadata aggregation
│   │   └── describe.py        # Schema inspection + shared formatters
│   ├── sql_analyzer.py        # SQL analysis helpers
│   ├── validators.py          # Input validation utilities
│   ├── exceptions.py          # Custom exception types
│   └── constants.py           # Shared constants/env defaults
├── tests/
│   ├── conftest.py
│   └── test_core.py
├── docs/
└── pyproject.toml
See also Module Responsibility Map for per-file responsibilities captured during the v0.4.2 refactor.
Testing¶
Run All Tests¶
# Run all tests
pytest tests/
# Run with coverage
pytest --cov=mcp_bigquery tests/
# Run specific test file
pytest tests/test_core.py -v
Test Categories¶
- Unit Tests - No BigQuery credentials required
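Unit tests stay credential-free by never constructing a real BigQuery client. A minimal sketch of one way to do that in conftest.py, assuming pytest's monkeypatch fixture; get_bigquery_client is a hypothetical name standing in for whatever wrapper mcp_bigquery.clients actually exposes:
# Hypothetical fixture - check mcp_bigquery.clients for the real wrapper name.
from unittest.mock import MagicMock
import pytest

@pytest.fixture
def fake_bigquery_client(monkeypatch):
    fake = MagicMock(name="bigquery.Client")
    # Replace client creation so no ADC lookup or network call happens.
    monkeypatch.setattr("mcp_bigquery.clients.get_bigquery_client", lambda *a, **kw: fake)
    return fake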
Writing Tests¶
# Example unit test
import pytest
from mcp_bigquery.server import validate_sql
@pytest.mark.asyncio
async def test_validate_simple_query():
    result = await validate_sql("SELECT 1")
    assert result["isValid"] is True
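An error-path test follows the same shape. A hedged sketch assuming that invalid SQL yields isValid: False plus an error payload in the standard format described under Error Handling; adjust the assertions to the actual response keys:
@pytest.mark.asyncio
async def test_validate_invalid_query():
    result = await validate_sql("SELECT FROM")  # deliberately malformed
    assert result["isValid"] is False
    assert "error" in result  # assumed to carry code/message details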
Code Style¶
Formatting¶
# Format with black
black src/ tests/
# Check with ruff
ruff check src/ tests/
# Type checking with mypy
mypy src/
Style Guidelines¶
- Follow PEP 8
- Use type hints for all functions
- Add docstrings to public functions
- Keep functions small and focused
- Use descriptive variable names
Making Changes¶
1. Create Feature Branch¶
Create a branch for your work, matching the name used when pushing in step 5:
git checkout -b feature/your-feature-name
2. Make Changes¶
Follow the existing code patterns:
async def your_new_function(params: dict) -> dict:
    """
    Brief description of function.

    Args:
        params: Dictionary with 'sql' and optional 'params'

    Returns:
        Dictionary with result or error
    """
    try:
        # Implementation
        return {"success": True}
    except Exception as e:
        return {"error": {"code": "ERROR_CODE", "message": str(e)}}
3. Test Your Changes¶
Run the test suite and checks described above (pytest tests/, black, ruff, mypy) and make sure everything passes before opening a pull request.
4. Update Documentation¶
Update relevant documentation:
- Add new features to README.md
- Update usage and development docs as needed
5. Submit Pull Request¶
# Commit changes
git add .
git commit -m "feat: add new feature"
# Push to GitHub
git push origin feature/your-feature-name
Building and Publishing¶
Build Package¶
# Clean previous builds
rm -rf dist/ build/ *.egg-info
# Build distribution
python -m build
# Check package contents
tar -tzf dist/mcp-bigquery-*.tar.gz | head -20
Test Package Locally¶
# Install from local build
pip install dist/mcp-bigquery-*.whl
# Test installation
mcp-bigquery --version
Publish to PyPI¶
# Test on TestPyPI first
python -m twine upload --repository testpypi dist/*
# Publish to PyPI
python -m twine upload dist/*
Logging and Debugging¶
CLI Controls (v0.4.2)¶
python -m mcp_bigquery now delegates to logging_config so log levels are consistent across tools. Logs default to WARNING and stream to stderr.
mcp-bigquery --verbose # INFO
mcp-bigquery -vv # DEBUG
mcp-bigquery --quiet # ERROR
mcp-bigquery --json-logs # Structured JSON logs
These switches stack with the LOG_LEVEL environment variable or the config.log_level default resolved in mcp_bigquery.config.
Programmatic Setup¶
from mcp_bigquery.logging_config import setup_logging, resolve_log_level
from mcp_bigquery.config import get_config
config = get_config()
level = resolve_log_level(default_level=config.log_level, verbose=1, quiet=0)
setup_logging(level=level, format_json=True)
Common Issues¶
- Import errors: reinstall in editable mode with pip install -e ".[dev]" so the package and dev dependencies are present.
- Authentication errors: re-run gcloud auth application-default login and confirm BQ_PROJECT points at a project you can access.
- Test failures: run pytest tests/ -v to see which tests fail and whether they expect credentials that are not configured.
Architecture Notes¶
MCP Server Implementation¶
The server follows MCP protocol standards:
- Tool Registration - Eight tools registered in handle_list_tools()
- Tool Execution - Requests handled in handle_call_tool()
- Error Handling - Consistent error format across all tools
- Async Support - All operations are async for performance
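In outline, registration and execution look like the following. A minimal sketch using the low-level Server API from the MCP Python SDK; the tool name and schema are illustrative placeholders, not the server's actual eight definitions:
import json
import mcp.types as types
from mcp.server import Server
from mcp_bigquery.server import validate_sql  # inside server.py itself this is a local helper

server = Server("mcp-bigquery")

@server.list_tools()
async def handle_list_tools() -> list[types.Tool]:
    # All eight tools are declared here with JSON Schemas for their inputs.
    return [
        types.Tool(
            name="validate_sql",  # illustrative
            description="Dry-run a query and report whether it is valid.",
            inputSchema={"type": "object", "properties": {"sql": {"type": "string"}}},
        )
    ]

@server.call_tool()
async def handle_call_tool(name: str, arguments: dict) -> list[types.TextContent]:
    # Dispatch on the tool name and return the result as JSON text content.
    if name == "validate_sql":
        result = await validate_sql(arguments["sql"])
        return [types.TextContent(type="text", text=json.dumps(result))]
    raise ValueError(f"Unknown tool: {name}")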
Core Modules¶
Client Factory (clients/factory.py)¶
- Single place for constructing BigQuery clients with retry handling and ADC validation.
- Respects BQ_PROJECT and BQ_LOCATION via config.get_config().
- Client creation is accessed through mcp_bigquery.clients, which exposes compatibility wrappers around the shared factory.
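The shape of the factory can be pictured as below. This is an illustrative sketch, not the contents of factory.py; the config attribute and function names are assumptions:
# Illustrative sketch - see clients/factory.py for the real implementation.
from functools import lru_cache
import google.auth
from google.cloud import bigquery
from mcp_bigquery.config import get_config

@lru_cache(maxsize=1)
def create_client() -> bigquery.Client:
    # Resolve ADC up front so missing credentials fail with a clear error.
    credentials, default_project = google.auth.default()
    config = get_config()  # attribute names below are assumed, not guaranteed
    return bigquery.Client(
        project=getattr(config, "project", None) or default_project,
        location=getattr(config, "location", None),
        credentials=credentials,
    )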
Logging (logging_config.py)¶
- Provides setup_logging() and resolve_log_level(), used by the CLI and server during startup.
- Routes logs to stderr by default, supports JSON formatting, and exposes a decorator for measuring performance of client creation.
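The performance decorator can be pictured as a thin timing wrapper around client creation. A hedged sketch; the real helper in logging_config.py may differ in name and behaviour:
import functools
import logging
import time

logger = logging.getLogger("mcp_bigquery")

def log_duration(func):
    """Illustrative timing decorator; not the actual logging_config.py helper."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            logger.debug("%s took %.1f ms", func.__name__, (time.perf_counter() - start) * 1000)
    return wrapper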
SQL Analyzer (sql_analyzer.py)¶
- SQLAnalyzer provides lightweight dependency extraction and syntax heuristics.
- Designed for quick regex-based checks and dependency graphs.
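The regex-based technique can be illustrated with a stripped-down dependency extractor. This is a sketch of the approach only; the real SQLAnalyzer handles aliases, CTEs, and quoting that this pattern ignores:
import re

# Match table references after FROM/JOIN, e.g. project.dataset.table or dataset.table.
_TABLE_REF = re.compile(r"\b(?:FROM|JOIN)\s+`?([\w-]+(?:\.[\w-]+){1,2})`?", re.IGNORECASE)

def extract_table_refs(sql: str) -> set[str]:
    return set(_TABLE_REF.findall(sql))

# extract_table_refs("SELECT * FROM `proj.sales.orders` JOIN proj.sales.customers c ON ...")
# -> {"proj.sales.orders", "proj.sales.customers"}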
Schema Explorer Package (schema_explorer/) - updated v0.4.2¶
- datasets.py, tables.py, and describe.py split responsibilities for dataset listing, table aggregation, and schema formatting.
- describe.py now owns shared serializers (timestamps, partitions, nested schema trees).
- Modules rely on the client factory plus validators/exceptions and never import each other, preserving clean boundaries.
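The shared serializers in describe.py revolve around nested RECORD fields. A sketch of the general technique using google.cloud.bigquery.SchemaField; the actual formatter names and output keys may differ:
from google.cloud import bigquery

def schema_to_dict(fields: list[bigquery.SchemaField]) -> list[dict]:
    """Recursively serialize a schema, preserving nested RECORD fields."""
    return [
        {
            "name": field.name,
            "type": field.field_type,
            "mode": field.mode,
            "description": field.description,
            "fields": schema_to_dict(list(field.fields)) if field.fields else [],
        }
        for field in fields
    ]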
Error Handling¶
Standard error format:
{
  "error": {
    "code": "INVALID_SQL",
    "message": "Human-readable error",
    "location": {"line": 1, "column": 10},
    "details": []  # Optional
  }
}
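One common way to produce this shape is to map a dry-run BadRequest into the error dict. A hedged sketch rather than the server's actual code; in particular, parsing line/column out of the message is an assumption:
import re
from google.api_core.exceptions import BadRequest
from google.cloud import bigquery

def dry_run_error(client: bigquery.Client, sql: str) -> dict | None:
    """Return an error dict in the standard format, or None if the query is valid."""
    job_config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    try:
        client.query(sql, job_config=job_config)
        return None
    except BadRequest as exc:
        error = {"code": "INVALID_SQL", "message": exc.message, "details": list(exc.errors or [])}
        # BigQuery messages often end with "... at [line:column]"; extract it if present.
        match = re.search(r"\[(\d+):(\d+)\]", exc.message or "")
        if match:
            error["location"] = {"line": int(match.group(1)), "column": int(match.group(2))}
        return {"error": error}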
Contributing Guidelines¶
- Open an issue first - Discuss major changes before implementing
- Follow existing patterns - Maintain consistency with current code
- Add tests - All new features need test coverage
- Update docs - Keep documentation in sync with code
- One feature per PR - Keep pull requests focused
Release Process¶
- Update version in pyproject.toml and src/mcp_bigquery/__init__.py
- Update CHANGELOG in README.md
- Create and push git tag
- Build and publish to PyPI
- Create GitHub release
Getting Help¶
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Documentation: This guide and API reference
License¶
MIT License - See LICENSE file for details