Case Study

Eliminating Documentation Debt Across 390 Repositories

How we reduced developer onboarding from weeks to days using automated documentation and AI-powered codebase intelligence

Client: UK property transaction platform

Industry: PropTech & Legal Solutions

Services: Architecture Advisory, AI-Augmented Development


Key results at a glance

  • 100% documentation coverage, up from 5%
  • Developer onboarding in days, down from weeks
  • Questions answered in minutes, down from hours
  • 390 repositories documented automatically

The challenge

The scale of the problem

How do you document 390 code repositories when manual effort is not feasible? The UK's leading property transaction platform faced exactly this challenge: only around 5% of its repositories had any technical documentation.

The organisation had grown rapidly, acquiring multiple products and teams. This created significant problems that were impacting delivery speed and team effectiveness:

  • New developer onboarding took weeks - developers had to read code directly to understand system behaviour, often reaching out to multiple colleagues for context
  • Knowledge silos - only original authors understood how services interacted, creating single points of failure
  • Cross-team friction - teams working on different products couldn't easily understand dependencies or shared components
  • Architectural drift - without documented standards, implementations diverged across teams, making maintenance increasingly difficult

The business needed a solution that could scale across their entire codebase without requiring manual documentation effort from already-stretched development teams. Previous attempts at documentation drives had failed - the volume was simply too large, and documentation became outdated almost immediately.

The results

Key results

  • Documentation coverage increased from 5% to 100% of repositories
  • Time to answer "how does X work?" reduced from hours to minutes
  • New developer onboarding reduced from weeks to days
  • Documentation maintenance shifted from manual (rarely done) to automated (always current)

Measurable transformation

For the first time, the organisation has complete visibility across their entire codebase. The automated documentation engine runs on every commit, ensuring documentation never drifts from reality.

New team members can now explore the codebase through natural language queries, getting contextual answers that previously required tracking down senior developers. This has dramatically reduced the onboarding burden on existing team members.

The cross-team friction has largely dissolved - teams can now understand dependencies and shared components without lengthy meetings or Slack threads. Architecture decisions are visible, and the documented patterns serve as living standards.

The solution

AI-augmented documentation at scale

Rather than attempting to manually document 390 repositories, we designed an AI-augmented solution that would generate and maintain documentation automatically.

Automated diagram generation

Using our architecture advisory expertise, we implemented automated C4 model diagram generation directly from code analysis. This produced:

  • System context diagrams showing external integrations
  • Container diagrams for each service
  • Sequence diagrams for key business workflows
  • A consistent documentation format across all 390 repositories
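
To make the idea of generating diagrams from code analysis concrete, here is a minimal sketch that turns a service dependency map into Mermaid flowchart text. It is an illustration under stated assumptions, not the engine built for the client: the function name to_mermaid, the dependency map, and the service names are all hypothetical, and a plain flowchart stands in for a full C4 container diagram.

```python
def to_mermaid(services: dict[str, list[str]]) -> str:
    """Render a service dependency map as Mermaid flowchart text."""
    lines = ["flowchart LR"]
    # Collect every node, including services that only appear as targets.
    names = sorted(set(services) | {d for deps in services.values() for d in deps})
    # Mermaid node ids must be simple tokens, so map each name to a short id.
    ids = {name: f"n{i}" for i, name in enumerate(names)}
    for name in names:
        lines.append(f'    {ids[name]}["{name}"]')
    # Emit one edge per dependency, in a stable order so diffs stay small.
    for source in sorted(services):
        for target in sorted(services[source]):
            lines.append(f"    {ids[source]} --> {ids[target]}")
    return "\n".join(lines)


# Hypothetical dependency map; in practice this would come from code analysis.
example = {
    "web-app": ["case-api"],
    "case-api": ["document-store", "search-service"],
}
diagram = to_mermaid(example)
print(diagram)
```

Emitting nodes and edges in sorted order keeps the generated text stable between runs, so a regenerated diagram only changes in version control when the underlying dependencies change.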

AI-powered codebase intelligence

We built a natural language interface that allows developers to ask questions about any part of the codebase and receive informed answers drawing on the entire 390-repository knowledge base. This eliminated the need to track down the right person with tribal knowledge.
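
The retrieval step behind an interface like this can be sketched as a nearest-neighbour search over embeddings. The example below is a simplified stand-in: the three-dimensional vectors are hand-written rather than produced by an embedding model, the file paths are hypothetical, and the search runs in memory rather than in a database. In a full pipeline, the top-ranked chunks would typically be passed to a language model as context for the answer.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_matches(query_vec, index, k=2):
    """Return the k paths whose vectors are most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), path) for path, vec in index.items()]
    scored.sort(reverse=True)
    return [path for _, path in scored[:k]]


# Hypothetical index: path -> toy embedding (real embeddings have far more dimensions).
index = {
    "payments/README.md": [0.9, 0.1, 0.0],
    "search/indexer.py": [0.1, 0.9, 0.2],
    "auth/tokens.py": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]  # stand-in for the embedded question
results = top_matches(query, index, k=2)
print(results)
```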

Technical deep dive

Architecture decisions

  • Vector embeddings for semantic code search across repositories
  • Incremental parsing to handle the scale without full re-processing
  • Language-agnostic AST analysis supporting C#, TypeScript, and Python codebases
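
The combination of incremental processing and AST analysis can be illustrated with a small sketch. It assumes a content-hash cache as the change detector, which is one common approach and not necessarily the one used here, and it uses Python's built-in ast module, so it handles Python sources only; C# and TypeScript would need their own parsers. The function names and sample files are hypothetical.

```python
import ast
import hashlib


def content_hash(source: str) -> str:
    """Stable fingerprint of a file's content."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


def summarise_python(source: str) -> list[dict]:
    """Return the name and first docstring line of each top-level function."""
    tree = ast.parse(source)
    summaries = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            doc = ast.get_docstring(node) or ""
            summaries.append({"name": node.name, "doc": doc.split("\n")[0]})
    return summaries


def refresh(files: dict[str, str], cache: dict[str, dict]) -> list[str]:
    """Re-analyse only files whose content hash changed; return their paths."""
    reparsed = []
    for path, source in files.items():
        digest = content_hash(source)
        entry = cache.get(path)
        if entry is not None and entry["hash"] == digest:
            continue  # unchanged since the last run, skip the expensive work
        cache[path] = {"hash": digest, "summary": summarise_python(source)}
        reparsed.append(path)
    return reparsed


# Hypothetical sources standing in for files in a repository.
files = {
    "billing.py": 'def charge(amount):\n    """Charge the customer."""\n    return amount\n',
    "util.py": "def noop():\n    pass\n",
}
cache: dict[str, dict] = {}
first_run = refresh(files, cache)   # everything is new, so both files are parsed
files["util.py"] = 'def noop():\n    """Do nothing."""\n    pass\n'
second_run = refresh(files, cache)  # only the edited file is parsed again
print(first_run, second_run)
```

Keying the cache on a content hash rather than a timestamp means a file is reprocessed only when its content actually changes.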

Technology stack

  • Azure OpenAI for embeddings and natural language understanding
  • PostgreSQL with pgvector for vector similarity search
  • GitHub Actions for CI/CD integration and automated updates
  • Mermaid and PlantUML for diagram generation
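
As a rough illustration of how a pgvector similarity search might look, the sketch below builds a parameterised query ordered by cosine distance (pgvector's <=> operator). The table name code_chunks and its columns path, chunk and embedding are hypothetical, and the %s placeholders assume a psycopg-style driver. The sketch only constructs the query and its parameters; it does not connect to a database.

```python
def to_vector_literal(embedding):
    """Format a list of floats as a pgvector text literal, e.g. [0.1,0.2]."""
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"


def build_similarity_query(embedding, limit=5):
    """Return (sql, params) for a nearest-neighbour search by cosine distance."""
    # Table and column names are assumptions for illustration only.
    sql = (
        "SELECT path, chunk "
        "FROM code_chunks "
        "ORDER BY embedding <=> %s::vector "
        "LIMIT %s"
    )
    return sql, (to_vector_literal(embedding), limit)


sql, params = build_similarity_query([0.1, 0.2, 0.3], limit=3)
print(sql)
print(params)
```

Passing the vector and the limit as parameters rather than interpolating them into the SQL string keeps the query safe to run with values derived from user questions.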

Integration patterns

The solution integrates with the client's existing toolchain - Slack for Q&A, Confluence for diagram publishing, and GitHub for documentation alongside code.
