Self-hosted LLMs: running your own inference infrastructure

Written by Daniel Burkhardt Cerigo

When does it make sense to run your own LLM inference infrastructure instead of paying per-token to third-party APIs like OpenAI or Anthropic? And how do you execute it once you’ve decided to?

I’m giving a talk and running a half-day hands-on workshop on the topic.

I’ll update this post with content from the talk and the workshop as each takes place. I’ll also write a concise summary post here once both are done.

The talk

Data Science Festival Big Birthday Bash 2026, 16th May 2026, London

Slides

Recording: coming soon…

The workshop

AI in Production 2026, 4th-5th June 2026, Newcastle upon Tyne

A hands-on afternoon workshop covering the decision framework for third-party vs self-hosted inference, applying it to some worked example LLM applications, and then deploying an inference server using current leading open-source technologies.

Slides: see the talk slides above for now. I’ll upload further workshop resources here as they go live.

I don’t know if the workshop will be recorded and made publicly available, but I’ll put a link here if it is.

If you’re thinking about self-hosting, or just starting to grapple with leveraging AI internally in your org, drop me an email and I’d be happy to talk!

May 14, 2026

datavaluepeople is a group of artificial intelligence experts. Through applied machine learning, building automated systems, advising, and education, we create value for businesses, organizations, and humans. Drop us an email to speak to us about how we could work with your organisation, or if you are interested in joining our team.
