Step-by-step guide to integrating Llama 3 and FastAPI with Rails 8. Includes Docker setup, Solid Queue background jobs, and production-ready code examples.
Start the demo with `docker compose up`. It includes mock AI responses so you can test without a GPU or model installation.
First, create the FastAPI service in `main.py`:
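What follows is a minimal sketch, not a definitive listing: the `/generate` route, the `MOCK_AI` flag, and serving the model through Hugging Face `transformers` are all illustrative choices.

```python
# main.py — minimal sketch of the FastAPI inference service.
# Assumptions (illustrative, not from the original listing): the /generate
# route, the MOCK_AI flag, and Hugging Face transformers for model loading.
import os

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="Llama 3 Inference Service")

MOCK_AI = os.getenv("MOCK_AI", "true").lower() == "true"
MODEL_ID = os.getenv("MODEL_ID", "meta-llama/Meta-Llama-3-8B-Instruct")

_pipeline = None  # loaded lazily so the server starts fast


class GenerateRequest(BaseModel):
    prompt: str
    max_new_tokens: int = 256


class GenerateResponse(BaseModel):
    text: str


def get_pipeline():
    """Load the model on first use; the first request pays the load cost."""
    global _pipeline
    if _pipeline is None:
        import torch
        from transformers import pipeline

        _pipeline = pipeline(
            "text-generation",
            model=MODEL_ID,
            torch_dtype=torch.bfloat16,
            device_map="auto",  # uses the GPU when one is available
        )
    return _pipeline


@app.get("/health")
def health():
    return {"status": "ok", "mock": MOCK_AI}


@app.post("/generate", response_model=GenerateResponse)
def generate(req: GenerateRequest):
    if MOCK_AI:
        # Mock path: exercise the Rails integration without a GPU or weights.
        return GenerateResponse(text=f"[mock] You said: {req.prompt}")
    pipe = get_pipeline()
    out = pipe(req.prompt, max_new_tokens=req.max_new_tokens)
    return GenerateResponse(text=out[0]["generated_text"])
```

Loading the model lazily keeps startup fast; the first real request pays the load cost, which matches the behavior noted below.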
Add the service settings to your `.env` file:
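The variable names here follow the sketch above and are assumptions, not the article's exact keys:

```bash
# Set MOCK_AI=false once you have a GPU and access to the model weights.
MOCK_AI=true
MODEL_ID=meta-llama/Meta-Llama-3-8B-Instruct
```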
Start the service (for example with `uvicorn main:app --host 0.0.0.0 --port 8001`) and the API is now running at http://localhost:8001 with GPU acceleration. You can test it by visiting http://localhost:8001/docs to see the automatic API documentation.
The first request will be slower as the model loads into GPU memory. Subsequent requests will be much faster.
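If you prefer the command line, a request like the following exercises the endpoint (the `/generate` route and field names come from the sketch above):

```bash
curl -s -X POST http://localhost:8001/generate \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Hello, Llama!", "max_new_tokens": 64}'
```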
On the Rails side, create a service object in `app/services/ai_service.rb`:
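Here is a minimal sketch using only the Ruby standard library; the class and method names and the `AI_SERVICE_URL` variable are illustrative rather than the article's exact code:

```ruby
# app/services/ai_service.rb — sketch of a thin HTTP client for the
# FastAPI service. Names and the AI_SERVICE_URL variable are assumptions.
require "net/http"
require "json"

class AiService
  DEFAULT_TIMEOUT = 120 # generation can be slow, especially on first request

  def self.generate(prompt, max_new_tokens: 256)
    uri = URI.join(base_url, "/generate")
    http = Net::HTTP.new(uri.host, uri.port)
    http.read_timeout = DEFAULT_TIMEOUT

    request = Net::HTTP::Post.new(uri, "Content-Type" => "application/json")
    request.body = { prompt: prompt, max_new_tokens: max_new_tokens }.to_json

    response = http.request(request)
    raise "AI service error: #{response.code}" unless response.is_a?(Net::HTTPSuccess)

    JSON.parse(response.body).fetch("text")
  end

  def self.base_url
    ENV.fetch("AI_SERVICE_URL", "http://localhost:8001")
  end
end
```

Since generation can take several seconds, you would typically call this from a background job rather than inline in a request. With Solid Queue (the Rails 8 default Active Job backend) a hypothetical job might look like:

```ruby
# app/jobs/ai_generation_job.rb — hypothetical job name; persist or
# broadcast the result however your app requires.
class AiGenerationJob < ApplicationJob
  queue_as :default

  def perform(prompt)
    text = AiService.generate(prompt)
    Rails.logger.info("AI response: #{text.truncate(100)}")
  end
end
```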
Add the service URL to your Rails app's `.env` file:
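The variable name matches the service sketch above (an assumption):

```bash
AI_SERVICE_URL=http://localhost:8001
```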
Then update `config/application.rb`:
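One plausible addition here, shown as a sketch: expose the service URL through Rails' `config.x` custom-config namespace so app code can read it without touching `ENV` directly. The `YourApp` module name is a placeholder.

```ruby
# config/application.rb — sketch; YourApp is a placeholder module name.
require_relative "boot"
require "rails/all"

Bundler.require(*Rails.groups)

module YourApp
  class Application < Rails::Application
    config.load_defaults 8.0

    # Hypothetical: read elsewhere as Rails.configuration.x.ai_service_url
    config.x.ai_service_url = ENV.fetch("AI_SERVICE_URL", "http://localhost:8001")
  end
end
```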
For production, run both apps together with `docker-compose.yml`:
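A sketch of the two-service layout; the directory names, the Rails port, and the GPU reservation block are assumptions rather than the article's exact file:

```yaml
# docker-compose.yml — sketch; directory names and ports are assumptions.
services:
  ai:
    build: ./ai-service            # the FastAPI app containing main.py
    ports:
      - "8001:8001"
    env_file: .env
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia       # GPU passthrough for model inference
              count: 1
              capabilities: [gpu]

  web:
    build: ./rails-app             # the Rails 8 app
    ports:
      - "3000:3000"
    env_file: .env
    environment:
      AI_SERVICE_URL: http://ai:8001  # containers reach each other by service name
    depends_on:
      - ai
```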
Finally, create a `.env` file for your secrets:
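Illustrative entries only; substitute your own values and keep this file out of version control:

```bash
RAILS_MASTER_KEY=your-master-key
HUGGING_FACE_HUB_TOKEN=your-hf-token   # needed if pulling gated Llama 3 weights
MOCK_AI=false
```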