Sometimes the best decisions feel like the worst ideas at the time. Like betting your entire project timeline on a platform that just launched its first preview version. What could possibly go wrong? But here's what happened when we took that leap with Databricks Apps.
This is a story of a year-long journey building apps on Databricks—the good, the lessons learned, and the reasons why I chose it in the first place. And why I'd make the same decision again.
Phase I. MVP application in 30 days
This is a real application, with real users and real deadlines. The actual project and its data are confidential, so I’m going to talk about it metaphorically. Let’s say it’s an AI-assisted baseball recruitment platform.
The client approached us with a very specific request.
We need an AI application with the following requirements:
- Full-stack (so Gradio or Streamlit wouldn't do the job)
- Custom UI perfectly aligned with the baseball player scouting workflows
- AI features: streaming AI chat and AI-generated reports
- Custom dashboards, data tables, D3.js charts and animations
- Authentication and everything else needed for a production-ready application
And all that had to be built in 30 days.
Sounds easy, right? So, how should we approach this task?

Choosing an architectural approach
Databricks and AWS
We started with the classic approach and went for a well-known cloud provider—AWS. Just like any other team would. It’s tried and tested and has been an industry standard for a decade. However, we quickly ran into obstacles:
- It took time to get approvals for some services, as is usually the case with enterprise IT.
- Due to the sensitive nature of the data, we faced complexity in security, authorization, and overall data synchronization.
- Finally, we had to decide: either the Data+AI engineers would build interfaces for the AWS integration, or the software engineers would need skills across both Databricks and AWS.
All things considered, the architecture would have looked like this:

Just then, an email with a big promise dropped into my inbox. Databricks Apps—an integrated AI app development environment with serverless hosting, built-in security, authentication, and data governance. Deeply integrated into Databricks, it was close to our existing data, Unity Catalog, and AI gateways.
That was exactly what we needed at that moment. If only we'd had a time machine to fast-forward to the day it hit GA and was ready for production.
Databricks only
I have to admit, it was a bit of a gamble. But we bet on Databricks Apps from day one, even though it was still in preview.
I'm not saying AWS wouldn't have worked. However, it was clear that with a Databricks-only architecture and built-in boilerplate features, we could reduce development time by at least two weeks—impressive given that we only had 30 days in the first place.

You know how we developers are when it comes to trying out shiny new tools. But of course, with every new tool comes new drama. So I braced myself for unstable deployments and random crashes without proper traces.
Yet, remarkably, Databricks Apps not only spared us the late-night headaches, it went so well that we had extra time to add features beyond the original scope.
Lessons learned
Was everything nice and smooth? Of course not.
For one, we struggled with the database. Lakebase wasn't available back then, so we had to rely on Delta tables for transactional data. That's a pattern you learn to avoid early on in IT, and it led to slower transactions and connectivity issues.
Another problem was the lack of built-in scaling, so we had to design our architecture around it. Fortunately, Databricks provides powerful services for heavy compute and clusters below the application layer. In the end, optimizing our app to be as thin as possible paid off significantly.
For instance, someone accidentally deployed ML model execution directly into the backend app. You can imagine the chaos it caused. No surprise Lambda is generating so much revenue.
Scaling is only good if you do it properly.
Since then, both challenges—OLTP and scaling—have been completely solved. I’m going to show you how.

Data architecture
This is the data architecture we used.

We synchronized operational and business data to Lakebase OLTP through synced tables, ensuring low latency and high concurrency. Persisting data back into Delta tables was typically handled by background services or pipelines. Initially, we used a similar pattern for the Vector Search Index as well; today it handles 1000+ QPS.
Serving all our data through the OLTP layer supported our scaling strategy: it can handle thousands of simultaneous queries across dozens of open connections.
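To make this concrete, here is a minimal sketch of how an app backend can read from a Lakebase synced table over the standard Postgres protocol. The host, credentials, and table name are placeholders I've made up for illustration, not values from the real project.

```python
# Minimal sketch: reading from a Lakebase (Postgres-compatible) synced table
# inside an app backend. Host, credentials, and table names are placeholders.
import os
import psycopg2

def top_prospects(limit: int = 20):
    # Lakebase speaks the Postgres wire protocol, so a standard driver works.
    # We assume the connection details and a short-lived OAuth token are
    # injected via environment variables; adapt to your own secret handling.
    conn = psycopg2.connect(
        host=os.environ["LAKEBASE_HOST"],
        dbname=os.environ.get("LAKEBASE_DB", "databricks_postgres"),
        user=os.environ["LAKEBASE_USER"],
        password=os.environ["LAKEBASE_TOKEN"],  # OAuth token used as the password
        sslmode="require",
    )
    try:
        with conn.cursor() as cur:
            # `players_synced` stands in for a table synced from Delta.
            cur.execute(
                "SELECT player_id, name, scouting_score "
                "FROM players_synced ORDER BY scouting_score DESC LIMIT %s",
                (limit,),
            )
            return cur.fetchall()
    finally:
        conn.close()
```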
CI/CD to set up apps on Databricks in minutes
One of the most remarkable highlights for us was the Databricks CLI for Apps and the overall CI/CD capabilities. If you have worked with AWS, Azure, or GCP, you'll appreciate how fast and easy it is to set up an application from scratch on Databricks. You can do it in just a few minutes:
- Create a project
- Optionally set up IaC with Asset Bundles
- Upload your code
- Deploy
And voilà! Integrate it with GitHub Actions, and you will have a production-ready CI/CD pipeline in minutes. You can find samples in the GitHub Repository to see it in action.
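As a rough sketch, this is the kind of deploy step you might call from a GitHub Actions job once the Databricks CLI is installed and authenticated. The app name and workspace path are placeholders, and you should double-check the exact CLI flags against your CLI version.

```python
# Minimal sketch of a deploy step for a GitHub Actions job. Assumes the
# Databricks CLI is installed and DATABRICKS_HOST/DATABRICKS_TOKEN are set.
# The app name and workspace path are placeholders, not real project values.
import subprocess

APP_NAME = "scouting-portal"                      # hypothetical app name
WORKSPACE_PATH = f"/Workspace/Shared/{APP_NAME}"  # where the app source will live

def run(cmd: list[str]) -> None:
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

if __name__ == "__main__":
    # 1. Upload the local app source code to the workspace.
    run(["databricks", "sync", "./app", WORKSPACE_PATH])
    # 2. Trigger a new deployment of the Databricks App from that path.
    run(["databricks", "apps", "deploy", APP_NAME,
         "--source-code-path", WORKSPACE_PATH])
```

In GitHub Actions, the workflow only needs to install the CLI, export the host and token from repository secrets, and run this script (or the equivalent shell commands) on every push to main.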
Phase II. Application scale-up
Little did we know that the client’s ambitions went beyond an MVP app.
For the next phase, we had to build a complete self-service AI portal, so that baseball scouts could get everything done in one application without reaching for other platforms. It had to compete with mature AI SaaS products and:
- Provide ultra-low latency for high load (1000+ AI QPS)
- Handle large sets of documents simultaneously
- Manage file upload and sharing
- Have a custom set of prompts

At this point, we had to decide whether to migrate or not. Databricks Apps had been ideal for building a custom MVP app. But would it work at scale?
Option 1. Migrate to Databricks and AWS
Back to where we started, we designed this beautiful architecture once again. So many boxes and arrows, it almost looked like a subway map.

Option 2. 100% Databricks architecture
Before moving on with this scenario, we talked to the team from Databricks. They suggested that we try building everything on Databricks. It would scale entirely within the platform and look like this.

Although it seemed promising, in the background, I spent a couple of nights preparing for a possible migration, just in case. But it wasn't necessary.
That said, in the process, I discovered another important advantage. The Databricks SDK is so well-designed for migration scenarios that we’re not locked into the Databricks Apps platform. We had complete freedom to stay or to move.
Migrate or stay on Databricks?
To make the right decision, we carefully listed the advantages and disadvantages of both approaches.
Benefits of migrating to AWS
- More customization
- Custom auth and SSO, not limited to enterprise users
- Multiple scalability options
Benefits of staying on Databricks
- Automatic scaling and resource sharing
- Built-in auth and security
- Managed infrastructure
- Uniform observability
- No need for an extra connector layer
- Option to migrate at any time
- Coordinated development between the AI and software teams
An important drawback in this scenario is that, so far, the strengths of Databricks Apps stand out mostly when building enterprise-level internal applications.
But the most crucial question remained: Can we actually scale a 100% Databricks application?
Scaling options
Turns out we had several options for scaling.
I'm the kind of person who has to see it to believe it. So, I tested our most critical tasks involving multiple AI gateway queries, read/write operations to OLTP databases, Delta tables, and a Vector Search Index. Keep in mind that these are my own measurements, not official numbers.
- Single app instance: Databricks Apps handled 70-100 QPS in my tests.
- Cross-app scaling: You can multiply performance by adding instances manually, at the cost of roughly 30 ms of extra latency. It's not automatic, but it's easy to implement.
- Serving endpoints: The service we used the most. It provides effectively unlimited endpoints that scale automatically, much like Lambda functions, with very low added latency (about 20 ms). See the sketch after this list.
- Jobs: Latency is high (10-15 seconds), but you can install any package and run it on customizable clusters for heavy-duty work like document conversion or image manipulation.
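To show what "keep the app thin" looks like in practice, here's a minimal sketch of offloading model inference to a serving endpoint rather than running it inside the app process. The endpoint name and payload shape are assumptions for illustration; the invocations route follows the standard Databricks model serving REST API.

```python
# Minimal sketch: the app stays a thin request broker and delegates model
# inference to an autoscaling serving endpoint. The endpoint name and payload
# shape are placeholders for illustration.
import os
import requests

DATABRICKS_HOST = os.environ["DATABRICKS_HOST"]    # e.g. https://<workspace>.cloud.databricks.com
DATABRICKS_TOKEN = os.environ["DATABRICKS_TOKEN"]  # app or service principal token
ENDPOINT_NAME = "player-report-scorer"             # hypothetical serving endpoint

def score_player(features: dict) -> dict:
    # Standard model serving invocations route; the endpoint scales on its own,
    # so heavy inference never runs inside the app container.
    url = f"{DATABRICKS_HOST}/serving-endpoints/{ENDPOINT_NAME}/invocations"
    resp = requests.post(
        url,
        headers={"Authorization": f"Bearer {DATABRICKS_TOKEN}"},
        json={"dataframe_records": [features]},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()
```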
Build vs. buy dilemma
Did the client make the right call to build instead of buying? After all, most of the features we implemented in the Databricks custom apps were already available elsewhere.
Here are a few important aspects to consider before making this choice:
- No compromises. You can build around your data and formats. No need to transform and simplify your data just to be compatible with a ready-made product.
- Custom UX and workflows. Your software is built for you, not the other way around. So you don’t need to waste time adapting your internal workflows to fit an off-the-shelf product.
- Complete ownership. You own the agents, models, and prompts, keeping your IP secure and safe in an all-in-one environment.
Plus, you spare yourself growing license fees as you scale up. And if you already use Databricks for data and AI, you gain a lot by extending with the Databricks Apps platform.
Afterthought
Nothing is impossible with Databricks. We thought building a production-ready AI app in 30 days couldn't be done, but here we are.
So stop overthinking your architecture and start building. Get my GitHub repo with (nearly) everything I’ve just described for a head start. Because sometimes the best way to learn is to steal.
Or reach out to me directly if you want to skip the trial-and-error and get straight to building something that actually works.
Looking for a certified Databricks partner who can get you to production faster than your competition?
Contact the Hilfy team