Language First: Why Content Generation Must Follow the Business's Language
The Disconnect We Didn't Notice
For months, our pipeline handled language the same way most AI tools do: it tried to figure it out from context. If the website was in Indonesian, the captions leaned Indonesian. If the Claude response happened to come back in English, the captions were in English. The result was technically functional but subtly inconsistent: some posts in Bahasa Indonesia, some in English, occasionally a mix in the same caption.
For most software products, that's a minor inconvenience. For Indonesian SMBs, it's a credibility problem. An Indonesian café posting captions in English to a local Indonesian audience signals something unintended: that the content wasn't made for them. The best-performing Indonesian Instagram accounts code-switch deliberately, when they mean to. They don't drift into English because the AI made an ambiguous inference.
The Signal Is Already in the Message
The right source for language detection turns out to be the WhatsApp command that starts the onboarding process.
Jiwa AI onboards businesses through WhatsApp, a deliberate choice for the Southeast Asian market where WhatsApp is the primary business communication channel. When a business owner wants to create their account, they send a command to our number. There are two options: /daftar (the Indonesian word for "register") and /onboard (English). That choice isn't incidental. A business owner who types /daftar is telling us something about their language context: they read the Indonesian instructions, they're comfortable in Bahasa, and they expect to work in Indonesian. /onboard signals the reverse.
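As a minimal sketch of how the two commands above could map to a stored language, assuming hypothetical names (`COMMAND_LANGUAGE`, `language_from_command`) and ISO-style codes `"id"` and `"en"` that are not taken from the actual codebase:

```python
from typing import Optional

# Hypothetical mapping. The command names come from the text above;
# the language codes ("id", "en") are an assumption for illustration.
COMMAND_LANGUAGE = {
    "/daftar": "id",   # Indonesian for "register"
    "/onboard": "en",
}


def language_from_command(message: str) -> Optional[str]:
    """Return the language implied by an onboarding command, or None.

    Only the first whitespace-separated token is inspected, so any text
    the owner types after the command is ignored.
    """
    parts = message.strip().split(maxsplit=1)
    if not parts:
        return None
    return COMMAND_LANGUAGE.get(parts[0].lower())
```

Returning `None` for anything that is not a recognized command keeps the decision explicit: the caller has to handle the missing signal rather than silently falling back to inference.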
We were already capturing this signal and storing it as the business's preferred language. What we weren't doing was passing it through every downstream step that generates content. The language lived in the database, but it wasn't being handed to the calendar planner, the caption generator prompts, or the theme descriptions that inform the image generation pipeline.
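One way to picture the gap is a context object that every generation step must receive. The sketch below is illustrative only: `GenerationContext` and the three step functions are hypothetical stand-ins, and the real steps call a model rather than returning dictionaries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenerationContext:
    """Hypothetical per-business context handed to every generation step."""
    business_name: str
    language: str  # e.g. "id" or "en", as stored at onboarding (assumed codes)


def plan_calendar(ctx: GenerationContext) -> dict:
    # Stand-in for the calendar planner; only the language hand-off is shown.
    return {"step": "calendar", "language": ctx.language}


def write_caption(ctx: GenerationContext, theme: str) -> dict:
    # Stand-in for the caption generator.
    return {"step": "caption", "language": ctx.language, "theme": theme}


def describe_theme_for_image(ctx: GenerationContext, theme: str) -> dict:
    # Stand-in for the theme description that feeds image generation.
    return {"step": "image_theme", "language": ctx.language, "theme": theme}
```

Because `language` is a required field with no default, a step cannot be invoked without the stored preference being loaded first, which is the property the original pipeline was missing.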
Threading Language Through the Pipeline
The fix required changes at multiple layers. The most visible was the content calendar planner, the step where Claude generates the themes, scenes, and content angle descriptions for each post in the two-week calendar.
These theme descriptions are consequential. They're not just labels. They become the creative brief for each post: the specific scenario the influencer will be depicted in, the mood, the setting, the context that the image prompt will render. A theme written in English ("Morning padel session with protein bar recovery") is technically accurate but won't flow naturally into Indonesian captions. A theme written in Indonesian ("Sesi padel pagi hari dengan pemulihan protein bar") keeps the creative intent consistent across the entire chain.
We now pass the business's language setting into the calendar planner as an explicit instruction. When the language is Indonesian, the planner generates all themes, descriptions, and content angles in Bahasa Indonesia. Businesses that onboard in English get English themes. This small change has an outsized effect because the themes set the tone for everything downstream: the caption writer uses them as context, and captions written in the same language as the theme are more coherent.
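The explicit instruction might look something like the following. This is a sketch under assumptions: the prompt wording, the `LANGUAGE_INSTRUCTIONS` table, and `build_planner_prompt` are hypothetical and not the production prompt.

```python
# Hypothetical instruction text per language code (codes are assumed).
LANGUAGE_INSTRUCTIONS = {
    "id": "Write all themes, descriptions, and content angles in Bahasa Indonesia.",
    "en": "Write all themes, descriptions, and content angles in English.",
}


def build_planner_prompt(business_summary: str, language: str) -> str:
    """Build a planner prompt that leads with an explicit language instruction."""
    try:
        instruction = LANGUAGE_INSTRUCTIONS[language]
    except KeyError:
        # Fail loudly rather than let the model guess the language from context.
        raise ValueError(f"Unsupported language: {language!r}") from None
    return (
        f"{instruction}\n\n"
        "Plan a two-week content calendar for this business:\n"
        f"{business_summary}"
    )
```

Raising on an unknown language code, instead of defaulting to one, avoids reintroducing the silent drift described earlier.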
Why This Can't Live in One Place
The tempting implementation is to translate everything at the end: generate in English, then translate to Indonesian before delivery. We explicitly chose not to do this.
Translation adds latency, cost, and a failure mode that's hard to catch. But more importantly, it produces content that reads like translation: technically correct but missing the idiomatic quality that makes Indonesian social content actually land. Indonesian Instagram captions have distinct stylistic patterns: the rhythm of the language, the use of particles, the code-switching with English loan words in specific product categories. These patterns emerge naturally when Claude writes in Indonesian from the start. They don't emerge from translation.
The same logic applies to language detection. Inferring language from website content is noisy: a bilingual website, a brand name in English on an Indonesian business, or a product category with dominant English terminology can all confuse inference. The WhatsApp command is a much cleaner signal precisely because it's a human decision made in an intentional context.
Consistency as a Form of Respect
There's a philosophy underlying this change that goes beyond technical correctness. Content generated in the wrong language isn't just an inconvenience; it's a signal that the system didn't really understand who it was serving.
Jiwa AI is built for Indonesian small businesses operating in a specific cultural and linguistic context. The post content, the caption tone, the hashtag strategy, the influencer scenes: all of it should feel like it was made for the audience that will actually see it. That starts with generating in the right language from the moment the creative brief is written.
When a business owner sends /daftar and two minutes later receives a complete content calendar with captions, themes, and hashtags, all in fluent, idiomatic Bahasa Indonesia, it feels like the system understood them. That trust is worth getting right.
We're continuing to invest in language fidelity across the pipeline. The current work ensures the correct language is used for calendar themes and captions. Future improvements will extend this into hashtag strategy (trending Indonesian hashtags versus global English ones) and seasonal context (Indonesian cultural moments versus generic Western content calendars). The foundation is in place. The language-first principle scales from there.