
Ten Pipeline Fixes, Zero Extra API Cost

Jiwa AI Team

The Critique That Changed Everything

We ran a brutal self-audit on our content generation pipeline. Not the kind where you pat yourself on the back for what works, but the kind where you look at every stage and ask: where are we leaving quality on the table?

The answer was uncomfortable. We had built a surprisingly capable system (multimodal product analysis, DNA-based evaluation, five-strategy image generation with fallback chains), but the pieces weren't talking to each other. Product components were identified during onboarding and then forgotten. Positioning data was computed and thrown away. Caption prompts had no idea what a product should never be called. The system knew more than it was using.

The Constraint: Fix Everything, Add Nothing

The rule was simple. Every improvement had to work within the existing cost envelope of roughly thirty-seven cents per onboarding. No new API calls, no additional image generations, no extra AI evaluations. The only lever we could pull was making existing calls smarter.

This constraint forced a specific kind of thinking. Instead of asking "what new capability should we add?" we asked "what information are we already computing but not threading through?" The answer turned out to be: a lot.

Products Are Not Just Names

The biggest conceptual shift was how we think about products. Previously, a product was a name, a description, and a highlight angle. That's how a database thinks about a product. It's not how an influencer thinks about one.

An influencer holding a skincare set doesn't interact with "Glow Kit Premium." They squeeze the dropper, peel open the sheet mask, scoop moisturizer from the ceramic jar. Each component has different visual appeal: the iridescent dropper bottle is far more photogenic than the cardboard box it came in. Each component demands a different camera angle, a different interaction, a different post type.

We restructured product analysis to capture this. Every component gets a saliency score (how instagrammable is it?) along with recommended camera angles, natural interaction descriptions, and which post types suit it best. The analysis still happens in the same single Haiku Vision call during onboarding. We just ask for more structured output.

The payoff comes downstream. When the content calendar is generated, each post is assigned a specific component. Instead of six posts that all show "someone holding the product," you get a close-up of the dropper, a flat lay of the full set, a medium shot of the mask being peeled, a tutorial-style carousel about the moisturizer. Same number of images, dramatically more variety.
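The shape of that structured output, and the downstream assignment, can be sketched roughly like this. Field and function names are illustrative, not our actual schema:

```python
from dataclasses import dataclass

@dataclass
class ProductComponent:
    # Hypothetical shape of the structured output requested from the
    # single Haiku Vision call during onboarding.
    name: str
    saliency: float          # 0-1: how instagrammable is this component?
    camera_angles: list      # e.g. ["macro", "overhead flat lay"]
    interaction: str         # e.g. "squeeze the dropper"
    post_types: list         # e.g. ["close_up", "tutorial_carousel"]

def assign_components(posts, components):
    """Spread the most salient components across the calendar so
    consecutive posts don't all show the same thing."""
    ranked = sorted(components, key=lambda c: c.saliency, reverse=True)
    return [(post, ranked[i % len(ranked)]) for i, post in enumerate(posts)]
```

A simple round-robin over saliency-ranked components is enough to turn six "someone holding the product" shots into six distinct framings without generating a single extra image.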

Telling the AI What Not to Do

One of the subtler problems in AI content generation is forced positioning. Ask an AI to write about a sourdough biscuit for a fitness influencer, and it will cheerfully describe it as "the perfect post-workout recovery fuel." It sounds great. It's also completely dishonest.

We now run a product positioning analysis that produces explicit forbidden framings. A sourdough biscuit gets tagged with "never position as: workout fuel, recovery drink, health supplement." A protein bar gets "never position as: dessert, candy, junk food." These guards flow all the way into the caption generation prompt as hard constraints.

This required zero additional AI calls. The positioning analysis runs inside an existing brand analysis step. The guard data is persisted to the database and injected as text into prompts we were already making. The AI just receives better instructions.
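A minimal sketch of that injection, assuming guards are stored as a list of strings per product (function name and prompt wording are hypothetical):

```python
def build_caption_prompt(product, guards, base_prompt):
    """Inject persisted positioning guards as hard constraints into a
    caption prompt we were already sending. Zero additional AI calls."""
    if not guards:
        return base_prompt
    forbidden = ", ".join(guards)
    return (
        f"{base_prompt}\n\n"
        f"HARD CONSTRAINT: never position {product} as any of: {forbidden}. "
        f"If a draft implies one of these framings, rewrite it."
    )
```

The guard text costs a few dozen tokens inside a call that was already being made, which is why it lands at zero marginal cost.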

Fixing What We Already Had

Some improvements were embarrassingly simple. Our image generation was producing 1024-pixel squares; Instagram's native resolution is 1080. A two-line fix. Our face consistency model was using a generic image adapter instead of a face-specific one. A one-line fix. Our website scraper was truncating brand content at four thousand characters, cutting off product descriptions that appeared further down the page. A one-number change.
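In spirit, these were configuration-level changes along these lines. The names and the corrected scraper limit are illustrative assumptions, not our actual values:

```python
# Illustrative before/after of the three small fixes described above.
IMAGE_SIZE = 1080        # was 1024; Instagram's native square resolution
FACE_ADAPTER = "faceid"  # was a generic image adapter, not face-specific
SCRAPE_MAX_CHARS = 12_000  # was 4_000; assumed new limit, stops truncation
```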

These aren't glamorous. But stacked together, they meaningfully improve output quality at exactly zero cost increase.

The Hashtag Problem Nobody Mentions

AI-generated hashtags sound reasonable but perform terribly. They're grammatically plausible but disconnected from actual Instagram discovery patterns. A generated tag like "HealthySnacking" has a fraction of the reach of the established Indonesian-market tag "MakanEnak."

We built a curated database of sixty-plus Indonesian hashtags across seven industry categories, each tagged by volume tier: high, mid, or niche. After the AI generates its hashtags, a zero-cost post-processor blends in two to three high-volume curated tags, keeps the best AI suggestions as mid-tier, and adds a couple of niche tags for specificity. The result is an eight-to-twelve tag set that balances discoverability with relevance.
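A blender along these lines is pure post-processing. The function name, tier keys, and tag lists here are illustrative:

```python
import random

def blend_hashtags(ai_tags, curated, rng=random):
    """Zero-cost post-processor: mix curated high-volume and niche tags
    with the AI's suggestions. `curated` maps tier -> list of tags."""
    high = rng.sample(curated["high"], k=min(3, len(curated["high"])))
    niche = rng.sample(curated["niche"], k=min(2, len(curated["niche"])))
    mid = ai_tags[:5]  # keep the best AI suggestions as the mid tier
    # Deduplicate case-insensitively while preserving order; cap at 12.
    seen, blended = set(), []
    for tag in high + mid + niche:
        if tag.lower() not in seen:
            seen.add(tag.lower())
            blended.append(tag)
    return blended[:12]
```

Putting curated high-volume tags first means discovery-driving tags survive the cap even when the AI over-generates.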

Building the Feedback Loop

The most important change isn't something users will notice today. We added engagement tracking infrastructure: a model for recording likes, comments, saves, and reach after posts are published, and the math to correlate predicted quality scores with actual performance.

Right now, our quality scoring is Claude evaluating Claude's output. Self-assessment with no ground truth. As engagement data accumulates, we can start calibrating: what does a quality score of seventy actually mean in engagement terms? Which caption style (emotional storytelling or informational tips) performs better for food brands versus beauty brands?

This infrastructure added one database model and one API endpoint. No AI calls. But it's the foundation that turns a static content generator into a system that learns.
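The calibration math itself is plain statistics. A sketch of the Pearson correlation between predicted quality scores and observed engagement (however the engagement signals end up weighted), with no AI calls involved:

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between predicted quality scores and
    observed engagement. Values near 1 mean the scorer tracks reality;
    values near 0 mean self-assessment isn't predicting anything."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```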

What It Costs

The complete per-onboarding cost breakdown after all ten improvements:

About four cents for AI text intelligence: brand analysis, theme extraction, mood board, calendar planning, caption generation with A/B variants, and quality scoring. Roughly thirty-one cents for image generation: face-consistent influencer photos, product composites with adaptive shadows, and carousel slides. Another two and a half cents for multi-layer quality evaluation.

Total: approximately thirty-seven cents per onboarding. The same as before. Every improvement was achieved by making existing calls carry more information, not by adding new ones.
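As a sanity check on the arithmetic (USD, using the rounded figures quoted above):

```python
# Per-onboarding cost breakdown, as quoted in the text.
TEXT_INTELLIGENCE = 0.04    # brand analysis, themes, calendar, captions, scoring
IMAGE_GENERATION = 0.31     # face-consistent photos, composites, carousels
QUALITY_EVALUATION = 0.025  # multi-layer quality evaluation

total = TEXT_INTELLIGENCE + IMAGE_GENERATION + QUALITY_EVALUATION
# 0.375, i.e. the "approximately thirty-seven cents" quoted above.
assert abs(total - 0.375) < 1e-9
```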

The most expensive thing in an AI pipeline isn't the API calls you make. It's the intelligence you compute and then throw away.