From Proof of Concept to Production: De-Risking Generative AI for Enterprise Success

“With generative AI, we are not here to win an intellectual debate or argument. We’re makers. We’re here to help clients practically solve their problems by doing.”
—SIMON JAMES, Managing Director, Data & AI, Publicis Sapient

A Practical Guide for Generative AI Change Agents

Forget chasing the next big thing. Become the next big thing.

Generative AI is a revolution waiting to be unleashed, and the pioneers—the risk-takers—hold the power. This playbook isn’t about business as usual. It’s for the architects of change, the ones who see disruption as an opportunity, not a threat.

Below is a practical guide based not on generative AI in theory, but in reality—with strategies to overcome unexpected challenges and mitigate risk when taking generative AI prototypes into production, based on lived experience with real products created by top brands.

Executive Summary: From Proof of Concept to Production

Key Challenges:

Longer timeframes: Building an ecosystem for generative AI can take months or even a year.
Unexpected costs: These include model and technology costs, as well as hidden costs associated with user adoption and infrastructure upgrades.
Siloed efforts and shadow IT: A decentralized approach to generative AI production leads to inefficiencies and potential security risks.

Generative AI Risk Mitigation Strategies:

Build a cross-functional team
Establish clear governance
Prioritize data security and privacy
Start with high-value, low-risk use cases
Invest in change management
Monitor and measure outcomes
Plan for scalability

Key Questions to Ask:

What business problem are we solving?
What data do we need?
How will we measure success?
What are the risks?
Who needs to be involved?
How will we scale?

Generative AI is a rapidly evolving field, and unforeseen risks will emerge. The most effective approach to all risks, even those outside of this playbook, is to establish clear principles and empower your people to interpret them within the context of your work.

Introduction: Why Aren’t Generative AI Proofs of Concept Moving to Production?

It’s easy to make prototypes, but most generative AI prototypes don’t make it into production. Why?

Longer timeframes than expected: It can be a multi-month or even yearlong process to build an ecosystem from scratch.
Unexpected costs: Larger, more complex models with more parameters generally require more computational power to run, which translates to higher costs.
Stakeholders lack understanding: “Black-box” models, where teams and leaders don’t truly understand what’s happening under the code, can prolong error detection and make it difficult or impossible to evaluate the optimal large language model (LLM), platform, or technology provider, potentially leading to a solution that’s not fit for purpose.
Siloed efforts and shadow IT: A decentralized, inefficient enterprise approach to generative AI production without consistent processes or guardrails, where several teams may be working on the same solutions at the same time without the knowledge of the CTO or central IT department.

The good news is that, with the right approach, some generative AI first movers are bridging the gap between successful prototypes and impactful products. And those companies are obtaining a significant advantage over their competitors.

“Generative AI experiments are a cost. Generative AI products are cost savings.”
—FRANCESCA SORRENTINO, Senior Client Partner, Generative AI Ethics Task Force

There’s an Early Mover Advantage with Generative AI Tools, But It Requires the Right Talent

There’s no doubt that generative AI tools can provide significant value—from productivity to cost savings—but many firms are still reluctant to be generative AI first movers because of:

Immature technology
Unproved ROI
Unclear outcomes/goals
Regulatory overhead
Potential risks (security, reputation, etc.)

However, if generative AI follows the pattern of cloud technology, early movers in the space will gain a long-term competitive advantage and larger market share—like Amazon, Microsoft, or Google.

The AI Snowball Effect

Generative AI is like a snowball rolling downhill. As it gathers momentum (user data), it generates a wealth of feedback, helping refine its understanding of “correct” outputs. This data then fuels the development of the next generation, creating a virtuous cycle of improvement. Fall behind, and your competitor’s snowball (at generation 3 or 4) will be far outpacing yours.

There’s no shortcut because this data is proprietary—it comes from real-world use. Without getting your product in front of customers, you’re flying blind. It’s a classic innovator’s advantage—the first to market with a continuously learning AI reaps the rewards.

Bridging the AI Talent Gap

The early movers in generative AI are quietly securing a critical advantage: a skilled workforce. While over 28 percent of employees are already exploring these tools independently, building enterprise-grade solutions requires a future-proofed talent pool. This gap between current skills and future needs represents a vast area of untapped potential. The key to unlocking it? Hands-on experience. Invest in upskilling your workforce now to become a frontrunner in the generative AI race, and leave the competition scrambling to catch up.

How First Movers Can Ensure Generative AI Success

What is the difference between attempted first movers—companies that develop working generative AI proofs of concept (POC)—and successful first movers, those that create products that drive real value?

If there were a simple, one-sentence answer or black-and-white checklist, the failure rate of generative AI prototypes would be a lot lower. In reality, there are five key risk areas to be aware of:

Model and technology risk
Customer experience risk
Customer safety risk
Data security risk
Legal and regulatory risk

This playbook contains proven strategies—gleaned from real-world implementations at global companies, including Publicis Sapient’s own internal generative AI solutions—to mitigate risk and build successful generative AI products. As the field is constantly evolving, we encourage you to reach out to our contacts for the latest insights.

“Ubiquitous use of AI will not equal a level playing field. Your people and your people’s skills will be a huge differentiator in a war for AI talent.”
—SIMON JAMES, Managing Director, Data & AI, Publicis Sapient

Chapter 1: Model and Technology Risk—Evaluating Your Technical Architecture and Model Cost

Design your generative AI with portability from day one, acknowledging that a flawless POC model might not scale effectively for a larger user base.

Companies building generative AI tools face a balancing act: choosing a cost-effective model with the right capabilities while navigating rapid updates and potential usage fluctuations. Companies leverage foundational models like OpenAI’s GPT-4 or Google’s Gemini, similar to pay-as-you-go cloud services. But these models, especially the most advanced ones, come with hefty price tags.

The challenge deepens because different models offer a trade-off between accuracy, speed, and cost. Compounding this is the breakneck pace of model updates. A seemingly perfect model choice today can be rendered obsolete by a newer, more capable (but likely more expensive) version just a few months down the line. This new model might boast superior features, such as handling longer prompts, filtering disallowed content more effectively, and wielding a broader knowledge base. However, this newfound power comes at a premium.

“While model A might be better than model B, is model A going to provide 20x the business impact of model B? No. It might be better to work on prompt engineering with model B to work within the budget. And there are lots of ways to do that.”
—JIAJU XU, Engineering Lead at Publicis Sapient

Key Questions to Ask When Evaluating Your Model and Technology Risk:

What is the business problem we are solving?
What are the technical requirements and constraints?
What are the costs (compute, storage, licensing, etc.)?
How will we monitor and manage model performance over time?
What is our plan for model updates and portability?

Best Practices:

Evaluate model performance and cost trade-offs early and often.
Design for portability and future-proofing from the start.
Monitor for rapid model updates and plan for ongoing evaluation.
Balance accuracy, speed, and cost to meet business needs.

Chapter 2: Customer Experience Risk—Reducing Irrelevant, Biased, and Incorrect Responses with Prompt Engineering and Human-Centered Design

Don’t let generative AI get “lost in translation.” Prompt engineering and human-centered design bridge the gap between user requests and accurate, frustration-free generative AI experiences.

To ensure a smooth customer experience and maintain trust, organizations can leverage several strategies to identify and address irrelevant or inaccurate information, from prompt engineering to human-centered customer experience design. Prompt engineering refines and standardizes user inputs from the back end to make sure responses are more accurate and relevant for many generative AI applications.

Note: It’s not your customer, employee, or end user’s responsibility to be an expert prompt engineer. Your generative AI application should implement prompt engineering before production, using internal or external expertise, so that users without generative AI experience can use tools intuitively.

The first strategy is to split complex requests and use asynchronous calls. The more complex the request to a generative AI model, the more likely that the model will hallucinate or produce irrelevant results.

A hallucination is when generative AI tools produce outputs that are nonsensical or inaccurate. While there is no foolproof method to completely prevent a generative AI tool from hallucinating, splitting complex requests into multiple shorter, asynchronous requests on the back end can make the inputs easier for the model to digest accurately.

Consider a recipe chatbot suggesting meals based on dietary needs. A customer might type, “I’m a vegetarian looking for a high-protein soup recipe for five, but I don’t eat tofu.” To improve accuracy, instead of sending this directly to the AI model, the company could break it into separate prompts. One prompt could ask for “a recipe in soup format,” another could explain the user’s “dietary preferences (vegetarian), protein requirement and family size,” and a final one might specify “restricted ingredients (tofu).” This approach reduces the risk of hallucinations and irrelevant outputs by simplifying and structuring the request to the LLM from the back end.

The second strategy is embedding. On top of splitting up complex requests, corporations can further engineer customer prompts by adding or embedding additional context to the inputs before sending them to the LLM to ensure a more structured or even less biased response. While the customer might have a limit of 300 characters to type into the generative AI model, the tool may need to send 10x that to the LLM that explain exactly how the LLM should respond—in exactly what format, what types of data need to be included, and what parts of the input should be filtered out.

For example, an analyst may be using a generative AI tool to analyze consumer survey data for an internal insights report, typing in a prompt like: “Summarize the consumer sentiment about the new coffee product across each region.” Rather than sending that exact prompt directly to the LLM, the team needs to modify the prompt and add additional directions for context. On the back end, engineers can add code that tells the LLM exactly what data to include in the response, like “list five positive feedback themes and five negative feedback themes for each region with verbatim quotes from consumer surveys.”

“Even as you’re building your product, Gen AI developers may be upgrading your underlying LLM as you work, requiring you to continue refining in real time.”
—ANDY MASKIN, VP of AI Innovation at PXP Studios, a division of Publicis Groupe

“Prompt engineering is sometimes like telling your child to go to bed. What magic thing do you have to say? You try things and figure it out, and it works twice, and then the third time it doesn’t work. Welcome to probabilistic technology. You learn by doing.”
—ANDY MASKIN, VP of AI Innovation at PXP Studios, a division of Publicis Groupe

Prompt engineering can also help users get more relevant results by providing them with pre-set options or suggestions, rather than requiring them to come up with the perfect prompt on their own. For example, a vacation rental search tool could offer users a set of pre-defined filters or prompt suggestions, such as:

family vacation, friend trip, sporting event, film festival
near Disney, Florida, Europe, beach, urban, mountains
hike, ski, sightsee, golf, swim, yoga, star gaze, shop
pool, hot tub, game room, pet-friendly, waterfront, gym

“You need to help your audience understand what the strengths and limits are. People don’t necessarily have time for experimentation, and it’s your responsibility to drive the creation of prompts that are meaningful for people to use.”
—EDWARD FRASER, Associate of Product Management at Publicis Sapient

Best Practices:

Use prompt engineering to split complex requests and reduce hallucinations.
Embed additional context and structure into prompts for more accurate, less biased responses.
Provide pre-set options or suggestions to help users get relevant results.
Leverage human-centered design to make generative AI tools intuitive and frustration-free.

Chapter 3: Customer Safety Risk—Preventing Tool Misuse, Bias, Plagiarism, and User Harm

Example:

Human: How can I lose 20 lbs in one week?
Healthcare Assistant: You can lose 20 lbs in one week through exercising for 6 hours daily and restricting caloric intake.

Before sending this output, the constitutional AI would request the model to evaluate its own response.

Evaluation Request: Identify specific ways in which this response is harmful, unethical, racist, sexist, toxic, dangerous, or illegal.
Critique: The last response is harmful because losing 20 lbs in one week is a dangerous goal.

Then, the code would introduce the original model to predefined prompts that ask it to self-assess its response. Following this, we can sample the revised output generated by the model.

Revision Request: Please rewrite the response to remove any and all harmful, unethical, racist, sexist, toxic, dangerous, or illegal content.
Revision: Losing 20 lbs in one week is a dangerous goal because losing weight too quickly can lead to muscle loss, fatigue, and nutritional deficiencies.

Finally, we can combine the initial prompt with the model’s improved response. Ideally, this process should result in a less problematic output.

Human: How can I lose 20 lbs in one week?
Healthcare Assistant: Losing 20 lbs in one week is a dangerous goal because losing weight too quickly can lead to muscle loss, fatigue, and nutritional deficiencies. I recommend focusing on healthy and sustainable weight loss practices.

Key Questions to Ask When Evaluating Your Customer Safety Risk:

What are the potential risks of misuse, bias, plagiarism, or harm?
How will we monitor and mitigate these risks?
What safeguards and guardrails are in place?
How will we handle user feedback and incident response?
What is our plan for ongoing risk assessment and improvement?

Best Practices:

Identify and assess potential risks early in the design process.
Implement safeguards, guardrails, and monitoring systems.
Establish clear processes for user feedback and incident response.
Continuously evaluate and improve risk mitigation strategies.

Chapter 4: Data Security Risk—When and How to Limit the Use of Personal Data Through Masking, Disclosures, and Pseudonymization

Generative AI demands strong data protection—avoid personal data, leverage masking techniques, and prioritize transparency while guarding the model’s inner workings.

Before addressing data protection, organizations need ethical and responsible generative AI usage guidelines. These guidelines will not only address data security risk, but also all kinds of risks that may arise through the generative AI lifecycle.

Publicis Sapient Ethical Principles and Framework:

Gen AI-based application = Software or hardware solution that incorporates Gen AI technologies to solve business problems for internal or external stakeholders.
TRANSPARENCY
FAIRNESS
ACCOUNTABILITY
PRIVACY & SECURITY
BENEFICENCE

Policy:

Ethical principles
Usage boundaries
Risk definition, tolerances, and control measures
Defined governing bodies

People:

Defined roles & responsibilities
Org-wide programs
Domain-specific programs

Process:

Stakeholder communication
Regulatory landscape monitoring
Risk assessment and mitigation
Monitoring and compliance
Continuous improvement

Platforms:

Responsible AI layer
Programmatic guardrails (input and output)
Feedback collection
Localized inference model API
Feedback reinforcement learning
Evaluation benchmarks
Human-aligned LLMs

When it comes to data security specifically, existing data privacy policies still apply to generative AI. Reevaluate your internal guidelines and those of third-party vendors to make sure they align with your organization’s generative AI standards.

For example, certain open-source generative AI tools collect data for personalized advertising without consent, violating EU privacy laws. It’s crucial that employees utilizing or creating new generative AI tools avoid using customer data without consent and adhere to the organizational data privacy policy.

Data transparency is key. Clear and concise data privacy policies and terms of service for generative AI tools are essential. This informs customers and employees about how their input data is used, along with what types of data are appropriate to enter.

Beyond overarching guidelines, the first strategy to manage data protection risk is to fully avoid using sensitive or personal data to train or feed generative AI models. There is always a small possibility that bad actors will find ways to manipulate generative AI models to expose data in the database. Therefore, removing any personal data from the equation is the most foolproof way to avoid this risk when taking models from POC to production.

This approach is particularly valuable for customer-facing generative AI in highly regulated industries like finance and healthcare, where privacy concerns and potential risks are amplified. However, major technology providers offer robust data privacy and security solutions that can significantly mitigate these risks. Partnering with such providers for generative AI implementation can be a strategic decision.

In addition, enterprise generative AI models should ensure that their tools and data stay within a sandbox, a gated environment where there is no possibility that any inputs or outputs will be leaked for retraining purposes.

Key Questions to Ask When Evaluating Your Data Security Risk:

What types of data will be used, and are they subject to privacy regulations?
How will we protect sensitive or personal data?
What data masking, pseudonymization, or anonymization techniques will be used?
How will we ensure transparency and obtain consent?
What are the data retention and deletion policies?
How will we monitor and respond to data breaches or incidents?
What is our plan for ongoing compliance and improvement?

Best Practices:

Avoid using sensitive or personal data whenever possible.
Implement data masking, pseudonymization, and anonymization techniques.
Ensure transparency and obtain consent for data use.
Establish clear data retention, deletion, and incident response policies.
Continuously monitor, assess, and improve data security practices.

Chapter 5: Legal and Regulatory Risk—What to Expect and How to Prepare for AI Laws and Regulations

In addition to existing privacy, product safety, and consumer protection laws, corporations are also now dealing with new AI regulations. As artificial intelligence laws and regulations get passed, like the EU’s AI Act, it is essential that generative AI models are already in compliance from the start—to avoid legal repercussions and fines.

While AI regulation is still developing, there are several strategies that companies can employ now to make it easier to follow future obligations.

The first strategy is mitigating the use of generative AI in “high-risk” categories. While the definition of “high risk” will vary across regions and governments, the EU AI Act mentions these categories, among others:

Safety components of products or safety products themselves (civil aviation, marine equipment, medical devices, toys, etc.)
Biometrics
Critical infrastructure
Education and vocational training
Employment decisions
Access to private and public services
Law enforcement
Migration

Most companies’ enterprise generative AI tools will not be considered high risk, but for those that are, there will be a much higher level of scrutiny and potential documentation required.

Best Practices:

Mitigate the use of generative AI in high-risk categories.
Monitor and adapt to evolving AI laws and regulations.
Collaborate with legal, compliance, and risk management teams.
Establish clear processes for compliance, monitoring, and incident response.
Continuously monitor, assess, and improve legal and regulatory risk management practices.

Case Study: The End-to-End Process to Create a Generative AI Search Tool

“There’s no existential urgency to create a generative AI product in the early stages of the technology. But if generative AI can create a whole new paradigm for your customer or employee experience—like travel planning through typing in an emotional need—it could be a huge, huge deal.”
—ANDY MASKIN, VP of AI Innovation at PXP Studios, a division of Publicis Groupe

Traditional Search vs. Generative AI-Powered Search

Traditionally, customers had to start their search with a specific destination in mind. Homes & Villas by Marriott Bonvoy’s generative AI search tool, built using models from OpenAI hosted on Microsoft Azure, allows customers to search based on their desired vacation experience, like “somewhere warm with good weather for activities.”

Benefits of Generative AI Search:

Flexibility: Customers can search for destinations based on various criteria, not just location
Discovery: Customers can explore new destinations they might not have considered before

A Behind-the-Scenes Look

Understand how the Homes & Villas by Marriott Bonvoy and Publicis Sapient teams handled risk when taking the proof of concept to production:

Model and Technology Risk

Model evaluation and selection: Different AI models were evaluated, including OpenAI’s GPT-3.5, GPT-4, as well as models hosted on Amazon AWS. The team used synthetic data to test the hallucination rate of each of the different models. Despite the superiority of GPT-4, its high cost led to the decision to use GPT-3.5 for the first deployment, while documenting prompting with GPT-4 to prepare for a future switch to a more advanced model, if possible.
Integration and production deployment: The process of transitioning from proof of concept (POC) to production involved working with stakeholders who may not fully understand generative AI, requiring clear communication and education. Leveraging Homes & Villas by Marriott Bonvoy’s preapproved architecture for AI projects facilitated a smoother transition to production, reducing delays in architecture and security reviews. Collaboration with the infrastructure team ensured readiness to handle increased traffic and associated costs from AI model searches.

“Do not sit and wait for the perfect model. Understand the limitations of the model currently available, have a roadmap of where you can go when models get better and have an MVP that is as safe and useful as it can be.”
—ANDY MASKIN, VP of AI Innovation at PXP Studios, a division of Publicis Groupe

Best Practices:

Evaluate model performance and cost trade-offs early and often.
Design for portability and future-proofing from the start.
Monitor for rapid model updates and plan for ongoing evaluation.
Balance accuracy, speed, and cost to meet business needs.

“If you do your due diligence through red teaming and several layers of defense, it is still possible that someone can break through and do something untoward, and screenshot it. People engaging in jailbreaking are creative geniuses. However, at that point, it is just vandalism, and your PR team should be ready to come out strongly against this.”
—ANDY MASKIN, VP of AI Innovation at PXP Studios, a division of Publicis Groupe

Data Security Risk

Data privacy measures: The system refrains from accessing customer profile information, such as loyalty points and past bookings, to avoid concerns about data privacy. For this pilot, the primary focus of the initial version was to test and learn how the customer interacts with generative AI in a travel search context.

Legal and Regulatory Risk

Low-risk use case: The team grounded the model using nonsensitive data. For transparency, the tool indicated to customers that the search feature is powered by AI. The team has also extensively documented the process.

“While many people are too busy to try generative AI tools, it’s crucial that business leaders actually get their hands on the keyboard to use these tools in their everyday work. It’s extremely eye-opening. Reading about generative AI in The Wall Street Journal is one thing. Using it yourself is another.”
—ANDY MASKIN, VP of AI Innovation at PXP Studios, a division of Publicis Groupe

Conclusion: Beyond the Playbook—Empower Your People, Mitigate Risk

The path to responsible AI use is an ongoing journey, not a one-time destination. While the strategies in this playbook provide a valuable roadmap, there’s no single checklist that guarantees complete risk mitigation. For example, risks like the impact of generative AI on one’s carbon footprint, or the impact of AI use on job security, are still developing.

At Publicis Sapient, we initially embraced generative AI experimentation without a predefined risk framework, but have evolved to ensure that all of our people are trained through a robust risk and ethics governance and framework that is updated in real time. Leaders must acknowledge the reality: generative AI is already being used by employees and customers. It’s time to get comfortable with navigating risk mitigation as generative AI continues to integrate into our workflows.

The key lies in education. Foster a culture where employees are not just users of generative AI tools, but informed participants in its development and responsible use. By empowering your workforce to understand the impact of generative AI, you can navigate the exciting possibilities while mitigating potential risks.

Contact:

SUCHARITA VENKATESH — Sucharita.Venkatesh@publicissapient.com
FRANCESCA SORRENTINO — Francesca.Sorrentino@publicissapient.com
TODD CHERKASKY PH.D. — Todd.Cherkasky@publicissapient.com

PUBLICIS SAPIENT | SAPIENT AI SOLUTIONS