Only 10 percent of organizations have formal AI policies in place, highlighting the urgent need for clear employee guidelines on AI data protection and security. Organizations should prioritize data protection by avoiding the use of confidential information in AI systems where possible, or by implementing security controls such as pseudonymization when sensitive data is necessary. Risks can be further mitigated by establishing guidelines for the responsible use of AI, employing progressive disclosure to give users transparency without revealing sensitive information and partnering with robust technology providers who can help build secure solutions. Maintaining AI data security requires ongoing effort, including regular policy updates, employee education, continuous monitoring and stakeholder engagement.
In the rapidly evolving landscape of AI, data protection has become a paramount concern.
The FTC has made it clear: Model-as-a-service companies must honor their privacy commitments and refrain from using customer data for undisclosed purposes, or face serious consequences, including the deletion of unlawfully obtained data and models. For enterprises leveraging AI tools, particularly generative AI built on large language models (LLMs) or extensive internal datasets, the stakes are high. A data breach could expose confidential customer information, leading to significant liability. But the risk doesn't stop there. Employees or customers may inadvertently input confidential company data or other private information into these generative AI tools. Without robust safeguards, this data could be exposed, putting the enterprise at risk of legal repercussions and damaging its reputation. Additionally, in the United States, it is considered unfair or deceptive for a company to adopt more permissive data practices, such as sharing consumer data with third parties or using it for AI training, without clear and upfront communication. Attempting to announce such changes only through surreptitious, retroactive amendments to terms of service or privacy policies can result in severe legal consequences.
However, data and AI are symbiotic and essential to each other's success. AI models rely on vast amounts of data to learn, adapt and improve. Without high-quality, secure data, AI systems cannot function effectively, leading to stunted growth and potential failure. Conversely, AI can enhance data management, providing insights and efficiencies that were previously unattainable. While some organizations have responded to these security risks by banning AI tools outright, the more sustainable path is to prioritize the security and privacy of data as organizations increasingly rely on AI, particularly generative AI, to foster innovation and enhance efficiency.
Clear policies and guidelines for employees are the essential starting point, yet according to an Information Systems Audit and Control Association (ISACA) survey, only 10 percent of organizations have a formal, comprehensive generative AI policy in place. This article explores the top five strategies your enterprise can use to protect and secure data when using AI and creating an AI company policy, emphasizing the importance of ethical guidelines, data masking, pseudonymization and transparency.
Ensure your organization is equipped to handle the challenges of generative AI. Our generative AI risk management playbook provides the insights and tools you need to stay secure. Download it today and fortify your data defenses. Access the guide
Before diving into data protection specifics, it's essential to establish comprehensive ethical and responsible AI usage guidelines. These guidelines should address not only data security risks but also other potential risks throughout the AI lifecycle. By setting clear standards, organizations can ensure that their AI initiatives align with ethical AI implementation principles and regulatory requirements.
One of the most effective ways to safeguard data in generative AI is to avoid using confidential data entirely—either within the LLM training data or as inputs into generative AI tools. By eliminating confidential data from the training and input datasets, organizations can significantly reduce the risk of data breaches and privacy violations.
Practical steps:
Example of when not to use confidential data: In the energy and commodities sector, a generative AI tool designed to predict market trends and prices could avoid using proprietary customer data, such as individual trading strategies, or personal data, such as individual transaction histories, as input or for training the LLM. Using this sensitive data could lead to competitive disadvantages, breaches of confidentiality agreements and potential regulatory violations. Instead, the AI tool could be trained on aggregated, anonymized market data to ensure compliance with privacy standards and to protect the proprietary information of individual customers.
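As a minimal sketch of this approach, the snippet below collapses hypothetical per-customer trade records into aggregated, market-level statistics before anything reaches a training pipeline; the column names and the pandas-based workflow are illustrative assumptions rather than a prescribed implementation.

```python
import pandas as pd

# Hypothetical raw records: one row per customer transaction.
# Fields such as "customer_id" are confidential and must never
# reach the training dataset.
raw_trades = pd.DataFrame({
    "customer_id": ["C001", "C002", "C001", "C003"],
    "commodity":   ["crude", "crude", "gas", "gas"],
    "trade_date":  ["2024-01-02"] * 4,
    "price":       [78.1, 78.4, 2.61, 2.58],
    "volume":      [1000, 500, 2000, 1500],
})

# Keep only aggregated, market-level features: per commodity and
# date, compute total volume and a volume-weighted average price.
aggregated = (
    raw_trades
    .assign(notional=lambda df: df["price"] * df["volume"])
    .groupby(["commodity", "trade_date"], as_index=False)
    .agg(total_volume=("volume", "sum"),
         total_notional=("notional", "sum"))
    .assign(vwap=lambda df: df["total_notional"] / df["total_volume"])
    .drop(columns="total_notional")
)

# "aggregated" carries no customer identifiers or individual
# trading strategies, so it is far safer to use for training.
print(aggregated)
```

The key design choice is that confidential fields are dropped at the aggregation step, so no downstream component, including the model itself, ever has the opportunity to memorize or leak them.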
When confidential data is necessary, data masking and pseudonymization are effective techniques to protect it. These methods obfuscate data to help prevent unauthorized access while maintaining its utility for AI applications. Data masking protects confidential data by replacing or obscuring real values while preserving the format and usefulness of the dataset.
Data masking techniques:
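Common techniques include substitution (swapping a real value for a realistic fake one), character masking (hiding part of a value) and redaction (removing a field entirely). The snippet below is a minimal sketch of all three, assuming hypothetical field names and masking rules rather than any particular masking product:

```python
import random

def mask_chars(value: str, visible: int = 4) -> str:
    """Character masking: hide all but the last `visible` characters."""
    hidden = max(len(value) - visible, 0)
    return "*" * hidden + value[-visible:]

def substitute_name(_real_name: str) -> str:
    """Substitution: replace a real name with a plausible fake one."""
    return random.choice(["Alex Smith", "Jordan Lee", "Sam Patel"])

record = {
    "name": "Maria Gonzalez",
    "account_number": "8839271002",
    "email": "maria.g@example.com",
}

masked = {
    "name": substitute_name(record["name"]),
    "account_number": mask_chars(record["account_number"]),
    # Redaction: drop the value entirely when it adds no utility.
    "email": "[REDACTED]",
}

# e.g. {'name': 'Sam Patel', 'account_number': '******1002', 'email': '[REDACTED]'}
print(masked)
```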
Pseudonymization is a data protection technique that replaces identifiable information within a dataset with pseudonyms or artificial identifiers. Unlike anonymization, which removes all identifying information, pseudonymization allows the data to be reidentified, if necessary, by using a separate key or mapping system.
Pseudonymization techniques:
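A common technique is tokenization: each real identifier is replaced with a randomly generated token, and the token-to-identifier mapping is kept as the separate reidentification key. The sketch below uses a simple in-memory mapping for illustration; in practice the mapping would live in a separately secured store with its own access controls:

```python
import uuid

# The mapping is the "key": store it apart from the pseudonymized
# dataset, under stricter access controls.
pseudonym_map: dict[str, str] = {}

def pseudonymize(identifier: str) -> str:
    """Replace a real identifier with a stable artificial token."""
    if identifier not in pseudonym_map:
        pseudonym_map[identifier] = f"PSN-{uuid.uuid4().hex[:8]}"
    return pseudonym_map[identifier]

def reidentify(token: str) -> str | None:
    """Reverse lookup, permitted only for authorized holders of the key."""
    for real_id, pseudonym in pseudonym_map.items():
        if pseudonym == token:
            return real_id
    return None

token = pseudonymize("patient-12345")
print(token)              # e.g. PSN-3fa85f64
print(reidentify(token))  # patient-12345
```

Unlike the masking example above, this transformation is reversible by design, which is what allows reidentification when a legitimate need arises.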
Transparency is crucial for building trust in AI systems, but it must be balanced with the need to protect the model's inner workings from misuse. Progressive disclosure, or "detail on demand," is a strategy that allows users to understand AI outputs without revealing too much about the model's internal processes.
How it works:
Example of detail on demand: Consider a generative AI tool used in healthcare for diagnosing diseases based on patient symptoms and medical history. Initially, the AI provides a high-level explanation, such as "Based on the symptoms and medical history, the AI suggests a diagnosis of Disease X." If more information is needed, the healthcare professional can request additional details, like "The AI identified Symptom A and B, common in Disease X, and the patient's history of Condition Y increases the likelihood." This approach maintains transparency, builds trust and protects the model's complex internal workings from potential misuse.
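One possible shape for this pattern is sketched below: a response object that releases one tier of explanation per request, mirroring the healthcare example above. The class design, tiers and field names are assumptions made for illustration, not a standard API.

```python
from dataclasses import dataclass, field

@dataclass
class DiagnosisResponse:
    """Progressive disclosure: reveal one level of detail per request."""
    summary: str
    details: list[str] = field(default_factory=list)
    _level: int = 0  # how much detail has been disclosed so far

    def explain(self) -> str:
        if self._level == 0:
            self._level += 1
            return self.summary
        if self._level <= len(self.details):
            detail = self.details[self._level - 1]
            self._level += 1
            return detail
        return "No further detail is available for this result."

response = DiagnosisResponse(
    summary="Based on the symptoms and medical history, the AI suggests a diagnosis of Disease X.",
    details=[
        "The AI identified Symptoms A and B, which are common in Disease X.",
        "The patient's history of Condition Y increases the likelihood of Disease X.",
    ],
)

print(response.explain())  # high-level summary first
print(response.explain())  # more detail, only on demand
```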
Benefits:
Major technology providers offer advanced data privacy and security solutions that can significantly mitigate data security risks associated with AI. Partnering with these providers can enhance your organization's data protection capabilities. Cloud storage solutions offered by major technology providers are also designed with advanced security measures to protect sensitive data. These providers use encryption, both at rest and in transit, to ensure that data is unreadable to unauthorized individuals. They also offer features like multifactor authentication, access controls and data masking services to further enhance data protection. Storing data in the cloud also allows for real-time monitoring and threat detection. This means that potential security breaches can be identified and addressed promptly, minimizing the risk of data loss or exposure. Moreover, cloud storage solutions are scalable and flexible, allowing organizations to easily adjust their storage capacity as their data needs change. This is particularly important for AI applications, which often require large amounts of data.
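As a complement to provider-side controls, some organizations also encrypt sensitive records client-side before they leave the corporate environment, so the cloud provider only ever stores ciphertext. Below is a minimal sketch using the Fernet recipe from the widely used cryptography package; the inline key handling is deliberately simplified, and a real deployment would use a key-management service.

```python
from cryptography.fernet import Fernet

# Generate a data-encryption key. In production this key would be
# created and held by a key-management service, never stored
# alongside the data it protects.
key = Fernet.generate_key()
fernet = Fernet(key)

plaintext = b"sensitive training record: account 8839271002"

# Encrypt locally before upload, so the provider stores only
# ciphertext: encryption at rest under your own key.
ciphertext = fernet.encrypt(plaintext)

# ... upload `ciphertext` to cloud storage here ...

# Authorized consumers decrypt after download.
restored = fernet.decrypt(ciphertext)
assert restored == plaintext
```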
Considerations:
Protecting data in the era of AI requires a multifaceted approach that includes ethical guidelines, data avoidance, security controls and balanced transparency. By implementing these strategies, organizations can safeguard sensitive information, comply with regulatory requirements and build trust with customers and stakeholders.
As AI, and particularly generative AI, continues to evolve, the importance of data protection cannot be overstated. By adopting these top five strategies, organizations can navigate the complexities of AI data security, ensuring that their innovations are both effective and ethical. Remember, the goal is not only to protect data but also to foster a culture of trust and responsibility in the AI landscape.
By taking these proactive steps, organizations can harness the power of AI while maintaining the highest standards of data security and privacy.
Trust matters. That’s why Publicis Sapient’s generative AI solutions leverage a deep understanding of data management, ethics, governance and risk to create systems that scale. Grow your business by reducing risk while improving the efficiency of software development and resource allocation. View our generative AI solutions brochure
Sucharita Venkatesh
Senior Director General Management, Publicis Sapient, London, United Kingdom
Todd Cherkasky
GVP Customer Experience & Innovation Consulting, Publicis Sapient, Chicago, IL
Francesca Sorrentino
Client Partner, Publicis Sapient, London, United Kingdom