✍️ The Growing Need to Define a Corporate Security Policy for LLM AI

Author(s): Christian Scott

Over the last few months, large language model artificial intelligence (LLM AI) tools such as ChatGPT have gained immense popularity, and many employees around the globe already use LLM AI, without any guidance, for daily tasks like writing code, creating reports, and creating new content. Without effective corporate guidance on how to use LLM AI safely and responsibly, employees can unintentionally expose their organization to many data privacy, intellectual property, and cybersecurity challenges.

One specific situation that highlights the risk of submitting sensitive information to LLM AIs like ChatGPT was OpenAI’s data-leakage security incident in March 2023. During that incident, the titles of active users’ chat histories and the first message of newly created conversations were exposed to other users. The same vulnerability also exposed payment-related information belonging to 1.2% of ChatGPT Plus subscribers.

Regardless of an organization's opinion of LLM AI, it's clear that boards and management need a policy governing its acceptable use. With a comprehensive policy, the board can help ensure employees use LLM AI in a responsible and ethical manner.

🤖What Exactly is ChatGPT and LLM AI?

ChatGPT is a large language model artificial intelligence (LLM AI) developed by OpenAI that was publicly released in November 2022. ChatGPT quickly became the fastest-growing consumer application in recent history, reaching an estimated 100 million monthly users in January 2023, roughly two months after its release, with an estimated 13 million people using it per day that month. By comparison, the next-fastest-growing app, TikTok, took about nine months to reach a similar level of user adoption.

ChatGPT is built on OpenAI’s GPT series of models (GPT-3.5 at launch, with GPT-4 later offered to paying subscribers). GPT stands for generative pre-trained transformer; the name indicates a large language model that estimates the probability of which words are likely to come next in a sequence. A large language model is a deep learning model, typically built on the transformer architecture, in which a neural network learns the context and patterns of a language, whether that is a spoken language or a computer programming language.
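To make the idea of "the probability of what words might come next" concrete, here is a deliberately tiny sketch in Python. It is not how ChatGPT works internally: it swaps the transformer neural network for a simple bigram word counter, and the corpus and function names are made up for illustration. It only shows what estimating next-word probabilities from training text means.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count how often each word is followed by each other word."""
    words = text.lower().split()
    follow_counts = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        follow_counts[current_word][next_word] += 1
    return follow_counts


def next_word_probabilities(model, word):
    """Return {candidate: probability} for words seen after `word`."""
    counts = model.get(word.lower(), Counter())
    total = sum(counts.values())
    if total == 0:
        return {}  # the word never appeared (or was never followed by anything)
    return {candidate: count / total for candidate, count in counts.items()}


# Hypothetical toy "training data"; a real LLM is trained on vastly more text.
corpus = "the policy covers the use of ai and the policy covers the risks"
model = train_bigram_model(corpus)
probs = next_word_probabilities(model, "the")
print(probs)
```

In this toy corpus, "the" is followed by "policy" twice and by "use" and "risks" once each, so the model rates "policy" as the most likely next word. A real LLM makes the same kind of estimate, but it conditions on the whole preceding context rather than a single word.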

For more general information on ChatGPT, please see: https://www.techrepublic.com/article/chatgpt-cheat-sheet/

📊How are Organizations Currently Leveraging ChatGPT?

Whether or not organizations are intentionally utilizing LLM AI, their staff are likely already using it to reduce time spent on tedious tasks, generate new ideas, and create various content. Common uses include:

Generating sales pitches, social media posts, blog articles, and other types of written content
Drafting emails and customer service correspondence
Assisting with the creation of software code
Summarizing reports, notes, and large datasets
Analyzing large datasets and business trends
Learning about topics and concepts related to an employee’s role

While LLM AI solutions have many significant time-saving benefits, they also come with a whole host of potential risks and ethical considerations.

💭Important Risks to Consider When Utilizing LLM AI

The use of LLM AI has inherent risks that business leaders should thoroughly understand before authorizing the use of LLM AI at their organization.

Data Confidentiality & Privacy Risks:

Information entered into LLM AI may become public or be used in a training dataset, which could result in the disclosure of sensitive company data. Such disclosures could violate data privacy laws, breach customer contracts, or compromise company trade secrets. The privacy policies of LLM AI solution providers vary and, in many instances, permit the provider to train its language models on any questions, requests, or data submitted to it.

Accuracy & Quality Control Risks:

LLM AI relies upon algorithms that are trained on limited datasets to generate content. There is a significant risk that LLM AI may generate inaccurate, unreliable, or completely false information, known as hallucinations. Staff members should exercise extreme caution when relying on LLM AI generated content and always review and edit responses for accuracy before utilizing any content.

Intellectual Property Risks:

To the extent that staff members utilize LLM AI to generate any content or code, that content may not be protected by copyright law in many jurisdictions because it lacks human authorship. As of March 2023, guidance from the United States Copyright Office states that material generated by AI without human authorship is not copyrightable.

Source: https://www.federalregister.gov/documents/2023/03/16/2023-05321/copyright-registration-guidance-works-containing-material-generated-by-artificial-intelligence

Since LLM AI generated content is based on previous training datasets, the content may be considered a derivative work of any copyrighted materials used to train the LLM AI.
To the extent that code, financial data, other trade secrets, or confidential information are submitted to a public LLM AI for analysis, there is a risk that other users and companies that utilize that same LLM AI may be able to access and disclose that sensitive information.
Any software code submitted to or received from LLM AI, such as ChatGPT, may include material derived from open-source code, which may be subject to various open-source license obligations and requirements such as:

The redistribution of open-source code
Limitations on the commercial use of open-source code
Attribution of the original author of the open-source code

Bias & Objectionable Content Risks:

LLM AI may produce biased, discriminatory, offensive, or unethical content.
Furthermore, LLM AI may produce content that does not align with the company’s mission, vision, values, and policies.

Data Security Risks:

LLM AI may store and process sensitive data, which could be at risk of being accessed by unauthorized parties, unintentionally leaked, breached, or hacked through various means, such as prompt injection attacks.

🔥Cybersecurity Threats Posed by LLM AI

The proliferation of easily accessible LLM AI like ChatGPT also poses a threat to organizations because it reduces the barrier to entry for malicious actors to attempt more sophisticated attacks.

There are already widespread reports of ChatGPT being utilized to help write code for malware as well as to write more eloquent social engineering emails. All in all, this means that organizations will need to adapt to a growing number of increasingly sophisticated attacks attempted against their information systems and employees.

Organizations will need to implement further staff training, update company policies/processes, and implement additional technical controls to reinforce a more robust defense-in-depth posture to help mitigate these new threats.

📝What Should a ChatGPT or LLM AI Usage Policy Contain?

Organizations should define a comprehensive LLM AI usage policy or update their already existing Acceptable Use Policies (AUP) to provide specific guidance to staff on how to safely leverage LLM AI like ChatGPT.

While organizations could try to implement a total ban on LLM AI, doing so is likely impractical given the proliferation of LLM AI, and it would require organizations to pass on all of the potential benefits of leveraging LLM AI.

When creating a policy for LLM AI, most organizations should take a mid-line stance of permitting the responsible and restricted use of LLM AI. When organizations define their policy, they should consider the previously noted risks, such as data confidentiality, privacy, quality control, intellectual property, objectionable content, and data security risks.

Effective technical safeguards that align with the organization’s LLM AI policy should also be deployed; for example, placing an easily accessible shortcut to the approved LLM AI solution on each staff workstation and blocking outbound traffic to prohibited LLM AI solutions like ChatGPT.
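As a rough illustration of the outbound-blocking idea, the sketch below shows the kind of decision a web proxy or DNS filter would make for each requested hostname. Everything in it is an assumption for illustration: the domain lists are placeholders (the approved hostname is entirely made up), the function names are hypothetical, and a real deployment would configure this in the organization's existing firewall, proxy, or DNS filtering product instead of custom code.

```python
# Hypothetical policy data: placeholder values, not recommendations.
APPROVED_LLM_DOMAINS = {"llm.internal.example.com"}
PROHIBITED_LLM_DOMAINS = {"chat.openai.com"}


def _matches(hostname, domains):
    """True if hostname equals a listed domain or is a subdomain of one."""
    host = hostname.lower().rstrip(".")
    return any(host == d or host.endswith("." + d) for d in domains)


def outbound_decision(hostname):
    """Return 'allow', 'block', or 'default' for an outbound request."""
    if _matches(hostname, APPROVED_LLM_DOMAINS):
        return "allow"
    if _matches(hostname, PROHIBITED_LLM_DOMAINS):
        return "block"
    # Not an LLM-related rule; fall through to the normal egress policy.
    return "default"


for host in ("llm.internal.example.com", "chat.openai.com", "example.org"):
    print(host, "->", outbound_decision(host))
```

Matching on the exact domain or a dot-separated suffix, not on a plain substring, keeps a lookalike hostname such as chat.openai.com.attacker.example from being treated as the listed domain.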

We’ve written A Sample Company Policy For Large Language Model Artificial Intelligence (LLM AI), available under a Creative Commons Attribution 4.0 International License, for organizations to utilize as a starting point for defining a comprehensive LLM AI usage policy.

Copyright 2023 Enclave Regenerous. Unless otherwise stated, all of our work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Simply put, please share it, provide attribution and if you remix it then share generously with others. The work of others that is featured on this site is always provided with attribution and is not directly monetized.

Disclaimers:

The opinions expressed here are our own and do not reflect the views of our organization or anyone else unless quoted verbatim.

We try our best to provide helpful insight to folks, but there is no warranty as to the completeness of anything we create or post here, so please be sure to always do your own research.