Notion AI security practices


At Notion, we want to be transparent with our customers about our AI products. Here's an outline of their functionality and privacy practices 🔒

The Notion AI Add-On currently includes*:

  • Writer: Generate or modify text on a page by writing a custom prompt, or selecting a pre-configured prompt.

  • Autofill: Generate text across many pages in a database simultaneously by writing a custom prompt, or selecting a pre-configured prompt.

  • Q&A: Get instant answers to your questions, using information from across your Notion workspace.

Notion AI features appear seamlessly in your workspace, but leverage technology from several AI subprocessors to provide you with the service. Go here to see a complete list of our current subprocessors →

*Notion AI will expand to include more features over time.

Who are Notion’s Large Language Model Providers?

Notion currently uses large language models (LLMs) provided by Anthropic and OpenAI, as well as a Notion-hosted model provided by Cohere. Cohere does not store any Customer Data. We continuously evaluate LLM providers and their models to provide the highest quality experience to our Notion AI users. Any third parties that will store Customer Data will be published on our Subprocessor Page.

How do I subscribe to new Subprocessor notifications?

Customers may sign up to receive notification of new Subprocessors by e-mailing with the subject “Subscribe to New Subprocessors.” Once a customer has signed up to receive new Subprocessor notifications, Notion will then provide that customer with notice of any new Subprocessors before authorizing the new Subprocessor to process Customer Data. For additional information, please see our Data Processing Addendum.

How do Writer & Autofill work?


When you interact with the writing assistant or set up an autofill property, several steps occur in the background:

  1. Notion receives a prompt from a user.

  2. Data relevant to the prompt is sent to one of our AI LLM Subprocessors, which produces an output and sends it back to Notion.

  3. Notion then processes the LLM’s output so that it adheres to the right format and language and displays the output to the user.

How is your data protected?

  • When sending data to our AI LLM Subprocessors, the data is encrypted in transit using TLS 1.2 or greater, and no Customer Data is used to train the model.

  • All our AI LLM Subprocessors only retain data for 30 days or less before deletion.

  • Only data the user has access to on the specific page where the AI Writer or Autofill is used will be sent to AI LLM Subprocessors to generate the output, meaning that the generated outputs provided to the user will not incorporate any data that the user did not already have access to.
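
To make the "TLS 1.2 or greater" requirement concrete, here is a minimal sketch of how a client can refuse older protocol versions using Python's standard `ssl` module. It illustrates the general technique only and is not Notion's implementation; the function name `make_tls_context` is hypothetical.

```python
import ssl


def make_tls_context() -> ssl.SSLContext:
    """Build a client-side context that rejects anything older than TLS 1.2.

    Hypothetical helper for illustration; not taken from Notion's code.
    """
    # The default context already verifies certificates and hostnames.
    context = ssl.create_default_context()
    # Refuse to negotiate SSLv3, TLS 1.0, or TLS 1.1.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = make_tls_context()
```

A context like this can be passed to standard-library clients, for example `urllib.request.urlopen(url, context=context)`, so that a connection to a server offering only older protocol versions fails during the handshake.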

How does Q&A work?

Notion Q&A works in two key phases:

  1. Creating embeddings

  2. Generating responses

What are embeddings?

Embeddings are numerical representations of text or documents. These representations capture the meaning and context of the text in a multidimensional space, where similar topics have similar numerical representations. By using embeddings, vector search algorithms can efficiently compare and find similarities between different pieces of text or documents. In the case of Notion AI's Q&A feature, embeddings are created from workspace content to enable the system to provide accurate and relevant responses to user questions.
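
As a rough illustration of how similar topics end up with similar numerical representations, the sketch below compares hand-made three-dimensional vectors using cosine similarity, a common measure in vector search. The vectors and phrases are invented for this example; real embeddings are produced by a model and have many more dimensions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Invented toy vectors, for illustration only.
toy_embeddings = {
    "vacation policy": [0.9, 0.1, 0.0],
    "time off requests": [0.8, 0.2, 0.1],
    "database backups": [0.0, 0.2, 0.9],
}

query = toy_embeddings["vacation policy"]

# Rank the other phrases from most to least similar to the query.
ranked = sorted(
    (name for name in toy_embeddings if name != "vacation policy"),
    key=lambda name: cosine_similarity(query, toy_embeddings[name]),
    reverse=True,
)
```

In this toy data, "time off requests" ranks above "database backups" because its vector points in nearly the same direction as the query vector.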

[Image: an example of an embedding from OpenAI]


How are embeddings created?

  1. Workspace content is sent to OpenAI to create embeddings.

  2. Notion receives embeddings from OpenAI and stores them in a vector database hosted by Pinecone, which powers the ability to provide responses to questions.


How are embeddings used to generate responses?

  1. Notion receives a question from a user.

  2. The question is passed to one of our LLM Subprocessors to be rephrased for optimal responses.

  3. The rephrased question is passed to Pinecone, where a list of relevant pages is found.

  4. Notion sends the question, along with the pages identified by Pinecone, to a Notion-hosted LLM, where the pages are refined and ranked by relevance.

  5. The question, refined list of pages, and ranking of pages are processed by our LLM Subprocessors.

  6. Notion processes the output to adhere to the right format and language and displays the output to the user.
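
The six steps above can be sketched as a small pipeline. Everything in this sketch is a stand-in: the function names are hypothetical, word overlap takes the place of a real vector search, and simple string handling takes the place of the LLM calls. It shows only the order of the stages, not how Notion implements them.

```python
from dataclasses import dataclass


@dataclass
class Page:
    title: str
    text: str


def rephrase_question(question: str) -> str:
    """Step 2 stand-in: a real system would ask an LLM to rephrase."""
    return question.strip().rstrip("?").lower()


def search_pages(query: str, pages: list[Page], limit: int = 3) -> list[Page]:
    """Step 3 stand-in: word overlap replaces a real vector search."""
    words = set(query.split())
    scored = [(len(words & set(p.text.lower().split())), p) for p in pages]
    matches = [(score, p) for score, p in scored if score > 0]
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in matches[:limit]]


def rerank_pages(query: str, candidates: list[Page], keep: int = 2) -> list[Page]:
    """Step 4 stand-in: a real system would use a ranking model."""
    return candidates[:keep]


def generate_answer(question: str, pages: list[Page]) -> str:
    """Step 5 stand-in: a real system would send question and pages to an LLM."""
    if not pages:
        return "no relevant pages found"
    return "answer drawn from: " + ", ".join(p.title for p in pages)


def answer_question(question: str, pages: list[Page]) -> str:
    query = rephrase_question(question)        # step 2
    candidates = search_pages(query, pages)    # step 3
    ranked = rerank_pages(query, candidates)   # step 4
    draft = generate_answer(question, ranked)  # step 5
    return draft[0].upper() + draft[1:] + "."  # step 6: final formatting


workspace = [
    Page("Holiday calendar", "office holiday dates for this year"),
    Page("Expense policy", "how to submit an expense report"),
    Page("Onboarding", "first week checklist for new hires"),
]
result = answer_question("How do I submit an expense report?", workspace)
```

Keeping each stage in its own function mirrors the description above, where different stages are handled by different services.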


How are embeddings protected?

Despite embeddings being a numerical representation of Customer Data, Notion still treats embeddings with the same level of security and privacy considerations as Customer Data. All our Customer Data commitments outlined in our Master Service Agreement (MSA) and Data Processing Agreements (DPA) apply to embeddings.

We store embeddings with Pinecone. Pinecone has been vetted by our security team and has obtained SOC 2 Type II certification from an external auditor. Learn more about Pinecone’s security here →

Does Notion AI respect existing permissions?

Yes, Notion AI honors existing permissions. Users will not be able to generate content or receive Q&A responses based on resources they do not have access to.
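
One common way to enforce this kind of guarantee is to filter candidate content against the requesting user's permissions before anything reaches a model. The sketch below shows that general idea only. The function name, the data layout, and the point at which filtering happens are assumptions made for illustration, not a description of Notion's internals.

```python
def pages_visible_to(
    user_id: str,
    candidate_page_ids: list[str],
    access: dict[str, set[str]],
) -> list[str]:
    """Keep only the candidate pages the user is allowed to read.

    `access` maps a page ID to the set of user IDs with access
    (hypothetical layout). Pages missing from the map are treated as
    inaccessible, so the filter fails closed.
    """
    return [
        page_id
        for page_id in candidate_page_ids
        if user_id in access.get(page_id, set())
    ]


# Invented example data.
access = {
    "roadmap": {"ana", "ben"},
    "salaries": {"ana"},
}
visible = pages_visible_to("ben", ["roadmap", "salaries", "unlisted"], access)
```

Here only "roadmap" remains for the user "ben", so a response generated from the filtered list could not draw on the other two pages.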

How is Customer Data protected when sent to AI Subprocessors?

Notion AI is designed to protect your Customer Data, and prevent information leaks to other users of the service.

Prior to engaging any third-party Subprocessor or vendor, Notion evaluates their privacy, security, and confidentiality practices, and executes an agreement implementing its applicable security, privacy, and legal obligations. All Subprocessors are monitored and reviewed at least annually to ensure continued compliance with Notion’s requirements for Subprocessors. This includes reviewing documents such as attestation reports, penetration tests, and other artifacts based on the Subprocessor’s criticality and other risk factors. As part of onboarding and ongoing reviews, vendors are required to complete questionnaires. Significant public security events are also assessed to reduce supply chain risk.

When we send your Customer Data to third parties, it’s encrypted in transit using TLS 1.2 or greater.

For more information about how Notion processes your data, please refer to the Data Processing Addendum.

Will our data be used to train any models?

We have contractual agreements with our AI Subprocessors that prohibit the use of Customer Data to train their models.

Your use of Notion AI does not grant Notion any right or license to your Customer Data to train our machine learning models.

How is Customer Data segregated?

Individual customer accounts are kept separate in our production environment. We do not mix or process together data from different customers during AI processing. This means we do not expose your data to other Notion customers.

What are the data retention obligations of third-party AI providers?

Notion AI Subprocessors have data retention policies that allow Notion to meet our obligations to customers for the processing of data.

When using Notion AI Writer, Autofill and Q&A, OpenAI and Anthropic only retain Customer Data for 30 days or less before deletion. Notion's Q&A product is additionally powered by OpenAI's embeddings; OpenAI does not retain any Customer Data through their embeddings service.

Embeddings stored in Pinecone are deleted within 60 days from when the page or workspace is deleted.

If a user deletes a Notion page or Notion workspace, we can restore the content within 30 days. After 30 days, the data is deleted and unrecoverable. This includes any AI-generated data and embeddings. For more detail about deleting and restoring your data, please refer to this page in our Help Center →
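
As a worked example of the time frames above, the sketch below computes the relevant dates for a page deleted on a given day. The function and key names are hypothetical, and treating each window as a simple count of calendar days from the deletion date is an assumption; the exact boundaries are governed by Notion's agreements, not by this sketch.

```python
from datetime import date, timedelta

# Figures taken from the text above.
RESTORE_WINDOW = timedelta(days=30)
EMBEDDING_DELETION_LIMIT = timedelta(days=60)


def retention_dates(deleted_on: date) -> dict[str, date]:
    """Return the last restore date and the latest embedding deletion date.

    Assumes both windows are counted in calendar days from `deleted_on`.
    """
    return {
        "restorable_until": deleted_on + RESTORE_WINDOW,
        "embeddings_deleted_by": deleted_on + EMBEDDING_DELETION_LIMIT,
    }


dates = retention_dates(date(2024, 1, 1))
```

Under these assumptions, a page deleted on 1 January 2024 could be restored until 31 January 2024, and its embeddings would be removed from the vector database by 1 March 2024 at the latest.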

What compliance standards does Notion AI meet?

Notion AI is included in the scope of Notion’s SOC 2 Type 2 report and ISO 27001 certification, demonstrating our commitment to various regulatory and industry standards.

We are actively working to enable Notion AI to meet HIPAA requirements by using LLM providers’ zero-retention APIs, which would allow for the processing of protected health information (PHI).

Can data loss prevention (DLP) be configured to alert for data being used by Notion AI?

Customers on our Enterprise plan can trigger data loss prevention (DLP) alerts for sensitive content in their Notion workspace using third-party integration partners. This includes content in an AI prompt and content generated by AI. Learn more about our DLP integration here →

Is it possible to prevent data from being sent to Notion AI Subprocessors?

If you’re a workspace owner in an Enterprise Plan workspace, you can prevent data from being sent to AI Subprocessors in Settings & members, by turning off the Notion AI feature toggle. You’ll need to be on desktop to do this.

Turning off this functionality in your workspace will not delete any of your existing data, including any content that may have been generated by Notion AI while the feature was turned on. Everyone in your workspace, including workspace owners, membership admins, and members, will no longer have access to Notion AI features in the workspace.

Are there rules against what I can do with Notion AI?

The Notion AI Supplementary Terms apply to your usage of Notion AI. In addition, Notion’s Content & Use Policy applies to any content on Notion, including content generated by Notion AI. Violating these terms can result in removal of your content, or suspension of access to your workspace.

Who owns the rights to content generated by Notion AI?

Notion does not claim ownership of your input or the generated output. This is addressed in the Notion AI Supplementary Terms in the "Input and Output" section:

You may provide input to be processed by Notion AI (“Input”), and receive output generated and returned by Notion AI based on the Input (“Output”). When you use Notion AI, Input and Output are your Customer Data.

You may also wish to reference our standard data protection practices.
