Support Wiki
Comprehensive documentation for the Loose Lips Sink Ships Chrome extension. Covers installation, configuration, enterprise deployment, and every setting available.
Getting Started
Loose Lips Sink Ships is a Chrome browser extension that detects and redacts personally identifiable information (PII) in text, documents, and images. It is designed for Australian government and enterprise environments where PII must never leave the browser unredacted.
Requirements
- Google Chrome 128+ (Chrome 136+ recommended for best Gemini Nano stability)
- Gemini Nano (optional): enable in chrome://flags for on-device AI detection
- Tesseract.js + pdf.js (optional): run vendor/setup.sh for PDF rendering and image OCR
- Azure subscription (optional): only needed for Azure PII or Purview DLP integration
Installation
From Chrome Web Store
- Visit the Chrome Web Store listing
- Click Add to Chrome
- Confirm the permissions dialog
- The extension icon appears in your toolbar
Developer / Side-load Install
- Clone the repository:
git clone https://github.com/bmaidev/looselipssinkships.extension.git
- Run vendor/setup.sh to download Tesseract.js and pdf.js dependencies
- Open chrome://extensions and enable Developer mode
- Click Load unpacked and select the repository root folder
- Verify the extension icon appears in your toolbar
Enable Gemini Nano (Optional)
- Navigate to chrome://flags
- Enable #optimization-guide-on-device-model (set to Enabled BypassPerfRequirement)
- Enable #prompt-api-for-gemini-nano (set to Enabled)
- Restart Chrome
- The green dot in the extension popup confirms Nano is available
Quick Start
After installation, the extension is active on all pages in Auto mode. Here is what happens by default:
- A green Redact button appears on text fields and textareas
- On AI chat sites (ChatGPT, Claude, Gemini, etc.), auto-redact and submit interception are active
- Click the toolbar icon to see the popup with quick stats and detection status
- Open the Settings page from the popup to configure providers, site filtering, and more
Detection Modes
The detectionMode setting controls which engines are used and in what order. The extension supports five modes, each with a different privacy and capability trade-off.
Auto (Default)
Progressive escalation through all available tiers: Local → Hosted → Cloud.
Fallback chain
- Gemini Nano (on-device AI) — tried first if enabled and available
- Regex + Vector — always-available local fallback; runs if Nano is unavailable or fails
- Hosted models — tried if configured and the above steps produced no result
- Cloud providers (Azure PII, Purview) — tried only if no hosted provider could handle the request
If regexAfterAi is enabled (belt-and-suspenders mode), regex also runs as a second pass after Nano to catch anything the AI missed.
Best for: Most deployments. Maximises privacy by preferring local processing, but can escalate when higher-quality detection is needed.
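The escalation order above can be sketched roughly as follows. This is an illustration only, not the extension's actual code: the provider shape ({ name, available, detect }) and the function names are hypothetical, and "produced no result" is assumed here to mean a failed call or an empty match list.

```javascript
// Assumption: a provider "produced a result" when it returns a non-empty array.
const hasResult = (r) => Array.isArray(r) && r.length > 0;

// Runs one provider; an unavailable or failing provider counts as "no result".
function tryProvider(provider, text) {
  if (!provider || provider.available === false) return null;
  try {
    return provider.detect(text);
  } catch (err) {
    return null;
  }
}

// Hypothetical sketch of the Auto chain: Nano -> Regex -> Hosted -> Cloud.
function runAutoChain(text, { nano, regex, hosted = [], cloud = [] }, regexAfterAi = false) {
  const nanoMatches = tryProvider(nano, text);
  if (hasResult(nanoMatches)) {
    // regexAfterAi: second regex pass after Nano (results are not de-duplicated here).
    const extra = regexAfterAi ? tryProvider(regex, text) || [] : [];
    return { provider: nano.name, matches: nanoMatches.concat(extra) };
  }

  const regexMatches = tryProvider(regex, text);
  if (hasResult(regexMatches)) return { provider: regex.name, matches: regexMatches };

  // Hosted providers are tried before cloud providers.
  for (const provider of [...hosted, ...cloud]) {
    const matches = tryProvider(provider, text);
    if (hasResult(matches)) return { provider: provider.name, matches };
  }
  return { provider: null, matches: [] };
}
```

Keeping the order in a single function makes it easy to see that cloud providers are only reached after every local and hosted option has been tried.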
AI Only
Uses only AI-capable providers. Will not fall back to regex.
Fallback chain
- Gemini Nano (local AI)
- Hosted AI models (organisation-hosted LLMs with text capability)
- Cloud AI (Azure PII, which uses NLP-based entity recognition)
Throws an error if no AI provider is available.
Best for: Environments where regex false positives are unacceptable and AI-quality detection is required.
Regex Only
Uses only the built-in regex pattern engine. No data leaves the device under any circumstances.
Fallback chain
- Regex engine — the sole provider; no escalation
Best for: Air-gapped networks, highest-security environments, or when deterministic pattern matching is preferred over probabilistic AI.
Hosted Only
Routes all text through organisation-hosted models. Falls back to regex if no hosted provider is available.
Fallback chain
- Hosted models (e.g., on-premises LLM)
- Regex engine (fallback if no hosted provider is reachable)
Best for: Organisations that operate their own AI infrastructure and want all detection to go through their own endpoints.
Cloud Only
Sends text directly to cloud providers. Falls back to regex if no cloud provider is available or configured.
Fallback chain
- Cloud providers (Azure PII, Purview)
- Regex engine (fallback if cloud is unavailable)
Best for: Environments where Azure PII accuracy is paramount and cloud data transmission is acceptable. Note that text is sent to external Azure endpoints.
Providers
Providers are organised into three tiers: Local, Hosted, and Cloud. The pipeline always prefers the highest-privacy tier first.
Gemini Nano (Local)
Chrome's built-in on-device language model. Runs entirely within the browser with no network requests.
- Tier: Local
- Requires: Chrome 128+, flags enabled (see Installation)
- Capabilities: Text PII detection via prompt-based entity extraction
- Session timeout: Configurable via sessionTimeoutMinutes (default 5 min)
- Limitations: Smaller context window than cloud models; may miss complex or ambiguous PII
Regex Engine (Local)
Pattern-based detection using the 17 built-in Australian PII type definitions. Always available as a fallback.
- Tier: Local
- Requires: Nothing — always available
- Capabilities: Text PII detection with character-position-accurate matches
- Strengths: Deterministic and fast; reliably matches well-structured PII (phone numbers, TFNs, etc.) that follows the expected formats
- Limitations: Cannot detect PII by context alone (e.g., an unstructured name in prose)
Vector Names (Local)
Cosine-similarity name detection using a vector embedding approach. Runs alongside regex when enableVectorFallback is true.
- Tier: Local
- Threshold: Configurable via vectorThreshold (default 0.45, range 0–1)
- Strengths: Catches personal names that regex cannot match (fuzzy matching)
- Limitations: Higher false-positive rate than regex; threshold tuning may be required
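As a rough illustration of how a cosine-similarity threshold works, the sketch below compares a candidate vector with a set of reference name vectors. How the extension builds its embeddings is not documented here, so the vectors and the function names (cosineSimilarity, isLikelyName) are hypothetical.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) throw new Error("Vectors must have the same length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // a zero vector has no direction
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical check: flag a candidate when it is close enough to any reference vector.
function isLikelyName(candidateVec, referenceVecs, vectorThreshold = 0.45) {
  return referenceVecs.some((ref) => cosineSimilarity(candidateVec, ref) >= vectorThreshold);
}
```

Raising vectorThreshold makes the check stricter (fewer false positives, more missed names); lowering it does the opposite.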
Azure PII (Cloud)
Azure AI Language PII detection service. Supports both API key (Direct) and OAuth2 (Enterprise) authentication.
Direct Mode (API Key)
| Field | Description |
|---|---|
| endpoint | Azure Language resource endpoint URL (e.g., https://myresource.cognitiveservices.azure.com) |
| apiKey | API key from the Azure portal (stored in chrome.storage.local) |
| apiVersion | API version string (default 2023-04-01) |
| language | Language code (default en) |
| domain | Detection domain: none (general) or phi (health) |
| categoriesFilter | Array of entity categories to detect (empty = all) |
Enterprise Mode (OAuth2)
| Field | Description |
|---|---|
| connectionMode | Set to enterprise |
| tenantId | Azure AD tenant ID (GUID) |
| clientId | App Registration client ID (GUID) |
| endpoint | Azure Language resource endpoint URL |
In Enterprise mode, the user signs in via Azure AD popup (PKCE flow). No API key is stored. See Azure OAuth2 for full setup.
Native Document Redaction
When nativeDocEnabled is true, Azure handles OCR, text extraction, PII detection, and document reconstruction server-side. Supports PDF, DOCX, and TXT files.
| Field | Description |
|---|---|
| sourceContainerUrl | Blob storage URL for uploads. Direct: URL with SAS (read+list). Enterprise: plain URL. |
| targetContainerUrl | Blob storage URL for results. Direct: URL with SAS (write+list). Enterprise: plain URL. |
| redactionPolicy | characterMask, entityMask, or noMask |
Purview DLP (Cloud)
Microsoft Purview Data Loss Prevention via the Microsoft Graph API. Provides sensitivity label evaluation and DLP policy checking.
| Field | Description |
|---|---|
| tenantId | Azure AD tenant ID |
| clientId | App Registration client ID |
| clientSecret | Client secret (for application auth) |
| accessToken | Pre-obtained access token (alternative to client secret) |
| graphEndpoint | Graph API endpoint (default https://graph.microsoft.com/beta) |
| useDelegatedAuth | Use delegated (user) auth instead of application auth |
| dlpPolicyId | Specific DLP policy ID to evaluate against |
| sensitiveInfoTypes | Array of sensitive information type IDs to detect |
The Purview provider also powers the Classification Gate.
Hosted Models (Hosted)
Organisation-hosted LLMs or OCR endpoints. Multiple instances can be configured. The extension connects to your own infrastructure.
| Field | Description |
|---|---|
| providerId | Unique identifier (e.g., agency_llm) |
| displayName | Human-readable name shown in UI |
| endpoint | API base URL (e.g., https://internal-llm.agency.gov.au/v1) |
| apiKey | API key for authentication |
| apiMode | openai (chat completions API) or ocr_extract (OCR endpoint) |
| model | Model identifier (e.g., llama-3-70b) |
| maxTokens | Maximum tokens for response (default 4096) |
| temperature | Sampling temperature (default 0.0 for deterministic output) |
| timeout | Request timeout in milliseconds (default 30000) |
| customHeaders | Additional HTTP headers as key-value object |
| capabilities | Array of capabilities: ["text"], ["text","document"] |
| priority | Numeric priority (higher = tried first, default 30) |
// Example hosted provider configuration
{
"enabled": true,
"providerId": "agency_llm",
"displayName": "Agency LLM (On-Prem)",
"endpoint": "https://internal-llm.agency.gov.au/v1",
"apiKey": "",
"apiMode": "openai",
"model": "llama-3-70b",
"maxTokens": 4096,
"temperature": 0.0,
"timeout": 30000,
"customHeaders": {},
"capabilities": ["text"],
"priority": 30
}
PII Types
The extension ships with 17 built-in PII type definitions optimised for Australian data formats. Each type has an ID, display name, regex pattern, redaction token, and category.
| ID | Name | Token | Category | Description |
|---|---|---|---|---|
| email | Email Address | [REDACTED_EMAIL] | Contact | Standard email addresses (user@domain.tld) |
| phone_au | Phone Number (AU) | [REDACTED_PHONE] | Contact | Australian phone numbers: 04xx, (0x) xxxx xxxx, +61 |
| phone_intl | Phone Number (Intl) | [REDACTED_PHONE] | Contact | International phone numbers with country code |
| tfn | Tax File Number | [REDACTED_TFN] | Financial | Australian 8-9 digit tax file number |
| abn | Australian Business Number | [REDACTED_ABN] | Financial | 11-digit ABN |
| acn | Australian Company Number | [REDACTED_ACN] | Financial | 9-digit ACN (with ACN prefix) |
| medicare | Medicare Number | [REDACTED_MEDICARE] | Health | Australian Medicare card number (10-11 digits) |
| credit_card | Credit Card Number | [REDACTED_CARD] | Financial | Visa, Mastercard, Amex card numbers |
| bsb | BSB Number | [REDACTED_BSB] | Financial | Bank state branch number (6 digits with dash) |
| bank_account | Bank Account Number | [REDACTED_ACCOUNT] | Financial | 6-10 digits following account/acct context |
| passport | Passport Number | [REDACTED_PASSPORT] | Identity | Australian passport (letter + 7 digits) and common formats |
| drivers_licence | Driver Licence Number | [REDACTED_LICENCE] | Identity | Australian driver licence numbers (varies by state) |
| dob | Date of Birth | [REDACTED_DOB] | Identity | Common date formats with DOB context |
| street_address | Street Address | [REDACTED_ADDRESS] | Contact | Australian-style street addresses (number + street name + type) |
| postcode_au | Australian Postcode | [REDACTED_POSTCODE] | Contact | 4-digit postcode when preceded by state abbreviation |
| ip_address | IP Address (v4) | [REDACTED_IP] | Technical | IPv4 addresses |
| ipv6_address | IPv6 Address | [REDACTED_IP] | Technical | IPv6 addresses |
Custom PII Types
Add your own PII types via the Settings page or managed configuration. Custom types are appended to the built-in list.
// Example custom type in settings
{
"customPiiTypes": [
{
"id": "custom_employee_id",
"name": "Employee ID",
"description": "Agency employee identifiers (EMP-XXXXX)",
"token": "[REDACTED_EMPID]",
"enabled": true,
"regex": "EMP-\\d{5}",
"flags": "gi",
"category": "custom"
}
]
}
Requirements for custom types:
- id: unique string identifier (auto-generated as custom_ + name if not provided)
- name: human-readable display name
- token: redaction replacement text, must be in [BRACKETS]
- regex: valid JavaScript regular expression (tested before saving)
- flags: regex flags (typically gi for global case-insensitive)
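The requirements above could be checked along these lines before a custom type is saved. This is a sketch under assumptions rather than the extension's own validator: the function name normaliseCustomType, the slug format used for auto-generated ids, and the default of gi when flags is omitted are all hypothetical.

```javascript
// Hypothetical validator for one custom PII type definition.
function normaliseCustomType(type) {
  const errors = [];

  if (!type.name) errors.push("name is required");

  // token must be wrapped in square brackets, e.g. [REDACTED_EMPID]
  if (!/^\[.+\]$/.test(type.token || "")) errors.push("token must be in [BRACKETS]");

  // Assumption: fall back to "gi" when no flags are given.
  const flags = type.flags || "gi";
  if (typeof type.regex !== "string" || type.regex === "") {
    errors.push("regex is required");
  } else {
    try {
      new RegExp(type.regex, flags); // throws SyntaxError on an invalid pattern or flags
    } catch (err) {
      errors.push("regex is not valid: " + err.message);
    }
  }

  // id is auto-generated as custom_ + name when missing (slug format is an assumption).
  const slug = String(type.name || "").toLowerCase().replace(/[^a-z0-9]+/g, "_");
  const id = type.id || "custom_" + slug;

  return { valid: errors.length === 0, errors, type: { ...type, id, flags } };
}
```

For example, a definition named "Employee ID" with no id would come back with the id custom_employee_id under this slug rule.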
Type Overrides
Override properties of built-in types without replacing them entirely. Useful for disabling specific types or changing their tokens.
// Disable TFN detection and change email token
{
"piiTypeOverrides": {
"tfn": { "enabled": false },
"email": { "token": "[EMAIL_REMOVED]" }
}
}
Overrides are merged at load time. The id field can never be overridden.
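A minimal sketch of the merge described above, assuming built-in types are held as an array of objects keyed by id. The function name applyTypeOverrides is hypothetical; the only behaviour taken from this page is that override properties replace built-in ones and that id is never overridden.

```javascript
// Hypothetical merge of piiTypeOverrides onto the built-in type list.
function applyTypeOverrides(builtInTypes, piiTypeOverrides = {}) {
  return builtInTypes.map((type) => {
    // Drop any "id" key from the override so the built-in id always survives.
    const { id: _ignoredId, ...override } = piiTypeOverrides[type.id] || {};
    return { ...type, ...override };
  });
}
```

Types without an entry in piiTypeOverrides pass through unchanged.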
Text Protection
Three complementary systems protect text input across all web pages. They can be used independently or together.
Auto-Redact
Automatically scans and redacts PII as users type or paste content, without requiring a button click.
| Setting | Type | Default | Description |
|---|---|---|---|
| autoRedact.enabled | boolean | false | Master toggle. Off by default; auto-enabled on AI chat sites if aiChatGuard.autoRedactOnAiSites is true. |
| autoRedact.debounceMs | number | 2500 | Milliseconds to wait after typing pauses before scanning. |
| autoRedact.onPaste | boolean | true | Immediately redact pasted content (no debounce). |
| autoRedact.onlyOnAiSites | boolean | true | Only auto-redact on detected AI chat sites (safer default). |
Submit Guard
Intercepts form submissions and Enter-key presses to ensure redaction completes before data is sent.
| Setting | Type | Default | Description |
|---|---|---|---|
| submitGuard.enabled | boolean | true | Master toggle for submit interception. |
| submitGuard.interceptEnter | boolean | true | Intercept Enter key in textareas (common in chat UIs). |
| submitGuard.interceptFormSubmit | boolean | true | Intercept HTML form submit events. |
| submitGuard.interceptSendButtons | boolean | true | Intercept click events on send/submit buttons. |
| submitGuard.onlyOnAiSites | boolean | true | Only active on detected AI chat sites by default. |
Upload Guard
Intercepts file input changes to scan and redact documents before they are uploaded.
| Setting | Type | Default | Description |
|---|---|---|---|
| uploadGuard.enabled | boolean | true | Master toggle for upload interception. |
| uploadGuard.onlyOnAiSites | boolean | true | Only active on detected AI chat sites by default. |
| uploadGuard.confirmBeforeUpload | boolean | true | Show a confirmation dialog before allowing upload after redaction. |
AI Chat Guard
The AI Chat Guard is a specialised protection layer that activates automatically when you visit any known AI chat service. It ensures PII is redacted before text can be submitted to AI platforms.
How It Works
When you navigate to a page matching a known AI chat domain:
- The extension detects the AI service via URL pattern matching against 30+ built-in domains
- A protection banner is injected at the top of the page indicating the guard is active
- Auto-redact activates on all text fields (if autoRedactOnAiSites is enabled)
- Submit interception blocks form submission and Enter key until redaction completes
- Upload interception scans files before they can be attached
Services Detected
The following AI chat services are detected automatically:
| Service | Domains |
|---|---|
| OpenAI / ChatGPT | chat.openai.com, chatgpt.com, *.openai.com |
| Anthropic / Claude | claude.ai, *.anthropic.com |
| Google Gemini | gemini.google.com, aistudio.google.com, bard.google.com, notebooklm.google.com |
| Microsoft Copilot | copilot.microsoft.com, copilot.cloud.microsoft, *.copilot.microsoft.com, bing.com/chat, edgeservices.bing.com |
| X / Grok | grok.x.ai, x.com/i/grok |
| Meta AI | meta.ai, *.meta.ai |
| Perplexity | perplexity.ai, *.perplexity.ai |
| Mistral | chat.mistral.ai, *.mistral.ai |
| HuggingChat | huggingface.co/chat |
| Cohere | coral.cohere.com, dashboard.cohere.com |
| DeepSeek | chat.deepseek.com, *.deepseek.com |
| Poe | poe.com |
| You.com | you.com |
| Pi | pi.ai |
| Character.AI | character.ai, *.character.ai |
| Replika | replika.ai |
| Jasper | app.jasper.ai |
| Inflection | inflection.ai |
| Local / Self-hosted | localhost, 127.0.0.1 |
Custom Domains
Add additional domains to the AI chat guard via the aiChatGuard.customAiDomains array:
{
"aiChatGuard": {
"customAiDomains": [
"internal-ai.agency.gov.au",
"*.llm.myorg.com"
]
}
}
Protection Banner
When the AI Chat Guard is active, a banner is injected at the top of the page. The banner:
- Indicates that PII protection is active
- Is controlled by the aiChatGuard.showAiBanner setting (default true)
- Cannot be dismissed by the page's JavaScript (injected via content script)
Document Processing
The extension processes 50+ file formats. Documents are handled by format-specific processors with a shared pipeline.
DOCX (Word)
DOCX files are ZIP archives containing XML. The extension:
- Unzips the DOCX container
- Parses the XML content (word/document.xml)
- Runs PII detection on extracted text
- Performs in-place XML replacement: PII tokens replace original text within the XML nodes
- Re-zips to produce a valid .docx file with formatting preserved
DOCX redaction preserves fonts, styles, tables, headers, footers, and other formatting. The output is a fully valid Word document.
XLSX / CSV
XLSX: Extracted as XML (similar to DOCX), converted to CSV, and redacted as text. The output is a redacted CSV file (format conversion preserves data but not Excel formatting).
CSV: Processed as plain text with PII detection applied to cell contents.
PDF
PDFs are handled through multiple paths depending on the document type:
Text-based PDFs
- pdf.js renders each page in an offscreen document
- Text is extracted with character positions
- PII detection runs on the extracted text
- If bounding boxes are available, black rectangles are drawn over PII regions
- Pages are flattened to images to prevent hidden text layer extraction
Scanned PDFs (OCR)
- pdf.js renders each page as a canvas image
- Tesseract.js OCR extracts text with word-level bounding boxes
- PII detection maps matches to OCR bounding boxes
- Black rectangles are drawn over PII regions at the pixel level
- Redacted pages are reassembled into a new PDF
Azure Native Document (Cloud)
When nativeDocEnabled is true, PDFs can be uploaded to Azure for server-side processing:
- File is uploaded to Azure Blob Storage (source container)
- Azure Language API processes the document (OCR + PII detection + redaction)
- Redacted file is downloaded from the target container
Images
PNG, JPEG, GIF, WebP, BMP, and TIFF images are supported:
- Image is loaded into an offscreen canvas
- Tesseract.js extracts text with word-level bounding boxes
- PII matches are mapped to bounding box coordinates
- Black rectangles are drawn over PII regions
- Redacted image is exported as PNG
There is no hidden text layer in redacted images. PII is permanently destroyed at the pixel level.
ZIP Archives
ZIP files are processed recursively:
- Archive is extracted to enumerate all entries
- Each text-based entry is individually redacted (using regex)
- Binary entries (images, PDFs) are kept as-is
- Redacted entries are repacked into a new ZIP archive
Text-based Files
The following file types are processed as plain text with PII detection applied directly:
.txt, .csv, .json, .jsonl, .xml, .yaml, .yml, .toml, .ini, .cfg, .conf, .env, .log, .md, .rst, .adoc, .tex, .rtf, .html, .htm, .py, .rb, .php, .js, .ts, .jsx, .tsx, .java, .cs, .go, .rs, .swift, .kt, .scala, .c, .cpp, .h, .hpp, .sql, .sh, .bash, .zsh, .ps1, .r, .m, .pl, and more.
Site Filtering
Control which websites the extension activates on. Three modes are available:
| Mode | Behaviour |
|---|---|
| all (default) | Extension is active on all sites |
| whitelist | Extension is active only on listed sites |
| blacklist | Extension is active everywhere except listed sites |
If a site is filtered out (blacklisted or not whitelisted), the AI Chat Guard is also disabled on that site. The extension is fully inactive.
Pattern Syntax
Patterns support several formats:
| Pattern | Example | Matches |
|---|---|---|
| Exact domain | example.com | Only example.com |
| Wildcard subdomain | *.example.com | sub.example.com, a.b.example.com, example.com |
| Suffix match | *.gov.au | All Australian government sites |
| Full URL prefix | https://portal.agency.gov.au/* | All pages under that path |
| Dot prefix | .gov.au | Any hostname ending in .gov.au |
// Whitelist example: only government sites
{
"siteFilter": {
"mode": "whitelist",
"whitelist": [
"*.gov.au",
"*.defence.gov.au",
"https://portal.internal.agency.gov.au/*"
]
}
}
// Blacklist example: disable on internal tools
{
"siteFilter": {
"mode": "blacklist",
"blacklist": [
"jira.internal.com",
"confluence.internal.com"
]
}
}
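The pattern formats and modes above could be evaluated roughly as shown below. This is a sketch based only on the table in this section; the function names matchesPattern and isSiteActive are hypothetical and the extension's real matcher may handle more cases.

```javascript
// Hypothetical matcher for one site filter pattern against a page URL.
function matchesPattern(pattern, url) {
  const { hostname, href } = new URL(url);

  // Full URL prefix, e.g. https://portal.agency.gov.au/*
  if (pattern.includes("://")) {
    const prefix = pattern.endsWith("*") ? pattern.slice(0, -1) : pattern;
    return href.startsWith(prefix);
  }
  // Wildcard subdomain, e.g. *.example.com (also matches example.com itself)
  if (pattern.startsWith("*.")) {
    const base = pattern.slice(2);
    return hostname === base || hostname.endsWith("." + base);
  }
  // Dot prefix, e.g. .gov.au (any hostname ending in .gov.au)
  if (pattern.startsWith(".")) {
    return hostname.endsWith(pattern);
  }
  // Exact domain
  return hostname === pattern;
}

// Applies the siteFilter mode: "all", "whitelist", or "blacklist".
function isSiteActive(siteFilter, url) {
  const mode = siteFilter.mode || "all";
  if (mode === "whitelist") {
    return (siteFilter.whitelist || []).some((p) => matchesPattern(p, url));
  }
  if (mode === "blacklist") {
    return !(siteFilter.blacklist || []).some((p) => matchesPattern(p, url));
  }
  return true;
}
```

Note that the wildcard check compares against "." + base, so *.example.com does not match notexample.com.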
Standalone Tool (redact.html)
The extension includes a standalone redaction tool at redact.html in the extension root. It provides a full-page interface for bulk document redaction.
How to Access
- Click the extension toolbar icon and select Open Redact Tool
- Or navigate directly to chrome-extension://<extension-id>/redact.html
Features
- Drag-and-drop — drop files directly onto the page for processing
- Batch processing — queue multiple files; they are processed sequentially
- Download results — redacted files are available for download individually
- Text input — paste text directly for instant redaction
- Supports all document types the extension handles (DOCX, PDF, images, text files, ZIP)
Classification Gate
The Classification Gate prevents classified or sensitive documents from being uploaded to external services. It reads Microsoft Information Protection (MSIP) sensitivity labels embedded in DOCX and XLSX files.
MSIP Label Detection
OOXML files (DOCX, XLSX) store MSIP labels in docProps/custom.xml as key-value properties:
MSIP_Label_{GUID}_Enabled = true
MSIP_Label_{GUID}_Name = "PROTECTED"
MSIP_Label_{GUID}_SiteId = {tenant GUID}
MSIP_Label_{GUID}_Method = Standard | Privileged
MSIP_Label_{GUID}_ContentBits = 8
The gate reads these properties locally (Layer 1) without any API call. An optional Layer 2 check uses the Microsoft Graph API for deeper policy evaluation.
Gate Evaluation Flow
- Extract MSIP labels from the file's custom properties (local, instant)
- Check if any label matches the blockedLabels list; if yes, block the upload
- Check if the file is encrypted (ContentBits & 0x8) and blockEncrypted is true; if yes, block
- Check if any label matches the warnLabels list; if yes, show warning but allow
- Optionally query Graph API for additional policy evaluation (checkViaApi)
- If no restrictions are found, allow the upload
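The local (Layer 1) part of this flow can be sketched as below. The label object shape ({ name, enabled, contentBits }), the exact-string label comparison, and the function name evaluateGate are assumptions made for illustration; the optional Graph API step is left out.

```javascript
// Encryption flag within MSIP ContentBits, as described in this section.
const ENCRYPTED_BIT = 0x8;

// Hypothetical Layer 1 gate: labels is an array parsed from docProps/custom.xml.
function evaluateGate(labels, config = {}) {
  const { blockedLabels = [], warnLabels = [], blockEncrypted = true } = config;
  // Assumption: only labels with Enabled = true are considered.
  const active = labels.filter((label) => label.enabled);

  const blocked = active.find((label) => blockedLabels.includes(label.name));
  if (blocked) return { action: "block", reason: `Label ${blocked.name} is blocked` };

  const encrypted = active.some((label) => (label.contentBits & ENCRYPTED_BIT) !== 0);
  if (blockEncrypted && encrypted) return { action: "block", reason: "File is encrypted" };

  const warned = active.find((label) => warnLabels.includes(label.name));
  if (warned) return { action: "warn", reason: `Label ${warned.name} requires caution` };

  return { action: "allow", reason: "No restrictions found" };
}
```

Blocking checks run before the warning check, so a file that is both encrypted and carries a warn-level label is blocked rather than warned.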
Australian PSPF Classification Table
The following table shows the recommended gate configuration aligned with the Australian Government Protective Security Policy Framework (PSPF):
| Classification | Gate Action | Rationale |
|---|---|---|
| UNOFFICIAL | Allow | No protective marking. No restrictions on sharing. |
| OFFICIAL | Allow | Standard business information. Low business impact if compromised. |
| OFFICIAL: Sensitive | Warn | Sensitive business information. Caution advised before sharing externally. User is warned but may proceed. |
| PROTECTED | Block | Valuable or sensitive information. Must not leave controlled environment. Upload is blocked with no override. |
| SECRET | Block | Classified material. Compromise could cause serious damage to national security. Upload is blocked absolutely. |
| TOP SECRET | Block | Classified material. Compromise could cause exceptionally grave damage to national security. Upload is blocked absolutely. |
// Recommended enterprise configuration
{
"providers": {
"purview": {
"gateEnabled": true,
"blockedLabels": ["PROTECTED", "SECRET", "TOP SECRET"],
"warnLabels": ["OFFICIAL:Sensitive", "OFFICIAL: Sensitive"],
"blockEncrypted": true
}
}
}
Encrypted File Blocking
When blockEncrypted is true (default), files with the MSIP encryption bit set (ContentBits & 0x8) are blocked regardless of their label name. This prevents Rights Management (RMS) protected documents from being uploaded.
Enterprise Deployment
The extension supports four deployment methods for organisations of all sizes. Each pushes configuration to chrome.storage.managed (or equivalent), which the extension reads at startup.
Microsoft Intune
For organisations using Microsoft Endpoint Manager / Intune to manage Chrome on Windows, macOS, or ChromeOS devices.
- In the Microsoft Intune admin center, navigate to Devices → Configuration profiles
- Create a new profile: Windows 10 and later → Templates → Administrative templates (custom)
- Add an OMA-URI setting for Chrome managed extensions:
./Device/Vendor/MSFT/Policy/Config/Chrome~Policy~googlechrome~Extensions/ExtensionSettings
- Set the value to a JSON string containing your extension ID and managed storage policy:
{
  "<extension-id>": {
    "installation_mode": "force_installed",
    "update_url": "https://clients2.google.com/service/update2/crx",
    "managed_storage": {
      "detectionMode": "auto",
      "providers": {
        "azure_pii": {
          "enabled": true,
          "connectionMode": "enterprise",
          "tenantId": "YOUR_TENANT_ID",
          "clientId": "YOUR_CLIENT_ID",
          "endpoint": "https://your-resource.cognitiveservices.azure.com"
        },
        "purview": {
          "gateEnabled": true,
          "blockedLabels": ["PROTECTED", "SECRET", "TOP SECRET"],
          "warnLabels": ["OFFICIAL:Sensitive"],
          "blockEncrypted": true
        }
      },
      "siteFilter": { "mode": "whitelist", "whitelist": ["*.gov.au"] },
      "_locked": {
        "detectionMode": true,
        "providers.azure_pii": true,
        "providers.purview.gateEnabled": true,
        "siteFilter": true
      },
      "_policyVersion": "2024.10.1"
    }
  }
}
- Assign the profile to the target device groups
- Devices receive the policy on next sync (typically within 15 minutes)
Group Policy (GPO)
For organisations using Active Directory Group Policy to manage Chrome on Windows devices.
- Download the Chrome ADMX templates and add to your Central Store
- Open Group Policy Management Editor
- Navigate to Computer Configuration → Administrative Templates → Google Chrome → Extensions
- Open Configure extension management settings
- Paste the same JSON structure as the Intune example above, with your extension ID as the key
- For managed storage specifically, create a registry key at:
HKLM\SOFTWARE\Policies\Google\Chrome\3rdparty\extensions\<extension-id>\policy
- Add REG_SZ values for each top-level setting, or use a single JSON blob via the ExtensionSettings policy
- Run gpupdate /force on target machines or wait for the next GP refresh cycle
Google Workspace
For organisations using Google Workspace (formerly G Suite) to manage Chrome on ChromeOS or managed Chrome browsers.
- Sign in to the Google Admin Console at admin.google.com
- Navigate to Devices → Chrome → Apps & extensions
- Select the target organisational unit (OU)
- Click the + button and add the extension by ID from the Chrome Web Store
- Set installation policy to Force install
- Under Policy for extensions, paste the managed storage JSON configuration
- Save and wait for devices in the OU to sync (typically within a few hours)
Small Business (Remote Config)
For organisations without MDM or managed Chrome browsers. Uses a JSON file hosted on any HTTPS server.
- Create a JSON configuration file with your desired settings (same schema as managed storage)
- Host it on any HTTPS endpoint accessible to your users (e.g., https://config.yourcompany.com/llss-config.json)
- In the extension settings, set the Config URL field to the JSON URL:
{ "configUrl": "https://config.yourcompany.com/llss-config.json" }
- The extension fetches the config on startup and caches it in chrome.storage.local
- Remote config supports the same _locked and _policyVersion fields as managed storage
- To update settings, update the hosted JSON file; the extension picks up the changes on next restart
Remote config has lower precedence than managed storage but higher than local user settings. If both are present, managed storage wins.
Managed Configuration
The extension merges settings from four sources in a strict precedence order.
Merge Precedence
Settings are merged from lowest to highest priority:
- Defaults: hardcoded DEFAULT_SETTINGS in the extension code
- Local user settings: stored in chrome.storage.local (user-editable)
- Remote config: fetched from configUrl (small business deployment)
- Managed storage: pushed via chrome.storage.managed (enterprise MDM); highest priority
Each layer deep-merges into the previous. Arrays are replaced, not concatenated. The final merged object includes a _meta property with lock information.
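A minimal sketch of this layering, assuming plain JSON-like settings objects. The function names deepMerge and mergeSettings are hypothetical, and the _meta property is not reproduced here.

```javascript
const isPlainObject = (v) => v !== null && typeof v === "object" && !Array.isArray(v);

// Deep-merges overlay into base without mutating either.
// Nested objects are merged key by key; arrays and scalars are replaced outright.
function deepMerge(base, overlay) {
  const out = { ...base };
  for (const [key, value] of Object.entries(overlay || {})) {
    out[key] = isPlainObject(value) && isPlainObject(base[key])
      ? deepMerge(base[key], value)
      : value;
  }
  return out;
}

// Applies the four layers from lowest to highest priority.
function mergeSettings(defaults, local, remote, managed) {
  return [local, remote, managed].reduce((acc, layer) => deepMerge(acc, layer), defaults);
}
```

Because arrays are replaced rather than concatenated, a managed whitelist fully supersedes any whitelist set at a lower layer.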
Locked Settings (_locked)
The _locked map is an object where keys are dot-separated setting paths and values are booleans. Locked settings:
- Cannot be changed by the user in the Settings UI (controls are disabled with an "Admin" badge)
- Are stripped from local storage on save (so the managed value always wins on reload)
- Support parent-path locking: locking aiChatGuard locks all sub-keys like aiChatGuard.enabled
{
"_locked": {
"detectionMode": true,
"providers.azure_pii": true,
"aiChatGuard": true,
"siteFilter.mode": true
}
}
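Parent-path locking can be checked by testing every prefix of a setting path against the _locked map. The helper below is a hypothetical sketch (the name isLocked is not from the extension's code).

```javascript
// Returns true when the path itself, or any parent path, is locked.
function isLocked(path, locked = {}) {
  const parts = path.split(".");
  for (let i = 1; i <= parts.length; i++) {
    const prefix = parts.slice(0, i).join(".");
    if (locked[prefix] === true) return true;
  }
  return false;
}
```

With the example map above, aiChatGuard.enabled and providers.azure_pii.endpoint are locked through their parents, while siteFilter.whitelist stays editable because only siteFilter.mode is locked.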
Policy Version (_policyVersion)
An optional string (e.g., "2024.10.1") attached to the merged _meta. Useful for administrators to track which policy version is deployed. Displayed in the extension's Settings page for troubleshooting.
Schema Versions
The extension migrates settings automatically as the schema evolves:
| Version | Changes |
|---|---|
| v1 | Initial schema. Basic detection settings, PII types, UI toggles. |
| v2 | Added providers object (local_ai, regex, azure_pii, purview, hosted). Added enableDocumentUpload. |
| v3 | Added siteFilter, aiChatGuard, autoRedact, submitGuard, uploadGuard sections. |
| v4 | Migrated useManagedIdentity boolean to connectionMode string ("direct"/"enterprise") in Azure PII. |
| v5 | Added Classification Gate fields to Purview: gateEnabled, blockedLabels, warnLabels, blockEncrypted. |
Current schema version: 5 (stored as schemaVersion: 5 in settings). Migration is automatic and non-destructive.
Azure OAuth2
Enterprise mode uses Azure AD OAuth2 with PKCE for secure, credential-free authentication to Azure Cognitive Services.
App Registration Setup
- In the Azure Portal, go to Azure Active Directory → App registrations → New registration
- Set the name (e.g., "LLSS Chrome Extension") and select Single tenant or Multi-tenant as needed
- Set the Redirect URI type to Web and enter the value from chrome.identity.getRedirectURL() (format: https://<extension-id>.chromiumapp.org/)
- Under Authentication, enable Allow public client flows (required for PKCE)
- Under API permissions, add Azure Cognitive Services → user_impersonation (delegated)
- If using Azure Blob Storage for native documents, also add Azure Storage → user_impersonation
- Grant admin consent for the tenant
- Copy the Application (client) ID and Directory (tenant) ID into the extension settings
PKCE Flow
The extension uses the Authorization Code Flow with Proof Key for Code Exchange (PKCE):
- Code verifier: 32 random bytes, base64url-encoded
- Code challenge: SHA-256 hash of the verifier, base64url-encoded (S256 method)
- Authorization request: chrome.identity.launchWebAuthFlow() opens the Azure AD login popup with the challenge
- Authorization code: extracted from the callback URL, validated against the state parameter
- Token exchange: POST to /oauth2/v2.0/token with the code and verifier
- Tokens stored: access token, refresh token, and ID token saved to chrome.storage.session (memory-only)
Tokens are stored in chrome.storage.session, which is memory-only and cleared when the browser closes. No credentials are ever written to disk.
Token Management
- Auto-refresh: An alarm is scheduled to refresh the access token 5 minutes before expiry
- Scope: https://cognitiveservices.azure.com/.default offline_access openid profile
- Alternate scopes: For Azure Storage access, the extension requests https://storage.azure.com/.default using the stored refresh token
- Token cache: Alternate scope tokens are cached in chrome.storage.session with expiry tracking
- Failure recovery: If refresh fails, tokens are cleared and the user must sign in again
Sign In / Sign Out
// Sign in (from background script or popup)
const userInfo = await LLSS_AzureAuth.signIn(tenantId, clientId);
// Returns: { name: "Jane Smith", email: "jane@agency.gov.au", tenantId: "..." }
// Get token for API calls
const token = await LLSS_AzureAuth.getAccessToken(tenantId, clientId);
// Check status
const signedIn = await LLSS_AzureAuth.isSignedIn();
// Sign out (clears all tokens)
await LLSS_AzureAuth.signOut();
Managed Identity for Storage
In Enterprise mode with native document redaction:
- The Azure Language resource uses a managed identity to access Blob Storage (no SAS tokens needed)
- Admin grants the Language resource Storage Blob Data Reader on the source container and Storage Blob Data Contributor on the target container
- The extension uses the user's OAuth2 token (with storage scope) to upload/download files
- The Language resource accesses the same containers via its managed identity when processing the job
Settings Reference
Complete reference for every setting in DEFAULT_SETTINGS. Settings are organised by section.
Detection
| Setting | Type | Default | Description |
|---|---|---|---|
detectionMode | string | "auto" | Detection mode: "auto", "ai_only", "regex_only", "cloud_only", or "hosted_only". |
enableVectorFallback | boolean | true | Run the vector name detector alongside regex fallback. |
vectorThreshold | number | 0.45 | Confidence threshold for vector name detection (0 to 1). |
regexAfterAi | boolean | false | Belt-and-suspenders: also run regex after AI to catch anything missed. |
customPiiTypes | array | [] | Custom PII type definitions (merged with built-in defaults). |
piiTypeOverrides | object | {} | Override properties on built-in PII types (e.g., { "email": { "enabled": false } }). |
sessionTimeoutMinutes | number | 5 | Idle timeout in minutes before destroying the Gemini Nano AI session. |
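As an illustration of how piiTypeOverrides might combine with built-in types, the sketch below shallow-merges each override onto a matching definition. Everything except the override object itself is an assumption: the shape of the built-in definitions, the [REDACTED_PHONE] token, and the applyPiiTypeOverrides helper are hypothetical, and the extension's real merge logic may differ.

```javascript
// Hypothetical built-in definitions; the real shape is not documented here.
const builtInTypes = {
  email: { enabled: true, token: "[REDACTED_EMAIL]" },
  phone: { enabled: true, token: "[REDACTED_PHONE]" },
};

// Shallow-merge each override onto the matching built-in type.
// Overrides for unknown type names are ignored in this sketch.
function applyPiiTypeOverrides(types, overrides) {
  const result = {};
  for (const [name, definition] of Object.entries(types)) {
    result[name] = { ...definition, ...(overrides[name] ?? {}) };
  }
  return result;
}

// The override from the table above disables email detection only.
const effective = applyPiiTypeOverrides(builtInTypes, { email: { enabled: false } });
```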
UI
| Setting | Type | Default | Description |
|---|---|---|---|
showRedactButton | boolean | true | Show the green Redact button on all textareas and text inputs. |
enableContentEditable | boolean | true | Also attach to contenteditable elements (rich text editors). |
enableDocumentUpload | boolean | true | Show the document upload button alongside the text redact button. |
Site Filter
| Setting | Type | Default | Description |
|---|---|---|---|
siteFilter.mode | string | "all" | Filter mode: "all", "whitelist", or "blacklist". |
siteFilter.whitelist | array | [] | URL patterns for whitelist mode. |
siteFilter.blacklist | array | [] | URL patterns for blacklist mode. |
AI Chat Guard
| Setting | Type | Default | Description |
|---|---|---|---|
aiChatGuard.enabled | boolean | true | Master toggle for AI chat protection. |
aiChatGuard.autoRedactOnAiSites | boolean | true | Auto-redact text before it can be submitted to AI services. |
aiChatGuard.blockSubmitUntilRedacted | boolean | true | Block form submission until redaction completes on AI chat sites. |
aiChatGuard.interceptUploads | boolean | true | Intercept file uploads on AI chat sites for scanning. |
aiChatGuard.customAiDomains | array | [] | Additional domains to treat as AI chat services. |
aiChatGuard.showAiBanner | boolean | true | Show a protection banner on detected AI chat sites. |
Auto-Redact
| Setting | Type | Default | Description |
|---|---|---|---|
autoRedact.enabled | boolean | false | Auto-redact PII as users type or paste. Off by default. |
autoRedact.debounceMs | number | 2500 | Debounce delay in ms before redacting after typing pauses. |
autoRedact.onPaste | boolean | true | Immediately redact pasted content (no debounce). |
autoRedact.onlyOnAiSites | boolean | true | Only auto-redact on detected AI chat sites. |
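The four Auto-Redact settings interact, and one way to read them is as a single decision about how long to wait before redacting. The sketch below is an interpretation, not the extension's code: autoRedactDelay is a hypothetical helper, and treating a paste as ordinary input when onPaste is false is an assumption.

```javascript
// Defaults copied from the Auto-Redact table above.
const autoRedactDefaults = {
  enabled: false,
  debounceMs: 2500,
  onPaste: true,
  onlyOnAiSites: true,
};

// Returns the delay in ms before redacting, or null when auto-redact should not run.
// eventType is "input" or "paste"; isAiSite says whether the page is a detected AI chat site.
function autoRedactDelay(settings, eventType, isAiSite) {
  if (!settings.enabled) return null;
  if (settings.onlyOnAiSites && !isAiSite) return null;
  if (eventType === "paste" && settings.onPaste) return 0; // no debounce on paste
  return settings.debounceMs; // assumption: otherwise wait for a typing pause
}
```

With the defaults, the helper returns null for every event because autoRedact.enabled is false. Setting enabled to true yields 2500 for typing and 0 for paste on AI chat sites.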
Submit Guard
| Setting | Type | Default | Description |
|---|---|---|---|
submitGuard.enabled | boolean | true | Master toggle for submit interception. |
submitGuard.interceptEnter | boolean | true | Intercept Enter key in textareas. |
submitGuard.interceptFormSubmit | boolean | true | Intercept HTML form submit events. |
submitGuard.interceptSendButtons | boolean | true | Intercept click on send/submit buttons. |
submitGuard.onlyOnAiSites | boolean | true | Only active on detected AI chat sites. |
Upload Guard
| Setting | Type | Default | Description |
|---|---|---|---|
uploadGuard.enabled | boolean | true | Master toggle for upload interception. |
uploadGuard.onlyOnAiSites | boolean | true | Only active on detected AI chat sites. |
uploadGuard.confirmBeforeUpload | boolean | true | Show confirmation dialog before allowing upload. |
Providers
Local AI (Gemini Nano)
| Setting | Type | Default | Description |
|---|---|---|---|
providers.local_ai.enabled | boolean | true | Use Gemini Nano when available. |
Regex
| Setting | Type | Default | Description |
|---|---|---|---|
providers.regex.enabled | boolean | true | Always-available regex fallback. |
Azure PII
| Setting | Type | Default | Description |
|---|---|---|---|
providers.azure_pii.enabled | boolean | false | Enable Azure AI Language PII detection. |
providers.azure_pii.endpoint | string | "" | Azure Language resource endpoint URL. |
providers.azure_pii.apiKey | string | "" | API key (Direct mode only). |
providers.azure_pii.apiVersion | string | "2023-04-01" | API version string. |
providers.azure_pii.language | string | "en" | Detection language code. |
providers.azure_pii.domain | string | "none" | Detection domain: "none" (general) or "phi" (health). |
providers.azure_pii.categoriesFilter | array | [] | Entity categories to detect (empty = all). |
providers.azure_pii.connectionMode | string | "direct" | "direct" (API key) or "enterprise" (OAuth2 PKCE). |
providers.azure_pii.tenantId | string | "" | Azure AD tenant ID (Enterprise mode). |
providers.azure_pii.clientId | string | "" | App Registration client ID (Enterprise mode). |
providers.azure_pii.nativeDocEnabled | boolean | false | Enable server-side document redaction via Azure. |
providers.azure_pii.sourceContainerUrl | string | "" | Blob storage URL for source uploads. |
providers.azure_pii.targetContainerUrl | string | "" | Blob storage URL for redacted results. |
providers.azure_pii.redactionPolicy | string | "entityMask" | Redaction policy: "characterMask", "entityMask", or "noMask". |
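To show how these settings could map onto a call to the Azure AI Language PII endpoint, here is a hedged sketch that builds the request without sending it. The buildPiiRequest helper is hypothetical and the extension may construct its requests differently. The URL path, the PiiEntityRecognition kind, and the Ocp-Apim-Subscription-Key header follow the public Azure Language REST API.

```javascript
// Builds a fetch-ready request description from the azure_pii settings.
// accessToken is only used in "enterprise" mode (OAuth2 bearer token).
function buildPiiRequest(cfg, text, accessToken) {
  const base = cfg.endpoint.replace(/\/+$/, ""); // drop trailing slashes
  const url = `${base}/language/:analyze-text?api-version=${encodeURIComponent(cfg.apiVersion)}`;

  const parameters = { domain: cfg.domain };
  if (cfg.categoriesFilter.length > 0) {
    parameters.piiCategories = cfg.categoriesFilter; // empty filter means all categories
  }

  const headers = { "Content-Type": "application/json" };
  if (cfg.connectionMode === "enterprise") {
    headers["Authorization"] = `Bearer ${accessToken}`;
  } else {
    headers["Ocp-Apim-Subscription-Key"] = cfg.apiKey;
  }

  const body = {
    kind: "PiiEntityRecognition",
    parameters,
    analysisInput: { documents: [{ id: "1", language: cfg.language, text }] },
  };
  return { url, headers, body: JSON.stringify(body) };
}
```

The returned object can be passed to fetch as fetch(request.url, { method: "POST", headers: request.headers, body: request.body }).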
Purview DLP
| Setting | Type | Default | Description |
|---|---|---|---|
providers.purview.enabled | boolean | false | Enable Microsoft Purview DLP. |
providers.purview.tenantId | string | "" | Azure AD tenant ID. |
providers.purview.clientId | string | "" | App Registration client ID. |
providers.purview.clientSecret | string | "" | Client secret for application auth. |
providers.purview.accessToken | string | "" | Pre-obtained access token (alternative). |
providers.purview.graphEndpoint | string | "https://graph.microsoft.com/beta" | Graph API endpoint URL. |
providers.purview.useDelegatedAuth | boolean | false | Use delegated (user) auth instead of application auth. |
providers.purview.dlpPolicyId | string | "" | Specific DLP policy ID to evaluate. |
providers.purview.sensitiveInfoTypes | array | [] | Sensitive info type IDs to detect. |
providers.purview.gateEnabled | boolean | false | Enable Classification Gate (pre-upload label check). |
providers.purview.blockedLabels | array | [] | Sensitivity label names to block (e.g., ["PROTECTED","SECRET"]). |
providers.purview.warnLabels | array | [] | Sensitivity label names to warn but allow. |
providers.purview.blockedSensitivity | number | 0 | Block labels with sensitivity >= this value (0 = disabled). |
providers.purview.blockEncrypted | boolean | true | Block files with encryption/RMS markers. |
providers.purview.checkViaApi | boolean | false | Also evaluate via Graph API (requires auth). |
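The Classification Gate settings can be read as a block / warn / allow decision. The sketch below is one plausible interpretation, not the extension's implementation: evaluateGate and the file object shape (label, sensitivity, encrypted) are hypothetical, label matching is assumed to be exact, and the order of the checks is an assumption.

```javascript
// file: { label: string | null, sensitivity: number | null, encrypted: boolean }
// Returns "block", "warn", or "allow".
function evaluateGate(cfg, file) {
  if (!cfg.gateEnabled) return "allow";
  if (cfg.blockEncrypted && file.encrypted) return "block";
  if (file.label && cfg.blockedLabels.includes(file.label)) return "block";
  // blockedSensitivity of 0 disables the numeric threshold.
  if (
    cfg.blockedSensitivity > 0 &&
    file.sensitivity != null &&
    file.sensitivity >= cfg.blockedSensitivity
  ) {
    return "block";
  }
  if (file.label && cfg.warnLabels.includes(file.label)) return "warn";
  return "allow";
}
```

Block conditions are checked before warn conditions so that a label listed in both arrays is still blocked.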
Hosted Models
| Setting | Type | Default | Description |
|---|---|---|---|
providers.hosted | array | [] | Array of hosted model configurations. Each entry follows the schema described in the Hosted Models section. |
Debug & Misc
| Setting | Type | Default | Description |
|---|---|---|---|
debugMode | boolean | false | Output detailed diagnostics to the browser console. |
configUrl | string | "" | Remote configuration URL (for small-business deployment without MDM). |
schemaVersion | number | 3 | Settings schema version. Auto-migrated on load. Do not set manually. |
Troubleshooting
Debug Mode
Enable debug logging to get detailed diagnostics from every subsystem:
- Open the extension Settings page
- Scroll to the Troubleshooting section
- Toggle Debug mode on
- All subsystems will output [LLSS:subsystem] prefixed messages to the console
Subsystem labels include: nano, regex, vector, pipeline, azure-auth, azure-pii, purview-gate, document, offscreen, content-script, and more.
Error-level messages (LLSS_debug.error()) are always output regardless of debug mode, so critical failures are always visible.
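The logging behaviour described above (prefixed messages, with errors always shown) can be sketched as follows. This is an illustrative stand-in rather than the real LLSS_debug module, whose signatures are not documented here. The createDebugLogger factory and its parameters are hypothetical.

```javascript
// isDebugEnabled: function returning the current debugMode setting.
// sink: object with log() and error(); defaults to the console.
function createDebugLogger(isDebugEnabled, sink = console) {
  const prefix = (subsystem) => `[LLSS:${subsystem}]`;
  return {
    // Diagnostic messages are emitted only when debug mode is on.
    log(subsystem, ...args) {
      if (isDebugEnabled()) sink.log(prefix(subsystem), ...args);
    },
    // Error-level messages are always emitted, regardless of debug mode.
    error(subsystem, ...args) {
      sink.error(prefix(subsystem), ...args);
    },
  };
}
```

Filtering the console by the text [LLSS: then shows only extension output in whichever console context is open.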
Console Locations
Chrome extensions have three separate console contexts. You need to check the right one:
| Console | How to Open | What It Shows |
|---|---|---|
| Page Console | Right-click page → Inspect → Console tab | Content script logs, redact button interactions, auto-redact, submit/upload guard events, AI chat guard banner |
| Service Worker Console | chrome://extensions → find extension → click "service worker" link | Background script logs, pipeline execution, provider availability, Azure auth, token refresh, alarm events, managed config loading |
| Offscreen Console | chrome://extensions → find extension → click "offscreen.html" link (when active) | PDF rendering, Tesseract.js OCR, pixel redaction, canvas operations, worker pool status |
Common Errors & Solutions
| Error | Cause | Solution |
|---|---|---|
| No redaction providers available | All configured providers are disabled or unavailable | Ensure at least providers.regex.enabled is true. Check that Gemini Nano flags are enabled if using AI mode. |
| Provider 'local_ai' is not available | Gemini Nano is not enabled or the model is not downloaded | Enable Chrome flags (#optimization-guide-on-device-model and #prompt-api-for-gemini-nano), restart Chrome, and wait for the model to download. |
| Token exchange failed: HTTP 400 | Azure AD app registration misconfigured | Verify redirect URI matches chrome.identity.getRedirectURL(). Ensure "Allow public client flows" is enabled. Check client ID and tenant ID. |
| State mismatch — possible CSRF attack | Auth flow interrupted or replayed | Try signing in again. If persistent, clear extension data and re-authenticate. |
| Token refresh failed — sign in again | Refresh token expired or revoked | Sign out and sign in again. Check Azure AD Conditional Access policies for session lifetime limits. |
| Not signed in to Azure AD | Enterprise mode requires OAuth2 but user has not signed in | Open the popup or settings page and click the Azure AD Sign In button. |
| No API key configured | Direct mode Azure PII requires an API key | Enter the API key from Azure portal in Settings → Providers → Azure PII → API Key. |
| Document processing failed | No provider could handle the document format | Check the pipeline steps in the error message. Ensure vendor/setup.sh has been run for PDF/image support. Check the offscreen console for OCR errors. |
| ZIP processing not available | ZIP parser module not loaded | Ensure all extension files are intact. Reload the extension from chrome://extensions. |
| Offscreen document not available | Chrome did not create the offscreen document | Reload the extension. Check chrome://extensions for error badges. Ensure the offscreen permission is in manifest.json. |
| Graph API evaluation failed: 403 | Purview API permissions insufficient | Ensure the app registration has the required Graph API permissions and admin consent has been granted. |
Performance Issues
- Slow OCR: Tesseract.js OCR can take 5-15 seconds per page. This is normal for on-device processing. Consider Azure native documents for faster processing of large PDFs.
- High memory on large PDFs: Each page is rendered as a full canvas. Close other tabs if Chrome runs low on memory. The worker pool limits concurrent operations.
- Debounce too aggressive: If auto-redact interferes with typing, increase autoRedact.debounceMs (default 2500 ms).
- Extension slows page load: Ensure siteFilter is configured to limit activation to relevant sites rather than running on all pages.
- Gemini Nano slow on first use: The model is loaded on demand. First inference may take a few seconds; subsequent calls are faster within the session timeout window.
Security & Privacy
Design Principles
- Local-first: All detection and redaction runs on-device by default. No data leaves the browser unless a cloud provider is explicitly configured by the user or administrator.
- Zero stored PII: The extension never persists PII to disk. Text is processed in memory and immediately replaced with redaction tokens.
- No background data collection: The extension does not maintain persistent connections, background sync, or any form of data exfiltration channel.
No Telemetry
- No analytics, usage metrics, or crash reports are collected
- No phone-home requests to any server
- No user tracking, cookies, or fingerprinting
- Third-party libraries (pdf.js, Tesseract.js) run entirely locally
- Statistics (redaction counts) are stored in chrome.storage.local and are never transmitted
Provably Secure Redaction
- Text redaction: PII is replaced with bracketed tokens (e.g., [REDACTED_EMAIL]). The original text is overwritten in the DOM.
- Image redaction: Black rectangles are drawn directly over PII regions on the canvas. There is no hidden text layer. The original pixel data is permanently destroyed.
- PDF flattening: Redacted PDFs are flattened to images to prevent extraction of a hidden text layer from the original PDF structure.
- DOCX XML replacement: PII is replaced within the XML nodes of the DOCX file. The redacted document is a valid Word file with tokens in place of PII.
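Token-based text redaction can be illustrated with a single PII type. The sketch below is not the extension's detector: the email pattern is deliberately simple and redactEmails is a hypothetical helper. It only shows the replace-with-token idea, where the original value does not survive in the output string.

```javascript
// Deliberately simple email pattern, for illustration only.
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

// Replace every match with a bracketed token; the original value is discarded.
function redactEmails(text) {
  return text.replace(EMAIL_PATTERN, "[REDACTED_EMAIL]");
}

const redacted = redactEmails("Contact jane@agency.gov.au today");
// redacted === "Contact [REDACTED_EMAIL] today"
```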
Enterprise Mode Security
- OAuth2 PKCE: No client secrets stored in the extension. PKCE prevents authorization code interception.
- Session-only tokens: All OAuth2 tokens are stored in chrome.storage.session (memory-only, cleared on browser close).
- Managed identity: Azure native document processing uses managed identity for storage access. No SAS tokens or storage keys are stored in the extension.
- Locked settings: Administrators can lock critical settings via the _locked map so users cannot disable protection.
- Classification gate: Prevents classified documents from being uploaded, even if the user attempts to bypass redaction.
- API key storage: In Direct mode, API keys are stored in chrome.storage.local (per-profile, not synced). Enterprise mode eliminates API keys entirely.