Crawler
Automatically analyze your product with the Pointer CLI to build a comprehensive knowledge base.
The Pointer Crawler enables you to automatically gather and analyze content from your product, creating a comprehensive knowledge base for AI-powered features.
Prerequisites
- Node.js version 16 or higher
- Access to Pointer dashboard
Installation
Install the Pointer CLI globally using npm:
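For example, assuming the package is published as `pointer-cli` (the exact npm package name isn't given in this guide, so check your dashboard if it differs):

```bash
npm install -g pointer-cli
```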
Verify the installation:
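```bash
pointer --version
```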
Authentication
Create an API key
Navigate to API Keys
Go to the API Keys settings in the Pointer dashboard.
Generate new key
Click Create new key and provide:
- Name: Descriptive identifier (e.g., “CLI Production”)
- Description: Optional context about key usage
- Expiration: Optional expiry date (defaults to never expire)
Copy your secret key
Save the generated key immediately; it won’t be shown again.
Configure authentication
Set your secret key using one of these methods:
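For example, via the `POINTER_SECRET_KEY` environment variable (the name referenced under Troubleshooting) or the global `-s, --secret-key` option:

```bash
# Option 1 (recommended): environment variable
export POINTER_SECRET_KEY="your-secret-key"

# Option 2: per-command flag (note: may end up in shell history)
pointer scrape --secret-key "your-secret-key"
```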
Environment variables are recommended for security. Command-line options may expose keys in shell history.
Core workflow
Step 1: Initialize your website
Start by adding your website to the crawler configuration:
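```bash
pointer init
```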
The interactive prompt will guide you through:
- Entering a friendly name for identification
- Providing your website URL
- Confirming the configuration
Step 2: Scrape your content
Begin the automated content collection:
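```bash
pointer scrape
```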
Choose from interactive options:
- Scraping mode: Headless (fast) or Browser (with authentication)
- Crawl depth: Fast (surface content) or Deep (interactive elements)
- PII protection: Configure sensitivity and redaction settings
The CLI saves your progress automatically. If interrupted, it will offer to resume from where it left off.
Step 3: Upload for analysis
Send your scraped content to Pointer for processing:
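```bash
pointer upload
```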
The CLI will:
- Display a summary of collected data
- Confirm the upload scope
- Transfer content to your knowledge base
Command reference
Primary commands
| Command | Description | Authentication |
| --- | --- | --- |
| `pointer init` | Add a website to crawl | Required |
| `pointer scrape` | Collect content from configured websites | Required |
| `pointer upload` | Transfer scraped data to Pointer | Required |
| `pointer status` | Check crawl processing status | Required |
| `pointer list` | View local scraped data | Not required |
| `pointer cleanup` | Remove all local data | Not required |
| `pointer purge` | Delete server-side crawl data | Required |
Global options
Available for all commands:
| Option | Description |
| --- | --- |
| `-s, --secret-key <key>` | API secret key (overrides environment variable) |
| `-v, --version` | Display CLI version |
| `--help` | Show command help |
Scraping options
Configure `pointer scrape` behavior:
| Option | Description | Default |
| --- | --- | --- |
| `--max-pages <number>` | Maximum pages to crawl | 200 |
| `--concurrency <number>` | Parallel page processing | 2 |
| `--fast` | Use fast crawl mode | Interactive prompt |
| `--no-pii-protection` | Disable PII detection | PII protection enabled |
| `--pii-sensitivity <level>` | Set detection level (low/medium/high) | Interactive prompt |
| `--log-level <level>` | Logging verbosity | info |
Best practices
Use interactive mode
Run commands without options for guided workflows:
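For example, the full workflow end to end:

```bash
pointer init
pointer scrape
pointer upload
```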
The CLI provides clear prompts and smart defaults for all operations.
Secure your credentials
- Store API keys in environment variables
- Never commit keys to version control
- Set expiration dates for temporary access
Optimize crawling
- Use browser mode only when authentication is required
- Enable PII protection for user-facing applications
- Monitor crawl status before uploading
Manage your data
- Review scraped content with `pointer list` before uploading
- Use `pointer cleanup` to remove local data after successful uploads
- Keep crawl sessions organized with descriptive website names
Automation examples
While the CLI is designed for interactive use, automation is supported for CI/CD pipelines:
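A sketch using only the flags documented above; whether `pointer upload` runs fully unattended isn't covered here, so verify its behavior before wiring it into a pipeline:

```bash
# Non-interactive crawl: documented flags replace the interactive prompts
pointer scrape \
  --fast \
  --max-pages 100 \
  --pii-sensitivity high \
  --secret-key "$POINTER_SECRET_KEY"
```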
Use automation options carefully. Interactive mode provides safety confirmations and validation that prevent common errors.
Troubleshooting
Authentication errors
If you encounter authentication issues:
- Verify your API key is valid in the dashboard
- Check that the environment variable is set correctly: `echo $POINTER_SECRET_KEY`
- Ensure the key hasn’t expired
- Confirm you have necessary permissions
Crawling interruptions
The crawler automatically saves progress. If interrupted:
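```bash
# Re-running scrape detects the saved session and offers to resume
pointer scrape
```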
Upload limitations
- Maximum 500 pages per upload (API limit)
- Large crawls are automatically truncated
- Use `--max-pages` to control crawl size upfront
Next steps
After successfully crawling and uploading your content:
- View your enriched knowledge base in the Knowledge section
- Configure AI features to leverage the collected data
- Monitor analytics to understand content usage
- Set up regular crawls to keep knowledge current