How PingPuffin's Monitoring System Works
Version: 1.5
Last updated: February 20, 2026
Table of Contents
- Overview
- Check Intervals and Frequency
- Error Detection
- Recovery and Status Changes
- Manual Updates
- Expected HTTP Status Codes
- Protection Against Monitor Failures
- Notification System
- Public Status Pages
- Automatic Dashboard Updates
- Data Collection and Storage
- Technical Specifications
- Privacy and Security
- System Reliability
- Common Scenarios and Examples
Overview
PingPuffin monitors HTTP and HTTPS endpoints 24/7 so you know quickly when your websites become unavailable. Our system is built with a focus on reliability, precision, and transparency.
Key Features
- ✅ Automatic checks every 3 minutes – all monitors are checked on a regular schedule
- ✅ 2-check verification to avoid false alarms
- ✅ 24/7/365 monitoring without breaks
- ✅ Notifications via email, Slack, SMS, WhatsApp, and webhook (downtime, recovery, warnings, SSL expiry, slow response)
- ✅ SSL certificate alerts – notified 8 days and 2 days before your certificate expires
- ✅ Slow response alerts – optional alert when your site responds too slowly (Settings)
- ✅ Public status pages – share a status page with your own name and logo
- ✅ Instant recovery when your site is back
- ✅ Manual updates for quick verification
- ✅ Protection against internal errors in the monitor system
Why Transparency Matters
We believe in openness about how our monitoring works. This document explains exactly how we detect downtime, how we avoid false alarms, and how we ensure you get notified as quickly as possible when there's a real problem.
Check Intervals and Frequency
Automatic Checks
All active monitors are checked automatically every 3 minutes. Checks run in parallel so all your monitors are checked quickly and evenly.
- Checks run at fixed intervals around the clock
- No breaks, no weekends, no holidays
- You don’t need to do anything – monitoring runs automatically
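As an illustration of what running checks in parallel means, here is a minimal Python sketch. It is not PingPuffin's implementation: `run_checks`, `check_one`, and the worker count are hypothetical names and values chosen for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def run_checks(urls, check_one, max_workers=20):
    """Run check_one(url) for every URL in parallel.

    Returns (results, errors). An exception raised while checking one
    monitor is recorded in errors for that monitor only, so the other
    monitors are still checked.
    """
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {url: pool.submit(check_one, url) for url in urls}
        for url, future in futures.items():
            try:
                results[url] = future.result()
            except Exception as exc:  # isolate the failure to this monitor
                errors[url] = exc
    return results, errors
```

A scheduler would call something like `run_checks` once per interval with the list of active monitors.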
Manual Updates
You can always trigger an instant check via the "Update now" button in your dashboard:
- Result shown immediately
- Bypasses 2-check threshold for quick feedback
- Useful after deployments or configuration changes
Coverage
- Availability: 24/7/365
- Parallel checks: All monitors checked simultaneously
- Timeout: Standard 30 seconds (configurable)
- Maximum redirects: Up to 5 redirects followed per check
Error Detection
2-Check Verification System
To avoid false alarms, we require 2 consecutive failures before marking a site as down.
How It Works
First Failure (00:00):
- Failure counter set to 1
- Status remains unchanged (e.g., UP)
- No notification sent
- System logs failure for internal monitoring
Second Failure (00:03):
- Failure counter updated to 2
- Status changes to DOWN
- Incident created automatically
- Notification sent to all configured channels
Total time: ~3 minutes from first failure to DOWN status (one check interval, plus up to the timeout if the second check times out).
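The counter logic above can be sketched as a small state machine. This is a simplified illustration under stated assumptions, not PingPuffin's actual code: `MonitorState` and `apply_check` are hypothetical names, only the UP and DOWN states are modelled, and the `manual` flag stands for the "Update now" button, which bypasses the threshold.

```python
from dataclasses import dataclass

FAILURE_THRESHOLD = 2  # consecutive failures required before DOWN


@dataclass
class MonitorState:
    status: str = "UP"
    failures: int = 0


def apply_check(state: MonitorState, ok: bool, manual: bool = False) -> bool:
    """Apply one check result; return True if the status changed
    (which is when a notification would be due)."""
    previous = state.status
    if ok:
        state.failures = 0
        state.status = "UP"  # recovery needs no confirmation
    else:
        state.failures += 1
        if manual or state.failures >= FAILURE_THRESHOLD:
            state.status = "DOWN"
    return state.status != previous


state = MonitorState()
apply_check(state, ok=False)  # first failure: status stays UP
apply_check(state, ok=False)  # second failure: status becomes DOWN
apply_check(state, ok=True)   # first success: status returns to UP
```

Because a single successful check resets the counter, a failure followed by a success never reaches the threshold.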
Why 2 Checks?
Transient network problems (DNS blips, brief timeouts, temporary server errors) occur even on stable sites. By requiring 2 failures:
- ✅ We eliminate false alarms from brief problems
- ✅ We confirm there's a real problem
- ✅ We improve user trust in notifications
What Counts as a Failure?
The following situations are marked as failures:
HTTP Error Codes
- 4xx Client Errors: 400, 403, 404, 405, etc. (unless explicitly allowed)
- 5xx Server Errors: 500, 502, 503, 504, etc.
Network Errors
- Timeout: No response within timeout period (default: 30 seconds)
- Connection Refused: Server actively rejects connections
- DNS Failure: Cannot resolve domain name
- Network Unreachable: Host not available on network
SSL/TLS Errors
- Invalid Certificate: Certificate is invalid
- Expired Certificate: Certificate has expired
- Untrusted Certificate: Certificate not from trusted CA
- Hostname Mismatch: Certificate hostname doesn't match URL
Slow response time
- Response time over threshold: If you have enabled “Notify on slow response time” (Settings), a check counts as a failure when response time exceeds your threshold (5–15 seconds). The same 2-check logic applies: two consecutive slow checks result in down status and a notification.
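To make the network and SSL categories concrete, the sketch below maps Python's standard network exceptions onto the failure types listed above. It only illustrates the classification idea: the function name and the category labels are made up for this example and say nothing about how PingPuffin is built internally.

```python
import socket
import ssl


def classify_error(exc: Exception) -> str:
    """Map a network exception to one of the failure types listed above."""
    # Most specific types first: all of these are subclasses of OSError.
    if isinstance(exc, ssl.SSLCertVerificationError):
        return "ssl_certificate"  # invalid, expired, untrusted, or hostname mismatch
    if isinstance(exc, socket.gaierror):
        return "dns_failure"
    # socket.timeout became an alias of TimeoutError in Python 3.10
    if isinstance(exc, (TimeoutError, socket.timeout)):
        return "timeout"
    if isinstance(exc, ConnectionRefusedError):
        return "connection_refused"
    if isinstance(exc, OSError):
        return "network_error"  # for example, network unreachable
    return "unknown"
```

Whatever the category, each of these results counts as one failed check.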
Handling Different Error Types
Important: ALL error types count toward the failure counter.
Example:
Check 1: HTTP 500 → Failure counter: 1, Status: UP (waiting for confirmation)
Check 2: Timeout → Failure counter: 2, Status: DOWN (confirmed failure)
Rationale:
- If a server switches between different error types, it indicates instability
- It's not less serious if the error type changes
- Any failure means the site is not functioning correctly
Recovery and Status Changes
Instant Recovery
When your site comes back, we react immediately – no 2-check threshold for recovery.
Recovery flow:
Status: DOWN
Check 1: Site responds with HTTP 200 → Failure counter reset, Status: UP
Result: Instant recovery, notification sent
Why instant recovery?
- ✅ Users want quick feedback when their site is back
- ✅ No reason to wait for confirmation that something works
- ✅ Best practice in monitoring
- ✅ Reduces worry and waiting time
Status States
🟢 UP (Online)
Meaning:
- Site responds with an expected HTTP status code (default: 2xx and common 3xx)
- Response time within acceptable range
- No errors detected
Notifications:
- Sent on recovery from DOWN status
🔴 DOWN (Offline)
Meaning:
- Site failed 2+ consecutive checks
- Incident created and tracked
- All configured notification channels alerted
Duration:
- Recorded from first DOWN check to recovery
- Shown in incident log with precise duration
🟡 WARNING (Warning)
Meaning:
- Site responds but with warnings
- Examples: Slow response time, Cloudflare challenge detected
- Monitoring continues normally
Notifications:
- Can be configured per user
🔵 REDIRECT (Redirect)
Meaning:
- Permanent redirect (301) detected to another URL
- Site is functional but URL has changed
- You can choose to update URL or continue monitoring original
Notifications:
- Can be configured per user
Cloudflare-Protected Sites
Automatic Handling:
PingPuffin automatically handles sites behind Cloudflare. Behaviour depends on the HTTP status code:
5xx (e.g. 503 Service Unavailable): All server errors (500, 502, 503, 504, etc.) are always marked as "Down" and trigger notifications and downtime in the chart – even when the response contains Cloudflare branding. Cloudflare often proxies 503 from your origin and wraps it in its own error page; we interpret 5xx as the service being unavailable.
4xx (e.g. 403 Forbidden, 429 Too Many Requests): If the server responds with 403 or 429 and we detect Cloudflare (headers or content), the monitor is marked "Problematic" (warning) instead of "Down". This usually means Cloudflare bot protection or rate limiting – we cannot tell if your site is actually down, only that protection is blocking our check.
Why "Problematic" for 4xx with Cloudflare?
If we get an HTTP response (e.g. 403), the server is reachable. Uptime monitoring is about availability – for 4xx with Cloudflare we assume protection is blocking the check, not that the site itself is down. We therefore show "Problematic" so you know protection is active.
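The decision described above can be sketched in Python as follows. This is an illustration, not PingPuffin's code. Several things are assumptions: the function names, the header heuristic (Cloudflare responses normally carry a `CF-RAY` header and `Server: cloudflare`; the real detection may also inspect content), and the choice to consult the expected-codes list first.

```python
DEFAULT_EXPECTED = frozenset(
    {200, 201, 202, 203, 204, 205, 206, 301, 302, 303, 304, 307, 308}
)


def looks_like_cloudflare(headers: dict) -> bool:
    """Heuristic header check (an assumption made for this example)."""
    lowered = {k.lower(): str(v).lower() for k, v in headers.items()}
    return "cf-ray" in lowered or lowered.get("server") == "cloudflare"


def classify_response(code: int, headers: dict, expected=DEFAULT_EXPECTED) -> str:
    """Return 'ok', 'problematic', or 'failure' for one HTTP response."""
    if code in expected:
        return "ok"
    if code >= 500:
        return "failure"  # 5xx counts as a failure, Cloudflare or not
    if code in (403, 429) and looks_like_cloudflare(headers):
        return "problematic"  # protection is probably blocking the check
    return "failure"
```

A 403 from a server without Cloudflare markers is still a plain failure; only the combination of 403 or 429 with detected Cloudflare produces the warning.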
Optional: Configure Cloudflare for Better Monitoring
If you want to avoid "Problematic" status for your Cloudflare-protected site, you can:
- WAF Rules: Create a rule that allows requests from PingPuffin's User-Agent:
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36
- Browser Integrity Check: Consider disabling this for monitoring IPs
- Rate Limiting: Adjust rate limiting so monitoring requests aren't blocked
PingPuffin's Monitoring IP: 188.245.198.146 (for whitelisting if needed)
Manual Updates
User-Initiated Checks
You can always trigger an instant check via the "Update now" button in the dashboard.
Behavior:
- Bypass 2-check threshold: Result shown and applied immediately
- Instant status update: If status changes, it updates immediately
- "Last checked" updates: The dashboard shows the new check time right away; in the overview you may see "Updated just now" for the monitor you refreshed
- Notification: Sent if status changes
- Failure counter updated: Counter updated based on result
Use Cases
- ✅ Quick verification after deployment of fixes
- ✅ Testing new monitor configuration
- ✅ Instant status check without waiting for the next automatic check
- ✅ Debugging connection problems
Example:
00:00 - Automatic check fails → Failure counter: 1, Status: UP
00:02 - You click "Update now" → Check fails → Status: DOWN instantly
Result: Manual check skips 2-check threshold for quick feedback
Expected HTTP Status Codes
A monitor is considered UP when the server returns an HTTP status code that you have marked as acceptable. By default, PingPuffin treats these codes as success:
Default accepted codes
2xx Success
- 200 OK
- 201 Created
- 202 Accepted
- 203 Non-Authoritative Information
- 204 No Content
- 205 Reset Content
- 206 Partial Content
3xx Redirects (after following redirects, the final response is evaluated)
- 301 Moved Permanently
- 302 Found
- 303 See Other
- 304 Not Modified
- 307 Temporary Redirect
- 308 Permanent Redirect
Any other status code (e.g. 4xx or 5xx) is treated as a failure unless you explicitly add it to your monitor’s expected status codes.
Custom expected status codes
You can define which codes count as success per monitor. For example:
- 401 Unauthorized for a login or protected page that correctly returns “auth required”
- 403 Forbidden if that response is expected for the URL you monitor
- 404 Not Found only if you intentionally monitor a “not found” page
When you create or edit a monitor, you can set a comma-separated list of expected codes (e.g. 200,201,401). If you leave it at the default, the full list above is used.
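A comma-separated list like this could be interpreted as in the sketch below. The helper names are hypothetical and the validation range is an assumption; the default set is the list given above.

```python
DEFAULT_EXPECTED = frozenset(
    {200, 201, 202, 203, 204, 205, 206, 301, 302, 303, 304, 307, 308}
)


def parse_expected_codes(text: str) -> frozenset:
    """Parse '200,201,401' into a set; empty input means the defaults."""
    parts = [p.strip() for p in text.split(",") if p.strip()]
    if not parts:
        return DEFAULT_EXPECTED
    codes = frozenset(int(p) for p in parts)
    invalid = sorted(c for c in codes if not 100 <= c <= 599)
    if invalid:
        raise ValueError(f"not valid HTTP status codes: {invalid}")
    return codes


def is_success(final_status: int, expected: frozenset) -> bool:
    """Compare the final response code (after redirects) to the list."""
    return final_status in expected
```

With `parse_expected_codes("200,401")`, a 401 from a login page counts as success, while a 404 would still be a failure.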
Where this applies
- All HTTP(S) uptime checks use this rule (automatic checks and manual “Update now”).
- Status is determined by comparing the final response code (after redirects) to your expected list.
- See Technical Specifications for timeouts, redirect limits, and other request details.
Protection Against Monitor Failures
Internal vs. External Errors
It's critical to distinguish between errors in your site and errors in our monitor system.
Site Errors (Monitored)
These errors from your site count as failures:
- ✅ HTTP 500 from target site → Counts as failure
- ✅ Timeout connecting to target site → Counts as failure
- ✅ DNS error for target domain → Counts as failure
Monitor System Errors (Protected)
These errors in PingPuffin's code do NOT mark your site as down:
- ❌ PHP exception in PingPuffin code → Does NOT mark site as down
- ❌ Database connection error → Does NOT mark site as down
- ❌ Internal logic error → Does NOT mark site as down
Administrator Alerting
When the monitor system fails:
Logging:
- Critical errors logged with full stack trace
- Timestamp and monitor ID recorded
- All details saved for debugging
Email Alarm:
- Email sent to system administrator
- Contains error message, stack trace, and context
- Rate-limited: Maximum 1 email per hour per unique error
- Prevents email flooding during system problems
Database:
- Status remains unchanged (no false downs)
- No incident created
- Users not affected
Example:
Monitor checker runs
→ Internal error detected in monitor system
→ Error logged with full context
→ Email sent to system administrator
→ Database NOT updated
→ Your site status remains unchanged
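The flow above, including the one-email-per-hour limit, can be sketched like this. It is an assumption-laden illustration rather than the real implementation: `AlertLimiter`, `guarded_check`, and the error fingerprint format are invented for the example, and `run_check` is assumed to report site failures as ordinary return values and to raise only on internal errors.

```python
import time

ALERT_INTERVAL_SECONDS = 3600  # at most one email per hour per unique error


class AlertLimiter:
    def __init__(self, interval=ALERT_INTERVAL_SECONDS, clock=time.monotonic):
        self.interval = interval
        self.clock = clock
        self.last_sent = {}  # error fingerprint -> time of last alert

    def should_send(self, fingerprint: str) -> bool:
        now = self.clock()
        last = self.last_sent.get(fingerprint)
        if last is not None and now - last < self.interval:
            return False
        self.last_sent[fingerprint] = now
        return True


def guarded_check(run_check, limiter, send_admin_email):
    """Run one check. On an internal error, alert the administrator
    (rate-limited) and return None so the caller leaves the stored
    status and failure counter untouched."""
    try:
        return run_check()
    except Exception as exc:
        fingerprint = f"{type(exc).__name__}: {exc}"
        if limiter.should_send(fingerprint):
            send_admin_email(fingerprint)
        return None
```

The key design point is that the exception path never writes a result, so an internal error cannot be mistaken for a failed check of your site.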
Notification System
When Notifications Are Sent
- Downtime (DOWN): After 2 consecutive failures are confirmed (about 3 minutes after the first failure)
- Recovery (UP): Instantly when your site comes back from DOWN
- Warning and redirect: Configurable per user
- Slow response: If you’ve enabled it in Settings – two consecutive slow checks result in DOWN status and a notification
- SSL certificate expiring soon: You’re notified 8 days before and 2 days before expiry (same channels as other notifications)
- Manual update: Instant notification if status changes when you click “Update now”
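As a small illustration of the SSL expiry rule, the sketch below decides whether a reminder is due on a given day. It assumes reminders fire when exactly 8 or exactly 2 whole days remain; the function name and that exact-day interpretation are assumptions made for the example.

```python
from datetime import date

ALERT_DAYS = (8, 2)  # days before expiry on which a reminder goes out


def ssl_alert_due(expires_on: date, today: date) -> bool:
    """Return True if today is one of the reminder days for this certificate."""
    return (expires_on - today).days in ALERT_DAYS
```

For a certificate expiring on 10 March, this returns True on 2 March and on 8 March.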
Notification Channels
You can enable one or more channels under Integrations (Apps). All enabled channels receive the same events (downtime, recovery, SSL expiry, slow response, etc.).
Email
- Direct email notifications
- Contains: site name, status, error message, timestamp, and link to dashboard
Slack
- Message to your configured channel or DM
- Formatted with colors (red = down, green = up) and link to monitor
SMS
- Short message to your mobile number for downtime, recovery, SSL expiry, and slow response
- Requires adding and enabling SMS under Integrations
WhatsApp
- Message to your WhatsApp number for downtime, recovery, SSL expiry, and slow response
- Requires adding and enabling WhatsApp under Integrations
Webhook
- POST request to your own endpoint with JSON (status, response time, error message, etc.)
- Useful for integrating with other systems
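If you receive webhooks in your own service, handling the JSON body might look like the sketch below. The field names `status`, `response_time`, and `error_message` are assumptions based on the description above, not a documented schema; inspect a real payload before relying on them.

```python
import json


def summarize_webhook(body: bytes) -> str:
    """Turn a webhook payload into a one-line summary.

    The field names are assumptions made for this example.
    """
    event = json.loads(body)
    line = f"status={event.get('status', 'unknown')}"
    if event.get("response_time") is not None:
        line += f" response_time={event['response_time']}"
    if event.get("error_message"):
        line += f" error={event['error_message']}"
    return line
```

Using `.get()` with defaults keeps the handler tolerant of fields that are missing in some event types.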
Notification Snoozing
You can temporarily disable notifications (snooze) for 24 hours:
During Snooze:
- ✅ Monitoring continues normally
- ✅ Status updates in dashboard
- ❌ No notifications sent to any channel
- ⏰ Automatically un-snoozed after 24 hours
Use Cases:
- Planned maintenance
- Known issue during rollout
- Temporary shutdown
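The snooze rule amounts to a simple time comparison, sketched here with hypothetical names. Checks keep running regardless; only the decision to send a notification depends on the snooze.

```python
from datetime import datetime, timedelta
from typing import Optional

SNOOZE_DURATION = timedelta(hours=24)


def notifications_enabled(now: datetime, snoozed_at: Optional[datetime]) -> bool:
    """Return True unless a snooze started less than 24 hours ago."""
    if snoozed_at is None:
        return True
    return now >= snoozed_at + SNOOZE_DURATION
```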
Public Status Pages
For each monitor (or group) you can show a public status page that anyone can open without logging in. The page shows current status (up/down/warning) and history.
- Customization: You can set a display name and logo (URL) for the group so the status page matches your brand. Changes apply to both the dashboard and the public page.
- Language: The status page can be shown in Danish, English, or Spanish (based on the visitor’s settings).
- Share the link: You get a stable link to the status page to share with users or embed on your own site.
Automatic Dashboard Updates
Auto-Refresh Mechanism
Your dashboard refreshes automatically every 30 seconds, so you always see the latest status without reloading the page. The refresh only reads stored results; it does not trigger new checks.
What Gets Updated?
Dashboard always shows latest data from the last automatic check:
- 🎨 Status indicator: Colored badge (green/red/yellow/blue)
- ⏰ Last checked: Precise timestamp of last check
- ⚡ Response time: Response time in milliseconds
- 🔢 Failure counter: Number of consecutive failures
- 📊 Incident info: Active incidents and duration
Important: Auto-refresh respects 2-check logic because it only shows data from automatic checks – it does not trigger new checks.
Data Collection and Storage
Check Records
Every single check is saved in the database with the following information:
- Unique ID for check
- Reference to monitor
- Timestamp (when check was performed)
- HTTP status code or error type
- Response time in milliseconds
- Success/failure status
- Error message (if relevant)
- Redirect information (if relevant)
- SSL error details (if relevant)
Usage:
- Uptime percentage calculations
- Historical graphs and reports
- Error analysis and debugging
- Performance tracking over time
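One simple way to turn check records into an uptime percentage is to divide successful checks by total checks, as in the sketch below. This per-check formula is an assumption for illustration; the actual calculation may weight by time instead.

```python
def uptime_percent(checks):
    """checks: iterable of booleans, True for a successful check.

    Returns None when there is no data yet.
    """
    results = list(checks)
    if not results:
        return None
    return 100.0 * sum(results) / len(results)
```

At one check every 3 minutes there are 480 checks per day, so under this formula a single failed check in a day gives 479/480, or about 99.79% for that day.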
Incident Tracking
When status changes to DOWN, an incident is created with the following data:
- Unique ID for incident
- Reference to monitor
- Start timestamp
- End timestamp (when resolved)
- Total duration
Features:
- Automatic creation on DOWN status
- Continuous duration calculation
- Automatic resolution on recovery
- Complete history maintained
- Exportable to CSV
Activity Log
All significant events are logged:
- ✅ Status changes (UP → DOWN, DOWN → UP, etc.)
- ✅ Manual checks performed by users
- ✅ Configuration changes
- ✅ URL updates
- ✅ Metadata updates
Functionality:
- Exportable to CSV format
- Searchable and filterable
- Shows details for each event
- Timestamps on all entries
Technical Specifications
HTTP Request Parameters
When PingPuffin checks your site, the following request is sent:
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36
Accept: */*
Connection: close
[Optional: Authorization header for Basic Auth]
Note: We use a standard Chrome user agent to avoid being blocked by sites that filter custom user agents.
Configuration:
- Timeout: Can be configured per monitor (default: 30 seconds)
- Follow Redirects: Yes, up to 5 redirects maximum
- SSL Verification: Enabled (validates certificates)
- Connection Reuse: Disabled (fresh connection for each check)
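To reproduce a similar request when debugging your own server, something like the following Python sketch can be used. It relies on the standard library's `urllib.request` and mirrors the headers and settings listed above, but it is an approximation written for this document, not the code PingPuffin runs; `build_request`, `send`, and `LimitedRedirects` are hypothetical names.

```python
import base64
import urllib.request

USER_AGENT = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
)


class LimitedRedirects(urllib.request.HTTPRedirectHandler):
    max_redirections = 5  # give up after five redirects


def build_request(url, username=None, password=None):
    """Build a GET request with the headers listed above."""
    headers = {"User-Agent": USER_AGENT, "Accept": "*/*", "Connection": "close"}
    if username is not None:
        token = base64.b64encode(f"{username}:{password}".encode()).decode()
        headers["Authorization"] = f"Basic {token}"
    return urllib.request.Request(url, headers=headers, method="GET")


def send(request, timeout=30):
    """Send the request; certificates are verified by default.

    urllib raises HTTPError for 4xx/5xx and URLError for network problems.
    """
    opener = urllib.request.build_opener(LimitedRedirects)
    return opener.open(request, timeout=timeout)
```

Calling `send(build_request("https://example.com"))` against your own endpoint shows roughly what a check would see, although the request originates from your network rather than from the monitoring server.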
Expected HTTP Status Codes
Monitors consider a check successful when the response status is in that monitor’s expected status codes list. By default we accept a broad set of 2xx and 3xx codes. For the full list and how to customize it, see Expected HTTP Status Codes.
Response Time Measurement
What's Measured:
- DNS lookup time
- Connection time (TCP handshake)
- SSL handshake time (if HTTPS)
- Time to first byte (TTFB)
What's NOT Measured:
- Body download time (we only read headers)
- JavaScript execution time
- Asset loading time
Storage:
- Measured in milliseconds
- Saved at each check
- Used for performance tracking
- Shown in dashboard
Advanced Monitoring Settings
For advanced users, we offer:
HTTP Method
- GET: Standard method
- POST: For endpoints that require POST
Request Body
- Send JSON or form data with POST requests
- Useful for API endpoints that require specific data
Basic Authentication
- Username and password for protected endpoints
- Passwords encrypted with AES-256-CBC
- Never stored in plain text
Future Features
- Custom headers
- Request params
- Advanced authentication (OAuth, Bearer tokens)
Server Information
Public IP Address
PingPuffin's monitoring server uses the following public IP address to perform checks:
IP Address: 188.245.198.146
If you need to whitelist PingPuffin's IP address in your firewall or server configuration, you can use this IP address.
Note: The IP address may change during server migrations or infrastructure updates. We recommend using the User-Agent header for identification instead of IP-based rules, if possible.
User-Agent Identification:
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36
Privacy and Security
Data Encryption
Sensitive Credentials:
- All passwords (Basic Auth) encrypted on storage
- Algorithm: AES-256-CBC (industry standard)
- Key: Stored securely in environment variables
- IV: Unique initialization vector per encryption
- Decryption only happens in memory during checks
- Never exposed in logs or API responses
Data Access
Access Control:
- Check results only visible to monitor owner
- No data shared with third parties
- Activity log only exportable by owner
- Secure API endpoints with authentication
Database Security:
- Prepared statements (prevents SQL injection)
- Session-based authentication
- CSRF protection on all forms
HTTPS Enforcement
SSL/TLS:
- All monitor checks support HTTPS
- SSL certificate validation enabled
- Warns on certificate problems
- Detects expired certificates
Dashboard:
- Always accessed via HTTPS
- Secure cookies (HttpOnly, Secure flags)
- HSTS headers recommended
System Reliability
Monitor System Health
Our monitor system monitors itself:
Error Detection:
- Automatic detection of internal errors
- Full logging of all exceptions
- Stack traces for debugging
Administrator Alerts:
- Critical errors emailed to system administrator
- Rate-limited to avoid spam
- Details included for quick resolution
Automatic Recovery:
- The system continues checking other monitors even if one fails
- No cascade failures across monitors
- Database transactions ensure data integrity
Uptime Goals
We aim for the following reliability:
Monitor System Uptime:
- Goal: 99.9% (less than 9 hours downtime per year)
- Monitored: Via internal logs and system metrics
Check Execution Rate:
- Goal: 99.5% success rate for check execution
- Monitored: Error rate in logs
Check reliability: Monitoring runs continuously; on technical issues the administrator is alerted so the system stays stable.
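The downtime figure behind an uptime goal is simple arithmetic, shown here as a short Python sketch (the function name is illustrative and a 365-day year is assumed).

```python
HOURS_PER_YEAR = 365 * 24  # 8760


def allowed_downtime_hours(uptime_goal: float) -> float:
    """Hours of downtime per year permitted by a goal such as 0.999."""
    return (1.0 - uptime_goal) * HOURS_PER_YEAR
```

For the 99.9% goal this gives about 8.76 hours per year, which matches the "less than 9 hours" figure above.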
Common Scenarios and Examples
Scenario 1: Transient Network Blip
Situation: Brief network problem, site is actually up.
00:00 - Check fails (timeout) → Failure counter: 1, Status: UP
00:03 - Check succeeds → Failure counter: 0, Status: UP
Result:
✅ No notification sent
✅ No status change
✅ No false alarm
Why it works: The 2-check threshold filters out brief problems.
Scenario 2: Real Downtime
Situation: Server is really down (e.g., hosting problem).
00:00 - Check fails (HTTP 500) → Failure counter: 1, Status: UP
→ System logs first failure for internal monitoring
00:03 - Check fails (timeout) → Failure counter: 2, Status: DOWN
→ Incident created automatically
→ Notification sent to all configured channels
00:06 - Check fails (timeout) → Failure counter: 3, Status: DOWN
→ Incident duration updated continuously
Result:
✅ DOWN status confirmed at 00:03
✅ Notification sent ~3 minutes after first failure
✅ Different error types (500 + timeout) both count
Why it works: Two consecutive failures confirm a real problem.
Scenario 3: Quick Recovery
Situation: Site down, comes back quickly.
00:00 - Status is DOWN (from previous failure)
00:03 - Check succeeds (HTTP 200) → Failure counter: 0, Status: UP
→ Incident marked as resolved automatically
→ Recovery notification sent
Result:
✅ Instant recovery on first successful check
✅ Incident duration calculated (00:00 to 00:03 = 3 min)
✅ You're informed quickly about recovery
Why it works: No 2-check threshold for recovery.
Scenario 4: Manual Refresh During First Failure
Situation: Automatic check failed once, user wants to verify.
00:00 - Automatic check fails → Failure counter: 1, Status: UP
→ Status remains UP (waiting for confirmation)
00:02 - You click "Update now" → Check fails → Status: DOWN immediately
→ Manual check bypasses 2-check threshold for quick feedback
→ Notification sent
Result:
✅ Manual check gives instant feedback
✅ Status updates without waiting for next automatic check
✅ Useful for debugging and verification
Why it works: Manual checks are designed for instant feedback.
Scenario 5: Monitor System Error
Situation: Internal error in PingPuffin's own code.
00:00 - Monitor system runs automatic check
→ Internal error detected
→ Error caught automatically
Logging:
→ Error logged with full context for internal monitoring
→ Full technical information saved for debugging
Administrator Alarm:
→ Email sent to system administrator (max 1 per hour)
→ Contains error details and context
Database:
→ No update to your site status
→ Your site status remains unchanged
→ Failure counter not affected
Result:
✅ Monitor error does NOT affect your site status
✅ Administrator alerted to fix problem
✅ No false DOWN status
Why it works: Distinction between monitor errors and site errors.
Scenario 6: Different Error Types Consecutively
Situation: Server unstable, different errors each time.
00:00 - Check fails (HTTP 500) → Failure counter: 1, Status: UP
00:03 - Check fails (Connection timeout) → Failure counter: 2, Status: DOWN
00:06 - Check fails (HTTP 503) → Failure counter: 3, Status: DOWN
Result:
✅ Different error types ALL count
✅ Status DOWN after 2 failures (regardless of type)
✅ Indicates unstable server (maybe worse than one consistent error)
Why it works: Any error means the site is not functioning correctly.
Frequently Asked Questions
How quickly do I get notified of downtime?
Automatic check: About 3 minutes after the first failure (2 consecutive failures are required).
Manual check: Instantly if you update manually.
Can I get false alarms?
Very rarely. The 2-check system eliminates most brief problems. If you get an alarm, there's almost always a real problem.
What if my server is temporarily slow?
If the response time exceeds the timeout (default: 30 seconds), the check counts as a failure. You can increase the timeout for your monitor.
How is planned maintenance handled?
Use the "Snooze" function to disable notifications for 24 hours. Monitoring continues, but you get no alarms.
Can I see history for all checks?
Yes. Every check is stored, and the activity log shows status changes and manual checks. You can also export the log to CSV.
What happens if PingPuffin itself goes down?
Our monitors run on reliable infrastructure. On critical system errors, the administrator is alerted, but your site is NOT marked as down.
Contact & Support
Have questions about how monitoring works?
📧 Email: support@pingpuffin.com
🐛 Bug reports: Via email
📚 Documentation: See documentation section for more information
Changelog
v1.5 (February 20, 2026)
- SMS and WhatsApp added as notification channels (under Integrations)
- SSL certificate alerts: notification 8 days and 2 days before certificate expires (all channels)
- Slow response alerts: option to be notified when the site responds too slowly (Settings, 5–15 sec)
- Public status pages: described customization with display name and logo, and language support
- Technical implementation details removed from user documentation (focus on what you see and can configure)
v1.3 (February 14, 2026)
- Version and documentation date update
v1.2 (February 9, 2026)
- New section: Expected HTTP Status Codes with full default list (200, 201, 202, 203, 204, 205, 206, 301, 302, 303, 304, 307, 308) and customization
- Manual “Update now”: dashboard “Last checked” and overview “Updated just now” feedback documented
v1.1 (January 17, 2026)
- Checks now run in parallel (faster throughput)
- Check interval reduced from 5 to 3 minutes
- Each category can check up to 200 monitors per run
- Documentation updated with parallel processing details
v1.0 (November 21, 2024)
- First version of documentation
- 2-check verification system implemented
- Monitor failure protection added
- Rate-limited administrator alerts
This document is updated continuously. Check "Last updated" at the top to see if there are new versions.