How PingPuffin's Monitoring System Works

Version: 1.5
Last updated: February 20, 2026

Overview
Check Intervals and Frequency
Error Detection
Recovery and Status Changes
Manual Updates
Expected HTTP Status Codes
Protection Against Monitor Failures
Notification System
Public Status Pages
Automatic Dashboard Updates
Data Collection and Storage
Technical Specifications
Privacy and Security
System Reliability
Common Scenarios and Examples

Overview

PingPuffin monitors HTTP and HTTPS endpoints 24/7 to ensure your websites are available. Our system is built with a focus on reliability, precision, and transparency.

Key Features

✅ Automatic checks every 3 minutes – all monitors are checked on a regular schedule
✅ 2-check verification to avoid false alarms
✅ 24/7/365 monitoring without breaks
✅ Notifications via email, Slack, SMS, WhatsApp, and webhook (downtime, recovery, warnings, SSL expiry, slow response)
✅ SSL certificate alerts – notified 8 days and 2 days before your certificate expires
✅ Slow response alerts – optional alert when your site responds too slowly (Settings)
✅ Public status pages – share a status page with your own name and logo
✅ Instant recovery when your site is back
✅ Manual updates for quick verification
✅ Protection against internal errors in the monitor system

Why Transparency Matters

We believe in openness about how our monitoring works. This document explains exactly how we detect downtime, how we avoid false alarms, and how we ensure you get notified as quickly as possible when there's a real problem.

Check Intervals and Frequency

Automatic Checks

All active monitors are checked automatically every 3 minutes. Checks run in parallel so all your monitors are checked quickly and evenly.

Checks run at fixed intervals around the clock
No breaks, no weekends, no holidays
You don’t need to do anything – monitoring runs automatically

Manual Updates

You can always trigger an instant check via the "Update now" button in your dashboard:

Result shown immediately
Bypasses 2-check threshold for quick feedback
Useful after deployments or configuration changes

Coverage

Availability: 24/7/365
Parallel checks: All monitors checked simultaneously
Timeout: 30 seconds by default (configurable per monitor)
Maximum redirects: Up to 5 redirects are followed

Error Detection

2-Check Verification System

To avoid false alarms, we require 2 consecutive failures before marking a site as down.

How It Works

First Failure (00:00):

Failure counter set to 1
Status remains unchanged (e.g., UP)
No notification sent
System logs failure for internal monitoring

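Second Failure (00:03):

Failure counter updated to 2
Status changes to DOWN
Incident created automatically
Notification sent to all configured channels

Total time: ~3 minutes from the first failed check to DOWN status, or roughly 3-6 minutes from the actual start of an outage, since an outage can begin up to 3 minutes before the first check catches it.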

Why 2 Checks?

Transient network problems (DNS blips, brief timeouts, temporary server errors) occur even on stable sites. By requiring 2 failures:

✅ We eliminate false alarms from brief problems
✅ We confirm there's a real problem
✅ We improve user trust in notifications

What Counts as a Failure?

The following situations are marked as failures:

HTTP Error Codes

4xx Client Errors: 400, 403, 404, 405, etc. (unless explicitly allowed)
5xx Server Errors: 500, 502, 503, 504, etc.

Network Errors

Timeout: No response within timeout period (default: 30 seconds)
Connection Refused: Server actively rejects connections
DNS Failure: Cannot resolve domain name
Network Unreachable: Host not available on network

SSL/TLS Errors

Invalid Certificate: Certificate is invalid
Expired Certificate: Certificate has expired
Untrusted Certificate: Certificate not from trusted CA
Hostname Mismatch: Certificate hostname doesn't match URL

Slow response time

Response time over threshold: If you have enabled “Notify on slow response time” (Settings), a check counts as a failure when response time exceeds your threshold (5–15 seconds). The same 2-check logic applies: two consecutive slow checks result in down status and a notification.

Handling Different Error Types

Important: ALL error types count toward the failure counter.

Example:

Check 1: HTTP 500 → Failure counter: 1, Status: UP (waiting for confirmation)
Check 2: Timeout  → Failure counter: 2, Status: DOWN (confirmed failure)

Rationale:

If a server switches between different error types, it indicates instability
A changing error type does not make the problem less serious
Any failure means the site is not functioning correctly
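
The failure rules above can be summarized in a short sketch. It is only an illustration, not PingPuffin's actual code: `CheckResult`, `is_failure`, and the field names are hypothetical, and Cloudflare-specific handling is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CheckResult:
    """Outcome of one check (hypothetical structure, for illustration only)."""
    status_code: Optional[int] = None   # None when no HTTP response arrived
    error: Optional[str] = None         # e.g. "timeout", "dns", "ssl_expired"
    response_time_s: float = 0.0


def is_failure(result: CheckResult,
               expected: frozenset,
               slow_threshold_s: Optional[float] = None) -> bool:
    """Return True if this check counts toward the failure counter."""
    if result.error is not None:            # network or SSL/TLS error
        return True
    if result.status_code not in expected:  # unexpected 4xx/5xx, or no code
        return True
    # The slow-response rule applies only when it is enabled in Settings.
    if slow_threshold_s is not None and result.response_time_s > slow_threshold_s:
        return True
    return False
```

Every branch returns the same `True`, which mirrors the point above: the error type does not matter, only whether the check failed.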

Recovery and Status Changes

Instant Recovery

When your site comes back, we react immediately – no 2-check threshold for recovery.

Recovery flow:

Status: DOWN
Check 1: Site responds with HTTP 200 → Failure counter reset, Status: UP
Result: Instant recovery, notification sent
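
Taken together, the 2-check rule, the manual bypass, and instant recovery behave like a small state machine. The sketch below is a simplified model under stated assumptions: `MonitorState`, `apply_check`, and the event names are hypothetical, and only the UP and DOWN states are modeled.

```python
from dataclasses import dataclass

FAILURE_THRESHOLD = 2  # consecutive failures required by automatic checks


@dataclass
class MonitorState:
    status: str = "UP"
    failures: int = 0


def apply_check(state: MonitorState, failed: bool, manual: bool = False) -> list:
    """Update the state from one check result and return the triggered events."""
    events = []
    if failed:
        state.failures += 1
        # A manual "Update now" check bypasses the 2-check threshold.
        confirmed = manual or state.failures >= FAILURE_THRESHOLD
        if confirmed and state.status != "DOWN":
            state.status = "DOWN"
            events += ["incident_created", "down_notification"]
    else:
        state.failures = 0
        if state.status == "DOWN":
            # Recovery is applied on the first successful check.
            state.status = "UP"
            events += ["incident_resolved", "recovery_notification"]
    return events
```

Note the asymmetry: going DOWN needs confirmation, while going UP does not.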

Why instant recovery?

✅ Users want quick feedback when their site is back
✅ No reason to wait for confirmation that something works
✅ Best practice in monitoring
✅ Reduces worry and waiting time

Status States

🟢 UP (Online)

Meaning:

Site responds with an expected HTTP status code (default: 2xx and common 3xx)
Response time within acceptable range
No errors detected

Notifications:

Sent on recovery from DOWN status

🔴 DOWN (Offline)

Meaning:

Site failed 2+ consecutive checks
Incident created and tracked
All configured notification channels alerted

Duration:

Recorded from first DOWN check to recovery
Shown in incident log with precise duration

🟡 WARNING (Warning)

Meaning:

Site responds but with warnings
Examples: Slow response time, Cloudflare challenge detected
Monitoring continues normally

Notifications:

Can be configured per user

🔵 REDIRECT (Redirect)

Meaning:

Permanent redirect (301) detected to another URL
Site is functional but URL has changed
You can choose to update URL or continue monitoring original

Notifications:

Can be configured per user

Cloudflare-Protected Sites

Automatic Handling:

PingPuffin automatically handles sites behind Cloudflare. Behaviour depends on the HTTP status code:

5xx (e.g. 503 Service Unavailable): All server errors (500, 502, 503, 504, etc.) are always marked as "Down", trigger notifications, and appear as downtime in the chart, even when the response contains Cloudflare branding. Cloudflare often proxies a 503 from your origin and wraps it in its own error page; we interpret any 5xx as the service being unavailable.

4xx (e.g. 403 Forbidden, 429 Too Many Requests): If the server responds with 403 or 429 and we detect Cloudflare (headers or content), the monitor is marked "Problematic" (warning) instead of "Down". This usually means Cloudflare bot protection or rate limiting – we cannot tell if your site is actually down, only that protection is blocking our check.

Why "Problematic" for 4xx with Cloudflare?

If we get an HTTP response (e.g. 403), the server is reachable. Uptime monitoring is about availability – for 4xx with Cloudflare we assume protection is blocking the check, not that the site itself is down. We therefore show "Problematic" so you know protection is active.
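
The decision described above can be sketched as follows. This is an illustration under assumptions, not PingPuffin's detection logic: the function names are hypothetical, the header check (`CF-RAY` or `Server: cloudflare`) is just one plausible way to detect Cloudflare, and custom expected status codes are ignored.

```python
def looks_like_cloudflare(headers: dict) -> bool:
    """Guess from response headers whether Cloudflare served the response."""
    lowered = {k.lower(): str(v).lower() for k, v in headers.items()}
    return "cf-ray" in lowered or lowered.get("server") == "cloudflare"


def classify_response(status_code: int, cloudflare_detected: bool) -> str:
    """Map a final HTTP status code to 'ok', 'problematic', or 'failure'."""
    if 500 <= status_code <= 599:
        return "failure"       # 5xx always counts, Cloudflare-branded or not
    if status_code in (403, 429) and cloudflare_detected:
        return "problematic"   # protection is probably blocking the check
    if 400 <= status_code <= 499:
        return "failure"       # other 4xx count unless explicitly allowed
    return "ok"
```

A "failure" result here still goes through the 2-check verification before the monitor is shown as Down.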

Optional: Configure Cloudflare for Better Monitoring

If you want to avoid "Problematic" status for your Cloudflare-protected site, you can:

WAF Rules: Create a rule that allows requests from PingPuffin's User-Agent:

Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36

Browser Integrity Check: Consider disabling this for monitoring IPs
Rate Limiting: Adjust rate limiting so monitoring requests aren't blocked

PingPuffin's Monitoring IP: 188.245.198.146 (for whitelisting if needed)

Manual Updates

User-Initiated Checks

You can always trigger an instant check via the "Update now" button in the dashboard.

Behavior:

Bypass 2-check threshold: Result shown and applied immediately
Instant status update: If status changes, it updates immediately
"Last checked" updates: The dashboard shows the new check time right away; in the overview you may see "Updated just now" for the monitor you refreshed
Notification: Sent if status changes
Failure counter updated: Counter updated based on result

Use Cases

✅ Quick verification after deployment of fixes
✅ Testing new monitor configuration
✅ Instant status check without waiting for the next automatic check
✅ Debugging connection problems

Example:

00:00 - Automatic check fails → Failure counter: 1, Status: UP
00:02 - You click "Update now" → Check fails → Status: DOWN instantly
Result: Manual check skips 2-check threshold for quick feedback

Expected HTTP Status Codes

A monitor is considered UP when the server returns an HTTP status code that you have marked as acceptable. By default, PingPuffin treats these codes as success:

Default accepted codes

2xx Success

200 OK
201 Created
202 Accepted
203 Non-Authoritative Information
204 No Content
205 Reset Content
206 Partial Content

3xx Redirects (after following redirects, the final response is evaluated)

301 Moved Permanently
302 Found
303 See Other
304 Not Modified
307 Temporary Redirect
308 Permanent Redirect

Any other status code (e.g. 4xx or 5xx) is treated as a failure unless you explicitly add it to your monitor’s expected status codes.

Custom expected status codes

You can define which codes count as success per monitor. For example:

401 Unauthorized for a login or protected page that correctly returns “auth required”
403 Forbidden if that response is expected for the URL you monitor
404 Not Found only if you intentionally monitor a “not found” page

When you create or edit a monitor, you can set a comma-separated list of expected codes (e.g. 200,201,401). If you leave it at the default, the full list above is used.

Where this applies

All HTTP(S) uptime checks use this rule (automatic checks and manual “Update now”).
Status is determined by comparing the final response code (after redirects) to your expected list.
See Technical Specifications for timeouts, redirect limits, and other request details.
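
As a rough sketch of how a comma-separated list of expected codes could be evaluated (function names are hypothetical; the default set is the list given above):

```python
DEFAULT_CODES = (200, 201, 202, 203, 204, 205, 206,
                 301, 302, 303, 304, 307, 308)


def parse_expected_codes(raw: str) -> frozenset:
    """Parse a list such as "200,201,401"; an empty string means the default."""
    codes = set()
    for part in raw.split(","):
        part = part.strip()
        if not part:
            continue
        code = int(part)  # raises ValueError on non-numeric input
        if not 100 <= code <= 599:
            raise ValueError(f"not an HTTP status code: {code}")
        codes.add(code)
    return frozenset(codes) if codes else frozenset(DEFAULT_CODES)


def is_expected(final_status: int, raw_expected: str = "") -> bool:
    """Compare the final response code (after redirects) to the expected list."""
    return final_status in parse_expected_codes(raw_expected)
```

For example, `is_expected(401, "200,201,401")` is true, while `is_expected(401)` is false because 401 is not in the default list.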

Protection Against Monitor Failures

Internal vs. External Errors

It's critical to distinguish between errors in your site and errors in our monitor system.

Site Errors (Monitored)

These errors from your site count as failures:

✅ HTTP 500 from target site → Counts as failure
✅ Timeout connecting to target site → Counts as failure
✅ DNS error for target domain → Counts as failure

Monitor System Errors (Protected)

These errors in PingPuffin's code do NOT mark your site as down:

❌ PHP exception in PingPuffin code → Does NOT mark site as down
❌ Database connection error → Does NOT mark site as down
❌ Internal logic error → Does NOT mark site as down

Administrator Alerting

When the monitor system fails:

Logging:

Critical errors logged with full stack trace
Timestamp and monitor ID recorded
All details saved for debugging

Email Alarm:

Email sent to system administrator
Contains error message, stack trace, and context
Rate-limited: Maximum 1 email per hour per unique error
Prevents email flooding during system problems

Database:

Status remains unchanged (no false downs)
No incident created
Users not affected

Example:

Monitor checker runs
→ Internal error detected in monitor system
→ Error logged with full context
→ Email sent to system administrator
→ Database NOT updated
→ Your site status remains unchanged
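
The separation between site errors and monitor errors, together with the one-email-per-hour limit, could look roughly like this. It is a sketch in Python for illustration only; `SiteCheckFailed`, `run_check_safely`, and the callback parameters are hypothetical names, not part of PingPuffin.

```python
import time
import traceback


class SiteCheckFailed(Exception):
    """The monitored site failed the check (HTTP error, timeout, DNS, ...)."""


ALERT_INTERVAL_S = 3600   # at most one email per hour per unique error
_last_alert = {}          # error fingerprint -> time of the last alert


def should_alert(fingerprint: str, now: float) -> bool:
    last = _last_alert.get(fingerprint)
    if last is not None and now - last < ALERT_INTERVAL_S:
        return False
    _last_alert[fingerprint] = now
    return True


def run_check_safely(monitor_id, check, record_failure, record_success,
                     alert_admin, now=None):
    now = time.time() if now is None else now
    try:
        check()
    except SiteCheckFailed as exc:
        record_failure(monitor_id, str(exc))  # counts toward the 2-check rule
    except Exception as exc:                  # error inside the monitor itself
        fingerprint = f"{type(exc).__name__}:{exc}"
        if should_alert(fingerprint, now):
            alert_admin(monitor_id, traceback.format_exc())
        # Deliberately no status update: the site is not marked as down.
    else:
        record_success(monitor_id)
```

The key design choice is that the generic `except` branch never touches the monitor's status or failure counter.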

Notification System

When Notifications Are Sent

Downtime (DOWN): After 2 consecutive failures are confirmed (~3–6 min after the outage begins)
Recovery (UP): Instantly when your site comes back from DOWN
Warning and redirect: Configurable per user
Slow response: If you’ve enabled it in Settings – two consecutive slow checks result in DOWN status and a notification
SSL certificate expiring soon: You’re notified 8 days before and 2 days before expiry (same channels as other notifications)
Manual update: Instant notification if status changes when you click “Update now”

Notification Channels

You can enable one or more channels under Integrations (Apps). All enabled channels receive the same events (downtime, recovery, SSL expiry, slow response, etc.).

Email

Direct email notifications
Contains: site name, status, error message, timestamp, and link to dashboard

Slack

Message to your configured channel or DM
Formatted with colors (red = down, green = up) and link to monitor

SMS

Short message to your mobile number for downtime, recovery, SSL expiry, and slow response
Requires adding and enabling SMS under Integrations

WhatsApp

Message to your WhatsApp number for downtime, recovery, SSL expiry, and slow response
Requires adding and enabling WhatsApp under Integrations

Webhook

POST request to your own endpoint with JSON (status, response time, error message, etc.)
Useful for integrating with other systems
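
On the receiving side, a webhook consumer might parse the JSON body along these lines. The exact payload schema is not specified here, so the key names `status`, `response_time`, and `error_message` are assumptions; inspect a real payload before relying on them.

```python
import json


def summarize_webhook(body: bytes) -> str:
    """Turn a webhook POST body into a one-line log message.

    The key names used below are assumptions, not a documented schema.
    """
    payload = json.loads(body)
    parts = [f"status={payload.get('status', 'unknown')}"]
    response_time = payload.get("response_time")
    if response_time is not None:
        parts.append(f"response_time={response_time}")
    error = payload.get("error_message")
    if error:
        parts.append(f"error={error}")
    return " ".join(parts)
```

Reading fields with `.get` keeps the consumer tolerant of missing or additional keys.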

Notification Snoozing

You can temporarily disable notifications (snooze) for 24 hours:

During Snooze:

✅ Monitoring continues normally
✅ Status updates in dashboard
❌ No notifications sent to any channel
⏰ Automatically un-snoozed after 24 hours

Use Cases:

Planned maintenance
Known issue during rollout
Temporary shutdown
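
The snooze rule amounts to a simple time comparison. A minimal sketch, with a hypothetical function name and assuming the snooze start time is stored per monitor:

```python
from datetime import datetime, timedelta
from typing import Optional

SNOOZE_DURATION = timedelta(hours=24)


def notifications_enabled(snoozed_at: Optional[datetime], now: datetime) -> bool:
    """Return True if notifications should be sent at `now`."""
    if snoozed_at is None:
        return True
    # The snooze lapses automatically once 24 hours have passed.
    return now >= snoozed_at + SNOOZE_DURATION
```

Checks and status updates are unaffected; only the notification step consults this flag.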

Public Status Pages

For each monitor (or group) you can show a public status page that anyone can open without logging in. The page shows current status (up/down/warning) and history.

Customization: You can set a display name and logo (URL) for the group so the status page matches your brand. Changes apply to both the dashboard and the public page.
Language: The status page can be shown in Danish, English, or Spanish (based on the visitor’s settings).
Share the link: You get a stable link to the status page to share with users or embed on your own site.

Automatic Dashboard Updates

Auto-Refresh Mechanism

Your dashboard updates automatically every 30 seconds with the latest data and does not trigger new checks – you always see the latest status without reloading the page.

What Gets Updated?

The dashboard always shows the latest stored data from the most recent check:

🎨 Status indicator: Colored badge (green/red/yellow/blue)
⏰ Last checked: Precise timestamp of last check
⚡ Response time: Response time in milliseconds
🔢 Failure counter: Number of consecutive failures
📊 Incident info: Active incidents and duration

Important: Auto-refresh respects the 2-check logic because it only displays stored check results – it does not trigger new checks.

Data Collection and Storage

Check Records

Every single check is saved in the database with the following information:

Unique ID for check
Reference to monitor
Timestamp (when check was performed)
HTTP status code or error type
Response time in milliseconds
Success/failure status
Error message (if relevant)
Redirect information (if relevant)
SSL error details (if relevant)

Usage:

Uptime percentage calculations
Historical graphs and reports
Error analysis and debugging
Performance tracking over time
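
One plausible way to derive an uptime percentage from stored check records is the share of successful checks. The exact formula PingPuffin uses is not stated here, so treat this as an assumption; the function name is hypothetical.

```python
from typing import Optional, Sequence


def uptime_percentage(results: Sequence[bool]) -> Optional[float]:
    """Percentage of successful checks, or None when there are no checks."""
    if not results:
        return None
    return 100.0 * sum(results) / len(results)
```

For instance, 19 successful checks out of 20 give 95.0.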

Incident Tracking

When status changes to DOWN, an incident is created with the following data:

Unique ID for incident
Reference to monitor
Start timestamp
End timestamp (when resolved)
Total duration

Features:

Automatic creation on DOWN status
Continuous duration calculation
Automatic resolution on recovery
Complete history maintained
Exportable to CSV

Activity Log

All significant events are logged:

✅ Status changes (UP → DOWN, DOWN → UP, etc.)
✅ Manual checks performed by users
✅ Configuration changes
✅ URL updates
✅ Metadata updates

Functionality:

Exportable to CSV format
Searchable and filterable
Shows details for each event
Timestamps on all entries

Technical Specifications

HTTP Request Parameters

When PingPuffin checks your site, the following request is sent:

User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36
Accept: */*
Connection: close
[Optional: Authorization header for Basic Auth]

Note: We use a standard Chrome user agent to avoid being blocked by sites that filter custom user agents.

Configuration:

Timeout: Can be configured per monitor (default: 30 seconds)
Follow Redirects: Yes, up to 5 redirects maximum
SSL Verification: Enabled (validates certificates)
Connection Reuse: Disabled (fresh connection for each check)
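
To see how your own server or firewall responds to a request shaped like the one above, you could reproduce it with Python's standard library. This is a sketch for local testing, not PingPuffin's client: `build_request`, `probe`, and `LimitedRedirects` are hypothetical names.

```python
import base64
import urllib.request
from typing import Optional

USER_AGENT = ("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
              "AppleWebKit/537.36 (KHTML, like Gecko) "
              "Chrome/120.0.0.0 Safari/537.36")


class LimitedRedirects(urllib.request.HTTPRedirectHandler):
    max_redirections = 5  # follow at most 5 redirects


def build_request(url: str, username: Optional[str] = None,
                  password: Optional[str] = None) -> urllib.request.Request:
    """Build a GET request with the headers listed above."""
    headers = {"User-Agent": USER_AGENT, "Accept": "*/*", "Connection": "close"}
    if username is not None and password is not None:
        token = base64.b64encode(f"{username}:{password}".encode()).decode()
        headers["Authorization"] = f"Basic {token}"
    return urllib.request.Request(url, headers=headers, method="GET")


def probe(url: str, timeout_s: float = 30.0) -> int:
    """Send the request and return the final status code.

    urllib raises urllib.error.HTTPError for 4xx/5xx responses.
    """
    opener = urllib.request.build_opener(LimitedRedirects)
    with opener.open(build_request(url), timeout=timeout_s) as response:
        return response.status
```

Calling `probe("https://example.com/")` from a machine outside your network gives a quick impression of what a check would see, although the source IP will differ from PingPuffin's.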

Expected HTTP Status Codes

Monitors consider a check successful when the response status is in that monitor’s expected status codes list. By default we accept a broad set of 2xx and 3xx codes. For the full list and how to customize it, see Expected HTTP Status Codes.

Response Time Measurement

What's Measured:

DNS lookup time
Connection time (TCP handshake)
SSL handshake time (if HTTPS)
Time to first byte (TTFB)

What's NOT Measured:

Body download time (we only read headers)
JavaScript execution time
Asset loading time

Storage:

Measured in milliseconds
Saved at each check
Used for performance tracking
Shown in dashboard

Advanced Monitoring Settings

For advanced users, we offer:

HTTP Method

GET: Standard method
POST: For endpoints that require POST

Request Body

Send JSON or form data with POST requests
Useful for API endpoints that require specific data

Basic Authentication

Username and password for protected endpoints
Passwords encrypted with AES-256-CBC
Never stored in plain text

Future Features

Custom headers
Request params
Advanced authentication (OAuth, Bearer tokens)

Server Information

Public IP Address

PingPuffin's monitoring server uses the following public IP address to perform checks:

IP Address: 188.245.198.146

If you need to whitelist PingPuffin's IP address in your firewall or server configuration, you can use this IP address.

Note: The IP address may change during server migrations or infrastructure updates. We recommend using the User-Agent header for identification instead of IP-based rules, if possible.

User-Agent Identification:

Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36

Privacy and Security

Data Encryption

Sensitive Credentials:

All passwords (Basic Auth) encrypted on storage
Algorithm: AES-256-CBC (industry standard)
Key: Stored securely in environment variables
IV: Unique initialization vector per encryption
Decryption only happens in memory during checks
Never exposed in logs or API responses

Data Access

Access Control:

Check results only visible to monitor owner
No data shared with third parties
Activity log only exportable by owner
Secure API endpoints with authentication

Database Security:

Prepared statements (prevents SQL injection)
Session-based authentication
CSRF protection on all forms

HTTPS Enforcement

SSL/TLS:

All monitor checks support HTTPS
SSL certificate validation enabled
Warns on certificate problems
Detects expired certificates

Dashboard:

Always accessed via HTTPS
Secure cookies (HttpOnly, Secure flags)
HSTS headers recommended

System Reliability

Monitor System Health

Our monitor system monitors itself:

Error Detection:

Automatic detection of internal errors
Full logging of all exceptions
Stack traces for debugging

Administrator Alerts:

Critical errors emailed to system administrator
Rate-limited to avoid spam
Details included for quick resolution

Automatic Recovery:

The system continues checking other monitors even if one fails
No cascade failures across monitors
Database transactions ensure data integrity

Uptime Goals

We aim for the following reliability:

Monitor System Uptime:

Goal: 99.9% (less than 9 hours downtime per year)
Monitored: Via internal logs and system metrics

Check Execution Rate:

Goal: 99.5% success rate for check execution
Monitored: Error rate in logs
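
The "less than 9 hours" figure for the 99.9% goal follows from simple arithmetic, shown here as a small helper (the function name is hypothetical):

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def downtime_budget_hours(uptime_goal_percent: float) -> float:
    """Hours per year a service may be down while still meeting the goal."""
    return HOURS_PER_YEAR * (100.0 - uptime_goal_percent) / 100.0
```

A 99.9% goal leaves 0.1% of 8760 hours, which is about 8.76 hours per year.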

Check reliability: Monitoring runs continuously. If a technical issue occurs, the administrator is alerted so it can be resolved quickly.

Common Scenarios and Examples

Scenario 1: Transient Network Blip

Situation: Brief network problem, site is actually up.

00:00 - Check fails (timeout) → Failure counter: 1, Status: UP
00:03 - Check succeeds → Failure counter: 0, Status: UP

Result:
✅ No notification sent
✅ No status change
✅ No false alarm

Why it works: The 2-check threshold filters out brief problems.

Scenario 2: Real Downtime

Situation: Server is really down (e.g., hosting problem).

00:00 - Check fails (HTTP 500) → Failure counter: 1, Status: UP
      → System logs first failure for internal monitoring

00:03 - Check fails (timeout) → Failure counter: 2, Status: DOWN
      → Incident created automatically
      → Notification sent to all configured channels

00:06 - Check fails (timeout) → Failure counter: 3, Status: DOWN
      → Incident duration updated continuously

Result:
✅ DOWN status confirmed at 00:03
✅ Notification sent ~3 minutes after first failure
✅ Different error types (500 + timeout) both count

Why it works: Two consecutive failures confirm real problem.

Scenario 3: Quick Recovery

Situation: Site down, comes back quickly.

00:00 - Status is DOWN (from previous failure)
00:03 - Check succeeds (HTTP 200) → Failure counter: 0, Status: UP
      → Incident marked as resolved automatically
      → Recovery notification sent

Result:
✅ Instant recovery on first successful check
✅ Incident duration calculated (00:00 to 00:03 = 3 min)
✅ You're informed quickly about recovery

Why it works: No 2-check threshold for recovery.

Scenario 4: Manual Refresh During First Failure

Situation: Automatic check failed once, user wants to verify.

00:00 - Automatic check fails → Failure counter: 1, Status: UP
      → Status remains UP (waiting for confirmation)

00:02 - You click "Update now" → Check fails → Status: DOWN immediately
      → Manual check bypasses 2-check threshold for quick feedback
      → Notification sent

Result:
✅ Manual check gives instant feedback
✅ Status updates without waiting for next automatic check
✅ Useful for debugging and verification

Why it works: Manual checks are designed for instant feedback.

Scenario 5: Monitor System Error

Situation: Internal error in PingPuffin's own code.

00:00 - Monitor system runs automatic check
      → Internal error detected
      → Error caught automatically

      Logging:
      → Error logged with full context for internal monitoring
      → Full technical information saved for debugging

      Administrator Alarm:
      → Email sent to system administrator (max 1 per hour)
      → Contains error details and context

      Database:
      → No update to your site status
      → Your site status remains unchanged
      → Failure counter not affected

Result:
✅ Monitor error does NOT affect your site status
✅ Administrator alerted to fix problem
✅ No false DOWN status

Why it works: Distinction between monitor errors and site errors.

Scenario 6: Different Error Types Consecutively

Situation: Server unstable, different errors each time.

00:00 - Check fails (HTTP 500) → Failure counter: 1, Status: UP
00:03 - Check fails (Connection timeout) → Failure counter: 2, Status: DOWN
00:06 - Check fails (HTTP 503) → Failure counter: 3, Status: DOWN

Result:
✅ Different error types ALL count
✅ Status DOWN after 2 failures (regardless of type)
✅ Indicates unstable server (maybe worse than one consistent error)

Why it works: Any error means site not functioning correctly.

Frequently Asked Questions

How quickly do I get notified of downtime?

Automatic check: ~3-6 minutes after the outage begins (requires 2 consecutive failures).
Manual check: Instantly if you update manually.

Can I get false alarms?

Very rarely. The 2-check system eliminates most brief problems. If you get an alarm, there's almost always a real problem.

What if my server is temporarily slow?

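If the response time exceeds the timeout (default 30 seconds), the check counts as a failure. You can increase the timeout value for your monitor. If you have enabled slow response alerts in Settings, checks slower than your threshold (5–15 seconds) also count as failures.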

How is planned maintenance handled?

Use the "Snooze" function to disable notifications for 24 hours. Monitoring continues, but you get no alarms.

Can I see history for all checks?

Yes. Every check is stored as a check record and feeds the historical graphs and reports. The activity log shows status changes and manual checks, and you can export it to CSV.

What happens if PingPuffin itself goes down?

Our monitors run on reliable infrastructure. On critical system errors, the administrator is alerted, and your site is NOT marked as down.

Contact & Support

Have questions about how monitoring works?

📧 Email: support@pingpuffin.com
🐛 Bug reports: Via email
📚 Documentation: See documentation section for more information

Changelog

v1.5 (February 20, 2026)

SMS and WhatsApp added as notification channels (under Integrations)
SSL certificate alerts: notification 8 days and 2 days before certificate expires (all channels)
Slow response alerts: option to be notified when the site responds too slowly (Settings, 5–15 sec)
Public status pages: described customization with display name and logo, and language support
Technical implementation details removed from user documentation (focus on what you see and can configure)

v1.3 (February 14, 2026)

Version and documentation date update

v1.2 (February 9, 2026)

New section: Expected HTTP Status Codes with full default list (200, 201, 202, 203, 204, 205, 206, 301, 302, 303, 304, 307, 308) and customization
Manual “Update now”: dashboard “Last checked” and overview “Updated just now” feedback documented

v1.1 (January 17, 2026)

Checks now run in parallel (faster throughput)
Check interval reduced from 5 to 3 minutes
Each category can check up to 200 monitors per run
Documentation updated with parallel processing details

v1.0 (November 21, 2024)

First version of documentation
2-check verification system implemented
Monitor failure protection added
Rate-limited administrator alerts

This document is updated continuously. Check "Last updated" at the top to see if there are new versions.
