How PingPuffin's Monitoring System Works

Version: 1.5
Last updated: February 20, 2026


Table of Contents

  1. Overview
  2. Check Intervals and Frequency
  3. Error Detection
  4. Recovery and Status Changes
  5. Manual Updates
  6. Expected HTTP Status Codes
  7. Protection Against Monitor Failures
  8. Notification System
  9. Public Status Pages
  10. Automatic Dashboard Updates
  11. Data Collection and Storage
  12. Technical Specifications
  13. Privacy and Security
  14. System Reliability
  15. Common Scenarios and Examples

Overview

PingPuffin monitors HTTP and HTTPS endpoints 24/7 to ensure your websites are available. Our system is built with a focus on reliability, precision, and transparency.

Key Features

  • Automatic checks every 3 minutes – all monitors are checked on a regular schedule
  • 2-check verification to avoid false alarms
  • 24/7/365 monitoring without breaks
  • Notifications via email, Slack, SMS, WhatsApp, and webhook (downtime, recovery, warnings, SSL expiry, slow response)
  • SSL certificate alerts – notified 8 days and 2 days before your certificate expires
  • Slow response alerts – optional alert when your site responds too slowly (Settings)
  • Public status pages – share a status page with your own name and logo
  • Instant recovery when your site is back
  • Manual updates for quick verification
  • Protection against internal errors in the monitor system

Why Transparency Matters

We believe in openness about how our monitoring works. This document explains exactly how we detect downtime, how we avoid false alarms, and how we ensure you get notified as quickly as possible when there's a real problem.


Check Intervals and Frequency

Automatic Checks

All active monitors are checked automatically every 3 minutes. Checks run in parallel so all your monitors are checked quickly and evenly.

  • Checks run at fixed intervals around the clock
  • No breaks, no weekends, no holidays
  • You don’t need to do anything – monitoring runs automatically

Manual Updates

You can always trigger an instant check via the "Update now" button in your dashboard:

  • Result shown immediately
  • Bypasses 2-check threshold for quick feedback
  • Useful after deployments or configuration changes

Coverage

  • Availability: 24/7/365
  • Parallel checks: All monitors checked simultaneously
  • Timeout: Standard 30 seconds (configurable)
  • Maximum redirects: Up to 5 redirects are followed

Error Detection

2-Check Verification System

To avoid false alarms, we require 2 consecutive failures before marking a site as down.

How It Works

First Failure (00:00):

  • Failure counter set to 1
  • Status remains unchanged (e.g., UP)
  • No notification sent
  • System logs failure for internal monitoring

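Second Failure (00:03, the next scheduled check):

  • Failure counter updated to 2
  • Status changes to DOWN
  • Incident created automatically
  • Notification sent to all configured channels

Total time: about 3 minutes from the first failed check to DOWN status, or roughly 3–6 minutes from the moment the outage begins, depending on where it falls in the 3-minute check cycle.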
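
The two-check rule can be sketched as a small state machine. This is an illustration only, not PingPuffin's actual implementation; the names MonitorState, apply_automatic_check, and FAILURE_THRESHOLD are assumptions made for the example, and only the UP and DOWN states are modelled.

```python
from dataclasses import dataclass

FAILURE_THRESHOLD = 2  # consecutive failures required before DOWN


@dataclass
class MonitorState:
    status: str = "UP"      # only "UP" and "DOWN" in this simplified sketch
    failure_count: int = 0  # number of consecutive failed checks


def apply_automatic_check(state: MonitorState, check_ok: bool) -> list[str]:
    """Update the state for one automatic check and return the triggered events."""
    events: list[str] = []
    if check_ok:
        # A single success resets the counter; recovery needs no confirmation.
        state.failure_count = 0
        if state.status == "DOWN":
            state.status = "UP"
            events.append("recovery_notification")
        return events

    state.failure_count += 1
    if state.failure_count >= FAILURE_THRESHOLD and state.status != "DOWN":
        state.status = "DOWN"
        events += ["incident_created", "down_notification"]
    return events
```

Calling apply_automatic_check twice with check_ok=False on a fresh MonitorState leaves the status at UP after the first call and switches it to DOWN on the second, which mirrors the timeline above.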

Why 2 Checks?

Transient network problems (DNS blips, brief timeouts, temporary server errors) occur even on stable sites. By requiring 2 failures:

  • ✅ We eliminate false alarms from brief problems
  • ✅ We confirm there's a real problem
  • ✅ We improve user trust in notifications

What Counts as a Failure?

The following situations are marked as failures:

HTTP Error Codes

  • 4xx Client Errors: 400, 403, 404, 405, etc. (unless explicitly allowed)
  • 5xx Server Errors: 500, 502, 503, 504, etc.

Network Errors

  • Timeout: No response within timeout period (default: 30 seconds)
  • Connection Refused: Server actively rejects connections
  • DNS Failure: Cannot resolve domain name
  • Network Unreachable: Host not available on network

SSL/TLS Errors

  • Invalid Certificate: Certificate is invalid
  • Expired Certificate: Certificate has expired
  • Untrusted Certificate: Certificate not from trusted CA
  • Hostname Mismatch: Certificate hostname doesn't match URL

Slow Response Time

  • Response time over threshold: If you have enabled “Notify on slow response time” (Settings), a check counts as a failure when response time exceeds your threshold (5–15 seconds). The same 2-check logic applies: two consecutive slow checks result in down status and a notification.

Handling Different Error Types

Important: ALL error types count toward the failure counter.

Example:

Check 1: HTTP 500 → Failure counter: 1, Status: UP (waiting for confirmation)
Check 2: Timeout  → Failure counter: 2, Status: DOWN (confirmed failure)

Rationale:

  • If a server switches between different error types, it indicates instability
  • A change in error type does not make the problem any less serious
  • Any failure means the site is not functioning correctly

Recovery and Status Changes

Instant Recovery

When your site comes back, we react immediately – no 2-check threshold for recovery.

Recovery flow:

Status: DOWN
Check 1: Site responds with HTTP 200 → Failure counter reset, Status: UP
Result: Instant recovery, notification sent

Why instant recovery?

  • ✅ Users want quick feedback when their site is back
  • ✅ No reason to wait for confirmation that something works
  • ✅ Best practice in monitoring
  • ✅ Reduces worry and waiting time

Status States

🟢 UP (Online)

Meaning:

  • Site responds with an expected HTTP status code (default: 2xx and common 3xx)
  • Response time within acceptable range
  • No errors detected

Notifications:

  • Sent on recovery from DOWN status

🔴 DOWN (Offline)

Meaning:

  • Site failed 2+ consecutive checks
  • Incident created and tracked
  • All configured notification channels alerted

Duration:

  • Recorded from first DOWN check to recovery
  • Shown in incident log with precise duration

🟡 WARNING (Warning)

Meaning:

  • Site responds but with warnings
  • Examples: Slow response time, Cloudflare challenge detected
  • Monitoring continues normally

Notifications:

  • Can be configured per user

🔵 REDIRECT (Redirect)

Meaning:

  • Permanent redirect (301) detected to another URL
  • Site is functional but URL has changed
  • You can choose to update URL or continue monitoring original

Notifications:

  • Can be configured per user

Cloudflare-Protected Sites

Automatic Handling:

PingPuffin automatically handles sites behind Cloudflare. Behaviour depends on the HTTP status code:

5xx (e.g. 503 Service Unavailable): All server errors (500, 502, 503, 504, etc.) are always marked as "Down" and trigger notifications and downtime in the chart – even when the response contains Cloudflare branding. Cloudflare often proxies 503 from your origin and wraps it in its own error page; we interpret 5xx as the service being unavailable.

4xx (e.g. 403 Forbidden, 429 Too Many Requests): If the server responds with 403 or 429 and we detect Cloudflare (headers or content), the monitor is marked "Problematic" (warning) instead of "Down". This usually means Cloudflare bot protection or rate limiting – we cannot tell if your site is actually down, only that protection is blocking our check.

Why "Problematic" for 4xx with Cloudflare?

If we get an HTTP response (e.g. 403), the server is reachable. Uptime monitoring is about availability – for 4xx with Cloudflare we assume protection is blocking the check, not that the site itself is down. We therefore show "Problematic" so you know protection is active.
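
The Cloudflare rules above amount to a small decision function. The sketch below is an assumption-laden illustration, not PingPuffin's code: the function name, the cloudflare_detected flag, and the result labels "ok", "problematic", and "failure" are hypothetical, and custom expected status codes are ignored for simplicity.

```python
def classify_response(status_code: int, cloudflare_detected: bool) -> str:
    """Map an HTTP status code to a check result, following the rules above."""
    if 500 <= status_code <= 599:
        # 5xx always counts as a failure, even with Cloudflare branding.
        return "failure"
    if status_code in (403, 429) and cloudflare_detected:
        # Likely bot protection or rate limiting, not a confirmed outage.
        return "problematic"
    if 400 <= status_code <= 499:
        # Other 4xx responses fail unless listed as expected (not modelled here).
        return "failure"
    return "ok"
```

A "failure" result here still goes through the 2-check verification before the monitor is shown as Down.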

Optional: Configure Cloudflare for Better Monitoring

If you want to avoid "Problematic" status for your Cloudflare-protected site, you can:

  1. WAF Rules: Create a rule that allows requests from PingPuffin's User-Agent:

    Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36

  2. Browser Integrity Check: Consider disabling this for monitoring IPs

  3. Rate Limiting: Adjust rate limiting so monitoring requests aren't blocked

PingPuffin's Monitoring IP: 188.245.198.146 (for whitelisting if needed)


Manual Updates

User-Initiated Checks

You can always trigger an instant check via the "Update now" button in the dashboard.

Behavior:

  • Bypass 2-check threshold: Result shown and applied immediately
  • Instant status update: If status changes, it updates immediately
  • "Last checked" updates: The dashboard shows the new check time right away; in the overview you may see "Updated just now" for the monitor you refreshed
  • Notification: Sent if status changes
  • Failure counter updated: Counter updated based on result

Use Cases

  • ✅ Quick verification after deployment of fixes
  • ✅ Testing new monitor configuration
  • ✅ Instant status check without waiting for the next automatic check
  • ✅ Debugging connection problems

Example:

00:00 - Automatic check fails → Failure counter: 1, Status: UP
00:02 - You click "Update now" → Check fails → Status: DOWN instantly
Result: Manual check skips 2-check threshold for quick feedback

Expected HTTP Status Codes

A monitor is considered UP when the server returns an HTTP status code that you have marked as acceptable. By default, PingPuffin treats these codes as success:

Default Accepted Codes

2xx Success

  • 200 OK
  • 201 Created
  • 202 Accepted
  • 203 Non-Authoritative Information
  • 204 No Content
  • 205 Reset Content
  • 206 Partial Content

3xx Redirects (after following redirects, the final response is evaluated)

  • 301 Moved Permanently
  • 302 Found
  • 303 See Other
  • 304 Not Modified
  • 307 Temporary Redirect
  • 308 Permanent Redirect

Any other status code (e.g. 4xx or 5xx) is treated as a failure unless you explicitly add it to your monitor’s expected status codes.

Custom Expected Status Codes

You can define which codes count as success per monitor. For example:

  • 401 Unauthorized for a login or protected page that correctly returns “auth required”
  • 403 Forbidden if that response is expected for the URL you monitor
  • 404 Not Found only if you intentionally monitor a “not found” page

When you create or edit a monitor, you can set a comma-separated list of expected codes (e.g. 200,201,401). If you leave it at the default, the full list above is used.

Where This Applies

  • All HTTP(S) uptime checks use this rule (automatic checks and manual “Update now”).
  • Status is determined by comparing the final response code (after redirects) to your expected list.
  • See Technical Specifications for timeouts, redirect limits, and other request details.
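
As a rough sketch of how a comma-separated expected-codes list could be evaluated, consider the following. The function names are hypothetical and the parsing rules (ignoring blank entries, falling back to the default list when empty) are assumptions; only the default code list itself comes from this document.

```python
DEFAULT_EXPECTED = frozenset({200, 201, 202, 203, 204, 205, 206,
                              301, 302, 303, 304, 307, 308})


def parse_expected_codes(raw: str) -> frozenset[int]:
    """Parse a list such as "200,201,401"; an empty value means the default list."""
    # int() tolerates surrounding spaces and raises ValueError on non-numeric input.
    codes = {int(part) for part in raw.split(",") if part.strip()}
    return frozenset(codes) if codes else DEFAULT_EXPECTED


def is_success(final_status: int, expected: frozenset[int]) -> bool:
    """Compare the final status code (after redirects) with the expected list."""
    return final_status in expected
```

For example, a monitor configured with "200,401" would treat a 401 from a protected page as UP, while the same 401 would fail under the default list.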

Protection Against Monitor Failures

Internal vs. External Errors

It's critical to distinguish between errors in your site and errors in our monitor system.

Site Errors (Monitored)

These errors from your site count as failures:

  • ✅ HTTP 500 from target site → Counts as failure
  • ✅ Timeout connecting to target site → Counts as failure
  • ✅ DNS error for target domain → Counts as failure

Monitor System Errors (Protected)

These errors in PingPuffin's code do NOT mark your site as down:

  • ❌ PHP exception in PingPuffin code → Does NOT mark site as down
  • ❌ Database connection error → Does NOT mark site as down
  • ❌ Internal logic error → Does NOT mark site as down

Administrator Alerting

When the monitor system fails:

Logging:

  • Critical errors logged with full stack trace
  • Timestamp and monitor ID recorded
  • All details saved for debugging

Email Alarm:

  • Email sent to system administrator
  • Contains error message, stack trace, and context
  • Rate-limited: Maximum 1 email per hour per unique error
  • Prevents email flooding during system problems
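
One way to picture the "1 email per hour per unique error" rule is a limiter keyed on the error. This is a hedged sketch: the class name is invented, and treating two errors as identical when their messages match is an assumption, since this document does not say how uniqueness is determined.

```python
import hashlib

ALERT_INTERVAL_SECONDS = 3600  # at most one email per hour per unique error


class AlertRateLimiter:
    def __init__(self) -> None:
        # Maps an error fingerprint to the time the last alert was sent.
        self._last_sent: dict[str, float] = {}

    def should_send(self, error_message: str, now: float) -> bool:
        """Return True if an alert for this error may be sent at time `now`."""
        key = hashlib.sha256(error_message.encode("utf-8")).hexdigest()
        last = self._last_sent.get(key)
        if last is not None and now - last < ALERT_INTERVAL_SECONDS:
            return False  # same error alerted less than an hour ago
        self._last_sent[key] = now
        return True
```

Passing the current time in as a parameter (for example from time.time()) keeps the limiter easy to test.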

Database:

  • Status remains unchanged (no false downs)
  • No incident created
  • Users not affected

Example:

Monitor checker runs
→ Internal error detected in monitor system
→ Error logged with full context
→ Email sent to system administrator
→ Database NOT updated
→ Your site status remains unchanged
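
The separation between site errors and monitor errors can be sketched with two exception paths. Everything named here (SiteCheckFailure, run_check_safely, the result strings) is hypothetical and only illustrates the idea; it is written in Python rather than the language PingPuffin itself uses.

```python
class SiteCheckFailure(Exception):
    """The target site failed the check (HTTP error, timeout, DNS error, ...)."""


def run_check_safely(perform_check, on_internal_error) -> str:
    """Run one check without blaming the site for errors in the monitor itself."""
    try:
        perform_check()
        return "success"
    except SiteCheckFailure:
        # A genuine site problem: this counts toward the failure counter.
        return "site_failure"
    except Exception as exc:
        # Anything else is treated as a monitor-side error: log and alert,
        # but leave the site status and failure counter untouched.
        on_internal_error(exc)
        return "skipped"
```

Here on_internal_error stands in for the logging and administrator email described above.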

Notification System

When Notifications Are Sent

  • Downtime (DOWN): After 2 consecutive failures are confirmed (about 3 minutes after the first failed check, or ~3–6 minutes after the outage begins)
  • Recovery (UP): Instantly when your site comes back from DOWN
  • Warning and redirect: Notifications for WARNING and REDIRECT status are configurable per user
  • Slow response: If you’ve enabled it in Settings – two consecutive slow checks result in DOWN status and a notification
  • SSL certificate expiring soon: You’re notified 8 days before and 2 days before expiry (same channels as other notifications)
  • Manual update: Instant notification if status changes when you click “Update now”
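
The SSL expiry schedule in the list above (8 days and 2 days before expiry) reduces to a date comparison. A minimal sketch, assuming the alert fires on the calendar day that is exactly 8 or 2 days before expiry; the function name and that exact-day rule are assumptions.

```python
from datetime import date

SSL_ALERT_DAYS = (8, 2)  # days before expiry, as stated in this document


def ssl_alert_due(expires_on: date, today: date) -> bool:
    """Return True when `today` is exactly 8 or 2 days before the expiry date."""
    return (expires_on - today).days in SSL_ALERT_DAYS
```

For a certificate expiring on March 10, this sketch would flag March 2 and March 8.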

Notification Channels

You can enable one or more channels under Integrations (Apps). All enabled channels receive the same events (downtime, recovery, SSL expiry, slow response, etc.).

Email

  • Direct email notifications
  • Contains: site name, status, error message, timestamp, and link to dashboard

Slack

  • Message to your configured channel or DM
  • Formatted with colors (red = down, green = up) and link to monitor

SMS

  • Short message to your mobile number for downtime, recovery, SSL expiry, and slow response
  • Requires adding and enabling SMS under Integrations

WhatsApp

  • Message to your WhatsApp number for downtime, recovery, SSL expiry, and slow response
  • Requires adding and enabling WhatsApp under Integrations

Webhook

  • POST request to your own endpoint with JSON (status, response time, error message, etc.)
  • Useful for integrating with other systems
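
On the receiving side, a webhook handler mainly needs to parse the JSON body. The sketch below is illustrative only: the key names status, response_time_ms, and error_message are assumptions, not the documented payload schema, so check a real payload before relying on them.

```python
import json


def summarize_webhook(body: bytes) -> str:
    """Turn a webhook JSON body into a one-line summary for logs or chat."""
    payload = json.loads(body)
    status = str(payload.get("status", "unknown")).upper()  # assumed key
    response_time = payload.get("response_time_ms")         # assumed key
    error = payload.get("error_message")                    # assumed key

    parts = [f"status={status}"]
    if response_time is not None:
        parts.append(f"response_time={response_time}ms")
    if error:
        parts.append(f"error={error}")
    return " ".join(parts)
```

Using .get() with defaults keeps the handler tolerant of fields that are missing or added later.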

Notification Snoozing

You can temporarily disable notifications (snooze) for 24 hours:

During Snooze:

  • ✅ Monitoring continues normally
  • ✅ Status updates in dashboard
  • ❌ No notifications sent to any channel
  • ⏰ Automatically un-snoozed after 24 hours
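
The snooze window can be expressed as a simple time comparison. This is a sketch under the assumption that the snooze start time is stored per user or monitor; the function and parameter names are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional

SNOOZE_DURATION = timedelta(hours=24)


def notifications_snoozed(snoozed_at: Optional[datetime], now: datetime) -> bool:
    """Return True while the 24-hour snooze window is still active."""
    if snoozed_at is None:
        return False  # snooze was never activated
    return now - snoozed_at < SNOOZE_DURATION
```

Checks and status updates would run regardless of this function's result; only the notification step consults it.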

Use Cases:

  • Planned maintenance
  • Known issue during rollout
  • Temporary shutdown

Public Status Pages

For each monitor (or group) you can show a public status page that anyone can open without logging in. The page shows current status (up/down/warning) and history.

  • Customization: You can set a display name and logo (URL) for the group so the status page matches your brand. Changes apply to both the dashboard and the public page.
  • Language: The status page can be shown in Danish, English, or Spanish (based on the visitor’s settings).
  • Share the link: You get a stable link to the status page to share with users or embed on your own site.

Automatic Dashboard Updates

Auto-Refresh Mechanism

Your dashboard updates automatically every 30 seconds with the latest data and does not trigger new checks – you always see the latest status without reloading the page.

What Gets Updated?

The dashboard always shows the most recently stored check result:

  • 🎨 Status indicator: Colored badge (green/red/yellow/blue)
  • Last checked: Precise timestamp of last check
  • Response time: Response time in milliseconds
  • 🔢 Failure counter: Number of consecutive failures
  • 📊 Incident info: Active incidents and duration

Important: Auto-refresh respects the 2-check logic because it only displays stored check results – it never triggers new checks.


Data Collection and Storage

Check Records

Every single check is saved in the database with the following information:

  • Unique ID for check
  • Reference to monitor
  • Timestamp (when check was performed)
  • HTTP status code or error type
  • Response time in milliseconds
  • Success/failure status
  • Error message (if relevant)
  • Redirect information (if relevant)
  • SSL error details (if relevant)

Usage:

  • Uptime percentage calculations
  • Historical graphs and reports
  • Error analysis and debugging
  • Performance tracking over time
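
As an illustration of how stored check records can feed an uptime figure, the sketch below computes the share of successful checks. This document does not specify PingPuffin's exact uptime formula, so the check-based calculation and the function name are assumptions.

```python
def uptime_percentage(check_results: list[bool]) -> float:
    """Percentage of successful checks; each entry is True for a successful check."""
    if not check_results:
        raise ValueError("no checks recorded")
    # True counts as 1 and False as 0 when summed.
    return 100.0 * sum(check_results) / len(check_results)
```

With a 3-minute interval there are 480 checks per day, so each failed check lowers a single day's figure by roughly 0.2 percentage points under this formula.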

Incident Tracking

When status changes to DOWN, an incident is created with the following data:

  • Unique ID for incident
  • Reference to monitor
  • Start timestamp
  • End timestamp (when resolved)
  • Total duration

Features:

  • Automatic creation on DOWN status
  • Continuous duration calculation
  • Automatic resolution on recovery
  • Complete history maintained
  • Exportable to CSV

Activity Log

All significant events are logged:

  • ✅ Status changes (UP → DOWN, DOWN → UP, etc.)
  • ✅ Manual checks performed by users
  • ✅ Configuration changes
  • ✅ URL updates
  • ✅ Metadata updates

Functionality:

  • Exportable to CSV format
  • Searchable and filterable
  • Shows details for each event
  • Timestamps on all entries

Technical Specifications

HTTP Request Parameters

When PingPuffin checks your site, the following request is sent:

User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36
Accept: */*
Connection: close
[Optional: Authorization header for Basic Auth]

Note: We use a standard Chrome user agent to avoid being blocked by sites that filter custom user agents.

Configuration:

  • Timeout: Can be configured per monitor (default: 30 seconds)
  • Follow Redirects: Yes, up to 5 redirects maximum
  • SSL Verification: Enabled (validates certificates)
  • Connection Reuse: Disabled (fresh connection for each check)
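
To reproduce a comparable request from your own machine, the parameters above can be approximated with Python's standard library. This is a sketch for local debugging, not PingPuffin's checker: the helper names are made up, and urllib's behavior (for example raising HTTPError on 4xx/5xx) may differ from how PingPuffin handles responses.

```python
import urllib.request

USER_AGENT = ("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
              "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36")
TIMEOUT_SECONDS = 30  # default timeout from this document


class MaxFiveRedirects(urllib.request.HTTPRedirectHandler):
    max_redirections = 5  # urllib's own default is 10


def build_request(url: str) -> urllib.request.Request:
    """Build a GET request with the headers listed above (nothing is sent yet)."""
    return urllib.request.Request(url, headers={
        "User-Agent": USER_AGENT,
        "Accept": "*/*",
        "Connection": "close",
    })


def check_once(url: str) -> int:
    """Send the request and return the final status code after redirects.

    Raises urllib.error.HTTPError for 4xx/5xx and urllib.error.URLError for
    network or certificate problems (certificates are verified by default).
    """
    opener = urllib.request.build_opener(MaxFiveRedirects)
    with opener.open(build_request(url), timeout=TIMEOUT_SECONDS) as response:
        return response.status
```

Calling check_once with your monitored URL should give a first indication of what status code a check would see, although results from your network may differ from those seen by the monitoring server.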

Expected HTTP Status Codes

Monitors consider a check successful when the response status is in that monitor’s expected status codes list. By default we accept a broad set of 2xx and 3xx codes. For the full list and how to customize it, see Expected HTTP Status Codes.

Response Time Measurement

What's Measured:

  • DNS lookup time
  • Connection time (TCP handshake)
  • SSL handshake time (if HTTPS)
  • Time to first byte (TTFB)

What's NOT Measured:

  • Body download time (we only read headers)
  • JavaScript execution time
  • Asset loading time

Storage:

  • Measured in milliseconds
  • Saved at each check
  • Used for performance tracking
  • Shown in dashboard

Advanced Monitoring Settings

For advanced users, we offer:

HTTP Method

  • GET: Standard method
  • POST: For endpoints that require POST

Request Body

  • Send JSON or form data with POST requests
  • Useful for API endpoints that require specific data

Basic Authentication

  • Username and password for protected endpoints
  • Passwords encrypted with AES-256-CBC
  • Never stored in plain text

Future Features

  • Custom headers
  • Request params
  • Advanced authentication (OAuth, Bearer tokens)

Server Information

Public IP Address

PingPuffin's monitoring server uses the following public IP address to perform checks:

IP Address: 188.245.198.146

If you need to whitelist PingPuffin's IP address in your firewall or server configuration, you can use this IP address.

Note: The IP address may change during server migrations or infrastructure updates. We recommend using the User-Agent header for identification instead of IP-based rules, if possible.

User-Agent Identification:

Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36

Privacy and Security

Data Encryption

Sensitive Credentials:

  • All passwords (Basic Auth) encrypted on storage
  • Algorithm: AES-256-CBC (industry standard)
  • Key: Stored securely in environment variables
  • IV: Unique initialization vector per encryption
  • Decryption only happens in memory during checks
  • Never exposed in logs or API responses

Data Access

Access Control:

  • Check results only visible to monitor owner
  • No data shared with third parties
  • Activity log only exportable by owner
  • Secure API endpoints with authentication

Database Security:

  • Prepared statements (prevents SQL injection)
  • Session-based authentication
  • CSRF protection on all forms

HTTPS Enforcement

SSL/TLS:

  • All monitor checks support HTTPS
  • SSL certificate validation enabled
  • Warns on certificate problems
  • Detects expired certificates

Dashboard:

  • Always accessed via HTTPS
  • Secure cookies (HttpOnly, Secure flags)
  • HSTS headers recommended

System Reliability

Monitor System Health

Our monitor system monitors itself:

Error Detection:

  • Automatic detection of internal errors
  • Full logging of all exceptions
  • Stack traces for debugging

Administrator Alerts:

  • Critical errors emailed to system administrator
  • Rate-limited to avoid spam
  • Details included for quick resolution

Automatic Recovery:

  • The system continues checking other monitors even if one fails
  • No cascade failures across monitors
  • Database transactions ensure data integrity

Uptime Goals

We aim for the following reliability:

Monitor System Uptime:

  • Goal: 99.9% (less than 9 hours downtime per year)
  • Monitored: Via internal logs and system metrics

Check Execution Rate:

  • Goal: 99.5% success rate for check execution
  • Monitored: Error rate in logs
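
For reference, the arithmetic behind these goals works out as follows. The second line assumes the 3-minute check interval and a single monitor; it is a worked illustration, not a published guarantee.

```latex
\begin{align*}
\text{allowed downtime at } 99.9\% &= (1 - 0.999) \times 365 \times 24\ \text{h} = 8.76\ \text{h per year} \\
\text{checks per monitor per day} &= \frac{24 \times 60}{3} = 480, \qquad
(1 - 0.995) \times 480 = 2.4\ \text{checks per day that may fail to execute}
\end{align*}
```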

Check reliability: Monitoring runs continuously. If a technical issue occurs in the monitor system, the administrator is alerted so it can be resolved quickly.


Common Scenarios and Examples

Scenario 1: Transient Network Blip

Situation: Brief network problem, site is actually up.

00:00 - Check fails (timeout) → Failure counter: 1, Status: UP
00:03 - Check succeeds → Failure counter: 0, Status: UP

Result:
✅ No notification sent
✅ No status change
✅ No false alarm

Why it works: The 2-check threshold filters out brief problems before they become alerts.


Scenario 2: Real Downtime

Situation: Server is really down (e.g., hosting problem).

00:00 - Check fails (HTTP 500) → Failure counter: 1, Status: UP
      → System logs first failure for internal monitoring

00:03 - Check fails (timeout) → Failure counter: 2, Status: DOWN
      → Incident created automatically
      → Notification sent to all configured channels

00:06 - Check fails (timeout) → Failure counter: 3, Status: DOWN
      → Incident duration updated continuously

Result:
✅ DOWN status confirmed at 00:03
✅ Notification sent ~3 minutes after first failure
✅ Different error types (500 + timeout) both count

Why it works: Two consecutive failures confirm real problem.


Scenario 3: Quick Recovery

Situation: Site down, comes back quickly.

00:00 - Status is DOWN (from previous failure)
00:03 - Check succeeds (HTTP 200) → Failure counter: 0, Status: UP
      → Incident marked as resolved automatically
      → Recovery notification sent

Result:
✅ Instant recovery on first successful check
✅ Incident duration calculated (00:00 to 00:03 = 3 min)
✅ You're informed quickly about recovery

Why it works: No 2-check threshold for recovery.


Scenario 4: Manual Refresh During First Failure

Situation: Automatic check failed once, user wants to verify.

00:00 - Automatic check fails → Failure counter: 1, Status: UP
      → Status remains UP (waiting for confirmation)

00:02 - You click "Update now" → Check fails → Status: DOWN immediately
      → Manual check bypasses 2-check threshold for quick feedback
      → Notification sent

Result:
✅ Manual check gives instant feedback
✅ Status updates without waiting for next automatic check
✅ Useful for debugging and verification

Why it works: Manual checks are designed for instant feedback.


Scenario 5: Monitor System Error

Situation: Internal error in PingPuffin's own code.

00:00 - Monitor system runs automatic check
      → Internal error detected
      → Error caught automatically

      Logging:
      → Error logged with full context for internal monitoring
      → Full technical information saved for debugging

      Administrator Alarm:
      → Email sent to system administrator (max 1 per hour)
      → Contains error details and context

      Database:
      → No update to your site status
      → Your site status remains unchanged
      → Failure counter not affected

Result:
✅ Monitor error does NOT affect your site status
✅ Administrator alerted to fix problem
✅ No false DOWN status

Why it works: Distinction between monitor errors and site errors.


Scenario 6: Different Error Types Consecutively

Situation: Server unstable, different errors each time.

00:00 - Check fails (HTTP 500) → Failure counter: 1, Status: UP
00:03 - Check fails (Connection timeout) → Failure counter: 2, Status: DOWN
00:06 - Check fails (HTTP 503) → Failure counter: 3, Status: DOWN

Result:
✅ Different error types ALL count
✅ Status DOWN after 2 failures (regardless of type)
✅ Indicates unstable server (maybe worse than one consistent error)

Why it works: Any error means site not functioning correctly.


Frequently Asked Questions

How quickly do I get notified of downtime?

Automatic check: About 3 minutes after the first failed check, or roughly 3–6 minutes after the outage begins (requires 2 consecutive failures).
Manual check: Instantly if you update manually.

Can I get false alarms?

Very rarely. The 2-check system eliminates most brief problems. If you get an alarm, there's almost always a real problem.

What if my server is temporarily slow?

If the response time exceeds the timeout (default 30 seconds), the check counts as a failure. You can increase the timeout value for your monitor.

How is planned maintenance handled?

Use the "Snooze" function to disable notifications for 24 hours. Monitoring continues, but you get no alarms.

Can I see history for all checks?

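Yes. Every check is stored and used for historical graphs and reports, and the activity log records status changes, manual checks, and configuration changes. The activity log and incident history can be exported to CSV.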

What happens if PingPuffin itself goes down?

Our monitors run on reliable infrastructure. If a critical system error occurs, the administrator is alerted, but your site is NOT marked as down.


Contact & Support

Have questions about how monitoring works?

📧 Email: support@pingpuffin.com
🐛 Bug reports: Via email
📚 Documentation: See documentation section for more information


Changelog

v1.5 (February 20, 2026)

  • SMS and WhatsApp added as notification channels (under Integrations)
  • SSL certificate alerts: notification 8 days and 2 days before certificate expires (all channels)
  • Slow response alerts: option to be notified when the site responds too slowly (Settings, 5–15 sec)
  • Public status pages: described customization with display name and logo, and language support
  • Technical implementation details removed from user documentation (focus on what you see and can configure)

v1.3 (February 14, 2026)

  • Version and documentation date update

v1.2 (February 9, 2026)

  • New section: Expected HTTP Status Codes with full default list (200, 201, 202, 203, 204, 205, 206, 301, 302, 303, 304, 307, 308) and customization
  • Manual “Update now”: dashboard “Last checked” and overview “Updated just now” feedback documented

v1.1 (January 17, 2026)

  • Checks now run in parallel (faster throughput)
  • Check interval reduced from 5 to 3 minutes
  • Each category can check up to 200 monitors per run
  • Documentation updated with parallel processing details

v1.0 (November 21, 2024)

  • First version of documentation
  • 2-check verification system implemented
  • Monitor failure protection added
  • Rate-limited administrator alerts

This document is updated continuously. Check "Last updated" at the top to see if there are new versions.

📅 Last updated: March 1, 2026