There are four golden signals that measure a system's performance and reliability. They are latency, traffic, saturation, and errors.
Latency measures how long it takes a particular part of a system to return a result. Latency is important because it directly affects the user experience. Changes in latency could indicate emerging issues. Its values may be tied to capacity demands, and it can be used to measure system improvements. But how exactly is it measured? Sample latency metrics include page load latency, number of requests waiting for a thread, query duration, service response time, transaction duration, time to first response, and time to complete data return.
The next signal is traffic, which measures how many requests are reaching your system. Traffic is important because it's an indicator of current system demand. Its historical trends are used for capacity planning, and it's a core measure when calculating infrastructure spend. Sample traffic metrics include the number of HTTP requests per second, number of requests for static versus dynamic content, network I/O, number of concurrent sessions, number of transactions per second, number of retrievals per second, number of active requests, number of write operations, number of read operations, and number of active connections.
The third signal is saturation, which measures how close to capacity a system is. It's important to note, though, that capacity is often a subjective measure that depends on the underlying service or application. Saturation is important because it's an indicator of how full the service is, it focuses on the most constrained resources, and it's frequently tied to degrading performance as capacity is reached. Sample saturation metrics include the percent memory utilization, percent thread pool utilization, percent cache utilization, percent disk utilization, percent CPU utilization, disk quota, memory quota, number of available connections, and number of users on the system.
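As a rough illustration of how latency, traffic, and saturation might be turned into numbers, here is a minimal Python sketch. The `Request` record, the sample values, the five-second window, and the thread pool figures are all made up for the example, and the nearest-rank percentile is just one common way to summarize latency, not the only one.

```python
import math
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical record of one handled request."""
    timestamp: float    # seconds since the start of the observation window
    duration_ms: float  # service response time in milliseconds


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def requests_per_second(requests, window_seconds):
    """Traffic: how many requests arrived per second over the window."""
    return len(requests) / window_seconds


def saturation_percent(used, capacity):
    """Saturation: how full a constrained resource is, as a percentage."""
    return 100.0 * used / capacity


# Made-up samples: (timestamp, duration_ms) over a 5-second window.
requests = [Request(t, d) for t, d in
            [(0.1, 120), (0.4, 80), (1.2, 300), (1.9, 95), (2.5, 110)]]
durations = [r.duration_ms for r in requests]

p50_ms = percentile(durations, 50)                     # typical latency
p95_ms = percentile(durations, 95)                     # tail latency
rps = requests_per_second(requests, window_seconds=5)  # traffic
# Assumed thread pool: 45 of 60 threads busy.
pool_saturation = saturation_percent(used=45, capacity=60)

print(p50_ms, p95_ms, rps, pool_saturation)
```

In practice these values would usually come from a metrics system rather than an in-memory list, but the arithmetic behind each signal is the same.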
The fourth signal is errors, which are events that measure system failures or other issues. Errors are often raised when a flaw, failure, or fault in a computer program or system causes it to produce incorrect or unexpected results or behave in unintended ways. Errors are important because they may indicate that something is failing. They may indicate configuration or capacity issues. They can indicate service level objective violations, and an error might mean it's time to send out an alert. Sample error metrics include wrong answers or incorrect content, the number of 4xx and 5xx HTTP status codes, the number of failed requests, the number of exceptions, the number of stack traces, servers that fail liveness checks, and the number of dropped connections.
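To make the error signal a little more concrete, here is a small Python sketch that counts 4xx and 5xx responses and decides whether to alert. The status codes are invented sample data, and the 5% threshold is a hypothetical stand-in for whatever a real service level objective would specify.

```python
from collections import Counter


def error_summary(status_codes):
    """Count client (4xx) and server (5xx) errors and compute the error rate."""
    classes = Counter(code // 100 for code in status_codes)
    total = len(status_codes)
    errors = classes[4] + classes[5]
    return {
        "total": total,
        "client_errors": classes[4],
        "server_errors": classes[5],
        "error_rate": errors / total if total else 0.0,
    }


def should_alert(error_rate, threshold=0.05):
    """Hypothetical rule: alert when the error rate exceeds the threshold."""
    return error_rate > threshold


# Invented sample of HTTP status codes from one observation window.
codes = [200, 200, 404, 500, 200, 200, 503, 200, 200, 200]
summary = error_summary(codes)
alert = should_alert(summary["error_rate"])

print(summary, alert)
```

Whether 4xx responses should count toward an alerting threshold depends on the service; many teams track them separately, since they often reflect client mistakes rather than system failures.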