Kafka authorization failures: ACL denials, wrong-topic clients, and audit trails
Your consumers or producers throw TOPIC_AUTHORIZATION_FAILED or CLUSTER_AUTHORIZATION_FAILED. Your security team spots repeated denial lines in the broker logs. You need to know within minutes whether this is a deployment misconfiguration, a client pointing at the wrong topic, or a security incident.
In clusters where authorizer.class.name is configured, Kafka writes authorization decisions to kafka-authorizer.log. Denials log at INFO level; allowed operations are silent unless you enable DEBUG logging for the authorizer.
What this means
When a Kafka broker has an authorizer configured, the default is deny unless an explicit Allow ACL matches the principal, resource, and operation. Authorization decisions are written to logs/kafka-authorizer.log. A denial means the authenticated principal attempted an operation on a resource with no matching Allow ACL.
In KRaft mode, ACLs are stored in the __cluster_metadata topic and enforced by org.apache.kafka.metadata.authorizer.StandardAuthorizer. In ZooKeeper-based clusters, ACLs are stored in ZooKeeper and enforced by kafka.security.authorizer.AclAuthorizer. Super users configured via super.users bypass all ACL checks. The allow.everyone.if.no.acl.found setting defaults to false, so resources without ACLs are not world-accessible.
Occasional denials are almost always misconfigurations: a new service missing an ACL, a developer pointing a consumer at prod-events instead of prod-events-v2, or a staging credential reused in production. Sustained denials from an authenticated principal making requests outside its normal pattern suggest a compromised credential, lateral movement, or probing.
The distinction matters because the response differs. Misconfigurations are fixed with kafka-acls.sh. Security incidents require credential rotation, scope investigation, and escalation.
flowchart TD
A[Authorization denial in kafka-authorizer.log] --> B{Occasional or sustained?}
B -->|Occasional / deployment correlated| C[ACL misconfig or wrong-topic client]
B -->|Sustained from known principal| D[Compromised credential or application bug]
B -->|Sustained from unexpected principal| E[Unauthorized access or lateral movement]
C --> F[kafka-acls.sh --list to verify]
D --> G[Check client logs and credential rotation history]
E --> H[Escalate security incident]
F --> I[Fix ACL or client configuration]
Common causes
| Cause | What it looks like | First thing to check |
|---|---|---|
| Misconfigured ACL | Denied operations from a known application right after deployment or credential rotation | kafka-authorizer.log for the principal and exact resource name |
| Client targeting wrong topic | A producer or consumer logs authorization failures for a topic it should not access | The topic name or subscription regex in the client configuration |
| Compromised or over-permissioned credential | Sustained denials from a principal making requests outside its normal pattern | Historical audit trail for that principal’s typical resources |
| Missing broker principal ACLs (PLAINTEXT inter-broker) | Internal broker errors like UpdateMetadata denials after enabling authorization | security.inter.broker.protocol and whether broker traffic authenticates as ANONYMOUS |
| KRaft controller missing authorizer | No Authorizer is configured errors from kafka-acls.sh despite broker config | authorizer.class.name on both broker and controller processes |
| False client-side authorization failure | KafkaJS producer reports authorization failure for a topic it previously produced to | Client library version and known issues like KafkaJS #1346 |
Quick checks
Run these read-only checks before making changes.
# Check recent authorization denials on a broker
grep -E "DENIED|Denied" /var/log/kafka/kafka-authorizer.log | tail -n 50
# Verify the configured authorizer class
grep "authorizer.class.name" /etc/kafka/server.properties
# List ACLs for a specific principal
kafka-acls.sh --bootstrap-server localhost:9092 --list --principal User:app-producer
# List ACLs for a specific topic
kafka-acls.sh --bootstrap-server localhost:9092 --list --topic orders
# Check super users configured on the broker
grep "super.users" /etc/kafka/server.properties
# Check for admin operations in broker logs (topic/ACL/config changes)
grep -E "CreateTopics|DeleteTopics|AlterConfigs|CreateAcls|DeleteAcls" /var/log/kafka/server.log | tail -n 20
On AWS MSK, broker-side kafka-authorizer.log is not exposed to customers. Use CloudWatch Logs or MSK access logging. The kafka-acls.sh checks still work via the --bootstrap-server endpoint.
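The grep above returns raw lines. When there are many, a grouped count shows faster which principal and resource dominate. The following is a minimal sketch, not an official tool: summarize_denials is a hypothetical helper, and it assumes denial lines look like "Principal = User:x is Denied Operation = Write from host = ... on resource = Topic:LITERAL:orders ...". The exact wording varies between authorizer implementations and Kafka versions, so compare it with a real line from your log first. Principals that contain spaces (for example TLS distinguished names) are not handled.

```shell
# Hypothetical helper: group authorizer denials by principal, operation, and resource.
# Reads log lines on stdin and prints "count principal OPERATION resource", busiest first.
# Assumes the "key = value" layout described above; adjust the patterns if your log differs.
summarize_denials() {
  awk '
    / is (Denied|DENIED) / {
      principal = ""; operation = ""; resource = ""
      for (i = 1; i < NF; i++) {
        if ($i == "Principal" && $(i + 1) == "=") principal = $(i + 2)
        if (tolower($i) == "operation" && $(i + 1) == "=") operation = $(i + 2)
        if ($i == "resource" && $(i + 1) == "=") resource = $(i + 2)
      }
      # Upper-case the operation so "Write" and "WRITE" are counted together.
      if (principal != "") print principal, toupper(operation), resource
    }
  ' | sort | uniq -c | sort -rn | awk '{ print $1, $2, $3, $4 }'
}

# Example (read-only):
# summarize_denials < /var/log/kafka/kafka-authorizer.log | head -n 20
```

A single principal and resource at the top of the list usually points to one missing ACL. Many resources under one principal deserve a closer look.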
How to diagnose it
- Filter the authorizer log for the incident window. Look for lines containing the denied principal and resource type (Topic, Group, Cluster). The log includes the principal, operation, resource, and decision.
- Determine if the principal is known. If it belongs to a newly deployed service, this is likely a missing ACL. If it is an existing production service, check whether the resource is new or unexpected.
- Verify existing ACLs with kafka-acls.sh --list. Check both the principal and the resource. Principal names are case-sensitive. Prefixed ACLs match any resource starting with the given prefix.
- Check for admin operations around the time of first denial. Unauthorized ACL changes, topic creation, or config alterations can explain sudden denials. Look in broker logs or audit logs for changes outside a change window.
- Distinguish authZ from authN. If the client cannot authenticate, it never reaches the authorizer. Check server.log for AuthenticationException or JMX failed-authentication-rate to rule out authentication problems.
- Validate the client library. Some libraries report authorization failures during reconnection even when ACLs are correct. If the denial is intermittent and the client previously produced successfully, verify the client version and check for known issues.
- Check KRaft controller configuration. In KRaft mode with separated controller and broker processes, authorizer.class.name=org.apache.kafka.metadata.authorizer.StandardAuthorizer must be set on both. If kafka-acls.sh returns “No Authorizer is configured,” verify the controller config.
- Assess the scope and duration. Occasional denials that stop after a few minutes suggest a transient misconfiguration. Sustained denials that continue for tens of minutes, especially from a principal trying multiple resources, warrant escalation as a potential security event.
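To judge scope and duration for one principal, it helps to see denials per minute and the set of resources that principal was denied on. This is a rough sketch: denial_minutes and denied_resources are hypothetical helper names, and both assume log lines start with a timestamp like "[2024-05-01 12:00:03,101]" and contain "Principal = <name> is Denied" and "resource = <name>". Check those assumptions against your own log format before relying on the output.

```shell
# Hypothetical helper: denials per minute for one principal.
# Usage: denial_minutes User:app-producer < kafka-authorizer.log
# Prints "YYYY-MM-DD HH:MM count" in chronological order.
denial_minutes() {
  awk -v p="$1" '
    / is (Denied|DENIED) / && index($0, "Principal = " p " ") {
      # $1 is "[YYYY-MM-DD", $2 is "HH:MM:SS,mmm]" under the assumed timestamp layout.
      minute = substr($1, 2) " " substr($2, 1, 5)
      count[minute]++
    }
    END { for (m in count) print m, count[m] }
  ' | sort
}

# Hypothetical helper: distinct resources one principal was denied on.
# Usage: denied_resources User:app-producer < kafka-authorizer.log
denied_resources() {
  awk -v p="$1" '
    / is (Denied|DENIED) / && index($0, "Principal = " p " ") {
      for (i = 1; i < NF; i++)
        if ($i == "resource" && $(i + 1) == "=") seen[$(i + 2)] = 1
    }
    END { for (r in seen) print r }
  ' | sort
}
```

A few minutes of denials on one resource fits the transient misconfiguration pattern. Denials that span tens of minutes and several resources fit the escalation pattern described above.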
Metrics and signals to monitor
| Signal | Why it matters | Warning sign |
|---|---|---|
| Authorization failure rate | Direct measure of ACL denials from kafka-authorizer.log | Sustained rate from a principal with no recent history of accessing that resource |
| Authentication failure rate | Distinguishes authN from authZ problems | Sustained AuthenticationException before denials appear |
| Admin operations / config changes | Unauthorized changes cause sudden authorization shifts | Topic creation, deletion, or ACL modification outside change windows |
| UnderReplicatedPartitions | Broker principal denials can block internal replication | URP rises after enabling ACLs without granting Cluster operations to brokers |
| Consumer group lag | Authorization failures block consumers, causing lag to grow | Lag grows for a group whose members recently started logging denials |
| FailedProduceRequestsPerSec | Producers receive hard errors when denied | Spike correlating with authorization denials |
| FailedFetchRequestsPerSec | Consumers receive hard errors when denied | Spike on topics with new ACL restrictions |
Fixes
Fix the ACL
If kafka-acls.sh --list shows no matching Allow ACL, add the minimum required permission. Prefer resource-specific ACLs over wildcard grants.
# Example: grant Write and Describe on a topic to a producer principal
kafka-acls.sh --bootstrap-server localhost:9092 --add \
--allow-principal User:app-producer \
--operation Write --operation Describe \
--topic orders
Tradeoff: Wildcard topic ACLs (--topic '*') reduce operational overhead but increase blast radius if the principal is compromised.
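When you weigh a wildcard against a narrower grant, it helps to be exact about which topic names each ACL pattern covers. The sketch below is an illustration only: acl_pattern_matches is a hypothetical helper, not part of Kafka. It mirrors the matching rules as described in this guide: a literal pattern matches one exact name, the literal name * matches everything, and a prefixed pattern matches any name that starts with the prefix.

```shell
# Hypothetical helper: does an ACL pattern cover a given resource name?
# Usage: acl_pattern_matches PATTERN_TYPE PATTERN_NAME RESOURCE_NAME
# Returns 0 if the pattern applies, 1 if not, 2 for an unknown pattern type.
acl_pattern_matches() {
  case "$1" in
    LITERAL)
      # The literal name "*" is the wildcard; otherwise the names must be identical.
      [ "$2" = "*" ] || [ "$2" = "$3" ]
      ;;
    PREFIXED)
      case "$3" in
        "$2"*) return 0 ;;
        *) return 1 ;;
      esac
      ;;
    *)
      echo "unknown pattern type: $1" >&2
      return 2
      ;;
  esac
}

# Example: a literal ACL on prod-events does not cover prod-events-v2,
# but a prefixed ACL on prod-events does.
# acl_pattern_matches LITERAL  prod-events prod-events-v2 || echo "not covered"
# acl_pattern_matches PREFIXED prod-events prod-events-v2 && echo "covered"
```

This is why a client moved from prod-events to prod-events-v2 can start failing even though its old ACL is still in place, and why a prefixed grant sits between a single-topic ACL and a wildcard in terms of blast radius.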
Fix the client configuration
If the client targets the wrong topic or consumer group, fix the application configuration. Common mistakes include hardcoded topic names from a previous environment, regex subscriptions that match unintended topics, or MirrorMaker configs pointing to the wrong cluster.
Tradeoff: Fixing the client requires a deployment, which may take longer than adding an ACL. If the client should not access that resource, do not add the ACL.
Resolve inter-broker authorization failures
If brokers use PLAINTEXT for security.inter.broker.protocol and ACLs are enabled, internal requests authenticate as ANONYMOUS and may be denied. Switch inter-broker communication to SASL_PLAINTEXT or SASL_SSL and ensure the broker principal has Cluster-level ACLs.
Tradeoff: Switching inter-broker protocol requires a rolling restart. Granting broad Cluster ACLs to brokers reduces security boundaries.
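Before planning the rolling restart, you can check read-only whether a broker is in this state. The following is a minimal sketch under stated assumptions: get_prop and check_interbroker_auth are hypothetical helpers, the file is a plain key=value server.properties, and an unset security.inter.broker.protocol is treated as PLAINTEXT (the broker default). If inter.broker.listener.name is set instead, the sketch only points you to listener.security.protocol.map rather than resolving it.

```shell
# Hypothetical helper: print the last value assigned to KEY in a properties file.
# Usage: get_prop KEY FILE
get_prop() {
  awk -F= -v k="$1" '
    /^[ \t]*#/ { next }                      # skip comment lines
    index($0, "=") > 0 {
      key = $1
      gsub(/[ \t]/, "", key)
      if (key == k) v = substr($0, index($0, "=") + 1)
    }
    END { gsub(/^[ \t]+|[ \t]+$/, "", v); print v }
  ' "$2"
}

# Hypothetical check: flag an authorizer combined with PLAINTEXT inter-broker traffic.
# Usage: check_interbroker_auth /etc/kafka/server.properties
check_interbroker_auth() {
  authorizer=$(get_prop authorizer.class.name "$1")
  listener=$(get_prop inter.broker.listener.name "$1")
  protocol=$(get_prop security.inter.broker.protocol "$1")
  if [ -z "$authorizer" ]; then
    echo "OK: no authorizer configured"
  elif [ -n "$listener" ]; then
    echo "CHECK: inter.broker.listener.name=$listener; look up its protocol in listener.security.protocol.map"
  elif [ -z "$protocol" ] || [ "$protocol" = "PLAINTEXT" ]; then
    echo "WARN: authorizer enabled but inter-broker traffic is PLAINTEXT (ANONYMOUS principal)"
  else
    echo "OK: inter-broker protocol is $protocol"
  fi
}
```

Run it against each broker's properties file. A WARN result matches the failure mode described in this section.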
Escalate sustained anomalous denials
If a principal makes sustained authorization requests to resources it has never touched before, treat it as a potential security incident. Rotate the principal’s credentials, audit its recent activity, and check whether other principals from the same source exhibit similar behavior.
Prevention
- Enable DEBUG logging for allowed operations. Set log4j.logger.kafka.authorizer.logger=DEBUG for log4j1 or logger.authorizer.level=DEBUG for log4j2. Without this, allowed operations are silent and you cannot build a baseline. Expect high log volume; use short-lived toggles or dedicated appenders.
- Require ACL changes through automation. Topic and ACL provisioning should run through CI/CD or infrastructure-as-code with mandatory review. Ad-hoc kafka-acls.sh commands in production create audit gaps.
- Monitor admin operations outside change windows. Any topic creation, deletion, or ACL modification by an unexpected principal should generate a ticket immediately.
- Validate client configs in pre-production. Run integration tests with the same authorizer configuration as production to catch wrong-topic errors before deployment.
- Document principal-to-resource mappings. Maintain a living document or repo that defines which services access which topics. This accelerates diagnosis when a principal is denied.
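A documented mapping speeds up triage most when it is machine-readable. Below is one possible sketch, not a prescribed format. It assumes a hypothetical mapping file with one "principal operation resource" triple per line, with the resource written the way it appears in your authorizer log, and a hypothetical helper classify_denial that checks a denied request against it.

```shell
# Hypothetical helper: classify a denial against a documented access map.
# Usage: classify_denial PRINCIPAL OPERATION RESOURCE MAP_FILE
# MAP_FILE lines: "<principal> <operation> <resource>"; "#" comments and blank lines are ignored.
classify_denial() {
  awk -v p="$1" -v o="$2" -v r="$3" '
    /^[ \t]*(#|$)/ { next }
    $1 == p { known = 1 }
    $1 == p && toupper($2) == toupper(o) && $3 == r { documented = 1 }
    END {
      if (documented) print "documented access: likely a missing ACL"
      else if (known) print "undocumented access: check client config before granting anything"
      else            print "unknown principal: treat as unexpected and investigate"
    }
  ' "$4"
}

# Example:
# classify_denial User:app-producer Write Topic:LITERAL:orders acl-map.txt
```

The three outcomes line up with the branches in the triage flow: a missing ACL, a wrong-topic client, or a principal that should not be there at all.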
How Netdata helps
- Correlate authorization failure rates with broker health metrics (RequestHandlerAvgIdlePercent, UnderReplicatedPartitions) to detect instability triggered by ACL changes.
- Track authentication and authorization event trends over time to establish baselines for normal principal behavior.
- Alert on sustained authorization denials alongside consumer group lag growth to detect when ACL issues block data flow.
- Monitor broker log patterns for admin operations, surfacing unexpected topic or ACL changes that precede authorization incidents.
Related guides
- How Kafka actually works in production: a mental model for operators: /guides/kafka/how-kafka-works-in-production/
- Kafka enable.auto.commit data loss: committed offsets that outrun processing: /guides/kafka/kafka-auto-commit-silent-data-loss/
- Kafka ‘Broker may not be available’: clients that can’t connect or stay connected: /guides/kafka/kafka-broker-may-not-be-available/
- Kafka broker out of disk: log.dirs full, the cliff-edge shutdown, and recovery: /guides/kafka/kafka-broker-out-of-disk/
- Kafka network egress saturation: BytesOutPerSec, replication amplification, and fan-out: /guides/kafka/kafka-bytes-out-network-saturation/
- Kafka CommitFailedException: rebalanced-out consumers and poll loop timeouts: /guides/kafka/kafka-commit-failed-exception/
- Kafka connection storms: connection-count spikes, FD pressure, and network threads: /guides/kafka/kafka-connection-count-storm/
- Kafka consumer group stuck Empty or Dead: no members consuming: /guides/kafka/kafka-consumer-group-empty-stuck/
- Kafka consumer group lag growing: detection, lag-as-time, and root causes: /guides/kafka/kafka-consumer-group-lag-growing/
- Kafka consumer group rebalancing too often: heartbeats, session timeout, and assignors: /guides/kafka/kafka-consumer-group-rebalancing-frequently/
- Kafka __consumer_offsets growing huge: compaction failure on the offsets topic: /guides/kafka/kafka-consumer-offsets-topic-growing/
- Kafka consumer rebalance storm: stuck in PreparingRebalance and max.poll.interval.ms: /guides/kafka/kafka-consumer-rebalance-storm/







