I've been slowly rolling out Zabbix in our environment and have hit a couple of stumbling blocks. I'm hoping someone can share insight, tips, or personal experience.
We have 5 datacenters that need to be monitored. Each datacenter has an active proxy that talks to the Zabbix server located in our primary datacenter.
We're monitoring servers and network switches/routers. All servers have the agent and do active monitoring. Network devices are monitored via SNMPv2.
To cut down on chatter between datacenters, the primary goal is to have all hosts/devices within a datacenter communicate only with that datacenter's proxy. For the same reason, agent auto-registration is preferable to network discovery.
A secondary goal is to keep configurations as simple and uniform as possible. Yes, we could work out a process that updates ServerActive in zabbix_agentd.conf as part of a host's build, pointing it to the correct proxy. But I think that would add a layer of complexity and wouldn't scale easily.
What we've done
We have F5 load balancers. I've defined a VIP that routes traffic to the correct destination proxy based on source IP. The F5 is doing SNAT auto-map, so the proxies see the VIP when an agent checks in. Now all agents share a common configuration with a single IP in ServerActive, which is exactly what we wanted. And when a host comes online for the first time, it reaches out to the correct proxy and auto-registers.
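To make the routing idea concrete, here is a minimal Python sketch of the decision the VIP is making for us. It is only a model of the logic, not the F5 configuration itself; the subnets, proxy addresses, VIP address, and the names ROUTES, pick_proxy and forward are all made up for illustration.

```python
import ipaddress

# Hypothetical values for illustration only; not our real addressing.
VIP = "10.0.0.100"
ROUTES = [
    (ipaddress.ip_network("10.1.0.0/16"), "10.1.0.10"),  # datacenter 1 proxy
    (ipaddress.ip_network("10.2.0.0/16"), "10.2.0.10"),  # datacenter 2 proxy
]


def pick_proxy(source_ip, routes=ROUTES):
    """Return the proxy for the subnet containing source_ip, or None."""
    addr = ipaddress.ip_address(source_ip)
    for network, proxy in routes:
        if addr in network:
            return proxy
    return None


def forward(source_ip, snat=True):
    """Return (proxy, source address the proxy sees) for one agent connection.

    With SNAT enabled, the proxy sees the VIP instead of the agent's own
    address, which matches what we observe on newly registered hosts.
    """
    proxy = pick_proxy(source_ip)
    seen_source = VIP if snat else source_ip
    return proxy, seen_source
```

With SNAT on, forward("10.1.5.20") lands on the datacenter 1 proxy but reports the VIP as the source. With snat=False, the same call reports the agent's real address.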
But now I'm thinking about how this is going to scale. And there is an issue with auto-registration: the network interface of every newly created host is the VIP. We're not doing any passive checks right now, but that could change. If I disable SNAT on the F5 so that the proxies see the actual source IP of the agent, everything breaks… and I'm assuming that's because only the VIP is listed in ServerActive?
If that's the case, one thought would be to add the actual proxy IPs to ServerActive in addition to the VIP, but we want to avoid a host accidentally auto-registering to the wrong proxy. Is there any way to force auto-registration to the first IP in the list?
Some other thoughts…
I read a thread where someone was doing something similar and changed SourceIP on the proxy to the VIP. I don't know if that would help in our case, but when I tried it, all of our SNMP checks started failing.
Another thought I had was to put the F5 in between the proxies and the server as well, so that there is only 1 "virtual" proxy (the VIP). The F5 would still route agents to the correct proxy based on network, and on the server all hosts would be associated with only one proxy. I actually like this solution; it would make adding proxies in the future dead simple.
But this doesn't seem to work with our SNMP monitoring. If I understand correctly, all of the proxies would then be performing the same SNMP checks on the same devices. Does that sound right?
This email has started to ramble…
What are other people doing? I've been whiteboarding this all morning, but the "best" setup isn't jumping out at me…