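Symptom

The app in the test environment reported errors. The logs showed that a Dubbo service call had failed, with an error roughly like this: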

Failed to invoke method xxxService, cause: com.alibaba.dubbo.rpc.RpcException: 
Failed to invoke the method xxx in the service com.xxx.xxxService 
Tried 3 times of the providers [10.40.15.50:50007] (1/1) from the registry zk.xx.com:2181 on the consumer 10.40.11.171 using the dubbo version 2.4.9. 
Last error is: Failed to invoke remote method: xxx, provider: dubbo://10.40.15.50:50007/com.xxx.xxxService

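Analysis

The log says that the consumer at 10.40.11.171 failed to call the service dubbo://10.40.15.50:50007/com.xxx.xxxService. One thing looks odd: every machine in our test environment has an IP in the 10.40.11.xx range, yet this provider's IP is 10.40.15.50. That is why the call failed.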
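
The observation above can be checked mechanically. Below is a minimal sketch of a subnet membership test; the /24 prefix is an assumption (the text only says 10.40.11.xx), and the class and method names are hypothetical.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

final class SubnetCheck {
    /** Returns true if the IPv4 address lies inside networkAddress/prefixLength. */
    static boolean inSubnet(String ip, String networkAddress, int prefixLength) {
        try {
            // getByName does no DNS lookup when given a literal IP address.
            int addr = toInt(InetAddress.getByName(ip).getAddress());
            int net = toInt(InetAddress.getByName(networkAddress).getAddress());
            // A shift by 32 is a no-op in Java, so prefix length 0 is handled separately.
            int mask = prefixLength == 0 ? 0 : -1 << (32 - prefixLength);
            return (addr & mask) == (net & mask);
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("not a valid address", e);
        }
    }

    private static int toInt(byte[] b) {
        return ((b[0] & 0xFF) << 24) | ((b[1] & 0xFF) << 16) | ((b[2] & 0xFF) << 8) | (b[3] & 0xFF);
    }

    public static void main(String[] args) {
        System.out.println(inSubnet("10.40.11.171", "10.40.11.0", 24)); // consumer: true
        System.out.println(inSubnet("10.40.15.50", "10.40.11.0", 24));  // provider: false
    }
}
```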

So the question is:

Where did this IP come from?

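First, a recap of how a Dubbo service call works (using a zk registry as the example):
1. The provider assembles the service to be exposed into a URL and registers that URL on an ephemeral zk node;
2. The consumer is notified of the zk node change, obtains the service URL, and opens a single long-lived connection to the provider;
3. The consumer issues service calls over that connection.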
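
Step 2 means the consumer connects to whatever host and port the registered URL contains. A minimal sketch of that extraction, using java.net.URI rather than Dubbo's own URL class (the class and method names here are hypothetical):

```java
import java.net.InetSocketAddress;
import java.net.URI;

final class ProviderUrl {
    /** Extracts the host and port a consumer would dial from a registered provider URL. */
    static InetSocketAddress providerAddress(String registeredUrl) {
        URI uri = URI.create(registeredUrl);
        // createUnresolved keeps the host string as-is and performs no DNS lookup.
        return InetSocketAddress.createUnresolved(uri.getHost(), uri.getPort());
    }

    public static void main(String[] args) {
        InetSocketAddress target = providerAddress("dubbo://10.40.15.50:50007/com.xxx.xxxService");
        System.out.println(target.getHostString() + ":" + target.getPort());
    }
}
```

The consumer never verifies that this address is reachable or sensible before trying to connect, so a wrong IP in the registry propagates straight into the connection attempt.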

Since Dubbo registers the service URL in zk, first check which URL is registered there:

shell> ./zkCli.sh

zk> ls /dubbo/com.dafy.sevend.pushcenter.rpc.api.service.PushCenterService/providers
[dubbo://10.40.15.50:50007/com.xxx.xxxService?....]

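The URL registered in zk carries the abnormal IP. The same information can also be viewed in the Dubbo admin console.

Once the provider has exposed the service, the consumer has to connect to it. Can that connection be established if the IP is wrong? Look for the connection attempt in the consumer log: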

19-04-25 17:16:43.609 [ZkClient-EventThread-21-zk.xx.com:2181] ERROR [] com.alibaba.dubbo.remoting.transport.AbstractClient:74 -  
[DUBBO] Failed to start NettyClient 10.40.11.171 connect to the server /10.40.15.50:50007 (check == false, ignore and retry later!), 
cause: client(url: dubbo://10.40.15.50:50007/com.xxx.xxxService) failed to connect to server /10.40.15.50:50007 client-side timeout 3000ms (elapsed: 3000ms) from netty client 10.40.11.171 using dubbo version 2.4.9, dubbo version: 2.4.9
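
The timeout in this log can be reproduced outside Dubbo with a plain TCP probe. The following is a minimal sketch, not what NettyClient does internally; the 3000 ms timeout mirrors the value in the log, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

final class ConnectProbe {
    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers connection refused, unreachable host, and SocketTimeoutException.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(canConnect("10.40.15.50", 50007, 3000));
    }
}
```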

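So the root cause is that the provider exposed a URL with the wrong IP. Trace through the Dubbo code to see how this URL is generated. Note that the excerpt below uses the org.apache.dubbo package names of newer releases, whereas the logs above come from com.alibaba.dubbo 2.4.9.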

# org.apache.dubbo.config.ServiceConfig.doExportUrlsFor1Protocol
private void doExportUrlsFor1Protocol(ProtocolConfig protocolConfig, List<URL> registryURLs) {
	String name = protocolConfig.getName();
	if (StringUtils.isEmpty(name)) {
		name = Constants.DUBBO;
	}

	Map<String, String> map = new HashMap<String, String>();
	map.put(Constants.SIDE_KEY, Constants.PROVIDER_SIDE);
	appendRuntimeParameters(map);
	appendParameters(map, application);
	appendParameters(map, module);
	appendParameters(map, provider, Constants.DEFAULT_KEY);
	appendParameters(map, protocolConfig);
	appendParameters(map, this);

	// export service
	String contextPath = protocolConfig.getContextpath();
	if (StringUtils.isEmpty(contextPath) && provider != null) {
		contextPath = provider.getContextpath();
	}

	// Resolve the host for the URL to be exposed; this is the key line
	String host = this.findConfigedHosts(protocolConfig, registryURLs, map);
	Integer port = this.findConfigedPorts(protocolConfig, name, map);
	URL url = new URL(name, host, port, (StringUtils.isEmpty(contextPath) ? "" : contextPath + "/") + path, map);

	this.urls.add(url);
}

# org.apache.dubbo.config.ServiceConfig.findConfigedHosts
/**
 * Register & bind IP address for service provider, can be configured separately.
 * Configuration priority:
 * environment variables ->
 * java system properties ->
 * host property in config file ->
 * /etc/hosts ->
 * default network address ->
 * first available network address
 */
private String findConfigedHosts(ProtocolConfig protocolConfig, List<URL> registryURLs, Map<String, String> map) {
	// Read the configured IP from environment variables or Java system properties
	String hostToBind = getValueFromConfig(protocolConfig, Constants.DUBBO_IP_TO_BIND);

	// if bind ip is not found in environment, keep looking up
	if (StringUtils.isEmpty(hostToBind)) {
		// Take the host from the dubbo:protocol tag
		hostToBind = protocolConfig.getHost();
		if (provider != null && StringUtils.isEmpty(hostToBind)) {
			// Take the host from the dubbo:provider tag
			hostToBind = provider.getHost();
		}
		if (isInvalidLocalHost(hostToBind)) {
			try {
				// Resolve the IP that the local hostname maps to; this is the key line
				hostToBind = InetAddress.getLocalHost().getHostAddress();
			} catch (UnknownHostException e) {
				logger.warn(e.getMessage(), e);
			}
			if (isInvalidLocalHost(hostToBind)) {
				// part of the code omitted
				// first available network address
			}
		}
	}

	// registry ip is not used for bind ip by default
	String hostToRegistry = getValueFromConfig(protocolConfig, Constants.DUBBO_IP_TO_REGISTRY);
	if (StringUtils.isEmpty(hostToRegistry)) {
		// bind ip is used as registry ip by default
		hostToRegistry = hostToBind;
	}

	return hostToRegistry;
}

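When exposing a service, Dubbo resolves the local IP in this priority order:
1. OS environment variables
2. Java system properties (startup arguments)
3. The Dubbo configuration file (not recommended: with multi-machine deployments the configuration file can then no longer be shared)
4. The machine's default address (resolved from the hostname, which covers /etc/hosts and DNS)
5. The first available network address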
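
The priority order above can be sketched as a small fallback chain. This is a simplified illustration, not Dubbo's code: the two lookups are passed in as suppliers so the behavior can be shown deterministically, isUnusable only approximates Dubbo's isInvalidLocalHost, and 10.40.11.60 is a made-up address standing in for the machine's real interface IP.

```java
import java.util.function.Supplier;

final class HostPriority {
    /** Rough approximation of an "invalid local host" check: empty, localhost, wildcard, loopback. */
    static boolean isUnusable(String host) {
        return host == null || host.isEmpty()
                || host.equalsIgnoreCase("localhost")
                || host.equals("0.0.0.0")
                || host.startsWith("127.");
    }

    static String resolveHost(String envValue, String systemProperty, String configuredHost,
                              Supplier<String> hostnameLookup, Supplier<String> firstInterfaceAddress) {
        if (envValue != null && !envValue.isEmpty()) return envValue;             // 1. environment variable
        if (systemProperty != null && !systemProperty.isEmpty()) return systemProperty; // 2. system property
        String host = configuredHost;                                             // 3. config file
        if (isUnusable(host)) host = hostnameLookup.get();                        // 4. hostname mapping
        if (isUnusable(host)) host = firstInterfaceAddress.get();                 // 5. first usable interface
        return host;
    }

    public static void main(String[] args) {
        // Nothing configured: the hostname mapping wins, even though it is wrong.
        System.out.println(resolveHost(null, null, null, () -> "10.40.15.50", () -> "10.40.11.60"));
    }
}
```

The point of the sketch is that step 5 is only reached when step 4 yields nothing usable. A hostname that resolves to a plausible but wrong address is accepted without further checks.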

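Suppose no host is configured in the environment variables, the startup arguments, or the configuration file (which is also what the official guidance recommends). Dubbo then uses the machine's default address, i.e. the IP that the local hostname resolves to. So if the hostname is mapped to the wrong IP in the hosts file, or DNS resolves it incorrectly, you get exactly this situation: the provider starts normally, but consumers cannot call it.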
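
To see what the JVM itself resolves, the same call that the Dubbo excerpt relies on, InetAddress.getLocalHost(), can be run on the provider machine. A minimal sketch (class and method names are hypothetical):

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

final class LocalHostReport {
    /** Describes what the JVM resolves the local hostname to. */
    static String describeLocalHost() {
        try {
            InetAddress local = InetAddress.getLocalHost();
            return local.getHostName() + " -> " + local.getHostAddress();
        } catch (UnknownHostException e) {
            return "unresolvable: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(describeLocalHost());
    }
}
```

If the address printed here is not one of the machine's own interface addresses, the provider will register a URL that consumers cannot reach.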

shell> hostname
xiaoed.mac.local
shell> ping xiaoed.mac.local
PING xiaoed.mac.local (10.40.15.50): 56 data bytes
Request timeout for icmp_seq 0
Request timeout for icmp_seq 1
Request timeout for icmp_seq 2

How to avoid it

  • Configure the correct hostname mapping in /etc/hosts
  • Make sure DNS is reliable
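
As an additional safeguard, a provider could fail fast at startup when its hostname resolves to an address that no local network interface owns. This is a sketch of one possible check, not something Dubbo does for you; the class and method names are hypothetical.

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.net.UnknownHostException;

final class LocalAddressGuard {
    /** Returns true if the given IP is assigned to one of this machine's network interfaces. */
    static boolean isBoundLocally(String ip) {
        try {
            // getByInetAddress returns null when no interface carries the address.
            return NetworkInterface.getByInetAddress(InetAddress.getByName(ip)) != null;
        } catch (UnknownHostException | SocketException e) {
            return false;
        }
    }

    public static void main(String[] args) throws UnknownHostException {
        String resolved = InetAddress.getLocalHost().getHostAddress();
        if (!isBoundLocally(resolved)) {
            throw new IllegalStateException("hostname resolves to " + resolved
                    + ", which is not an address of this machine");
        }
        System.out.println("hostname mapping looks consistent: " + resolved);
    }
}
```

Run on the machine from the example above, this check would reject 10.40.15.50 before the service is ever registered, assuming that address is not configured on any of its interfaces.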