背景
近期业务出现一次问题,三方服务受到攻击,然后其进行紧急处理,将域名指向紧急修改为一个备用机房,但是发现流量没有按照预期切换过去,怀疑是DNS的问题,所以稍微话时间看了下。
分析
引子
java里面有一个InetAddress.getByName()
方法,可以将域名转换为ip,我们尝试一下
public static void main(String[] args) throws UnknownHostException {
System.out.println(InetAddress.getByName("www.baidu.com"));
}
打印出来的结果是www.baidu.com/112.80.248.75
关键代码
InetAddress[] addresses = getCachedAddresses(host);
/* If no entry in cache, then do the host lookup */
if (addresses == null) {
addresses = getAddressesFromNameService(host, reqAddr);
}
从上面可见,基本ip查询分两步
- 从缓存里面取结果
- 如果缓存里面不存在,则从域名服务器获取结果
getCachedAddresses
/*
* Lookup hostname in cache (positive & negative cache). If
* found return addresses, null if not found.
*/
private static InetAddress[] getCachedAddresses(String hostname) {
hostname = hostname.toLowerCase();
// search both positive & negative caches
synchronized (addressCache) {
cacheInitIfNeeded();
CacheEntry entry = addressCache.get(hostname);
if (entry == null) {
entry = negativeCache.get(hostname);
}
if (entry != null) {
return entry.addresses;
}
}
// not found
return null;
}
里面逻辑很简单,从两个cache(成功查询的缓存和查询失败的缓存)里面尝试获取已缓存的域名映射ip,再细看Cache
的get
方法
/**
* Query the cache for the specific host. If found then
* return its CacheEntry, or null if not found.
*/
public CacheEntry get(String host) {
int policy = getPolicy();
if (policy == InetAddressCachePolicy.NEVER) {
return null;
}
CacheEntry entry = cache.get(host);
// check if entry has expired
if (entry != null && policy != InetAddressCachePolicy.FOREVER) {
if (entry.expiration >= 0 &&
entry.expiration < System.currentTimeMillis()) {
cache.remove(host);
entry = null;
}
}
return entry;
}
这里面涉及一个缓存的策略InetAddressCachePolicy
,即涉及缓存的有效时间,有如下几个值的说明
public static final int FOREVER = -1;
public static final int NEVER = 0;
/* default value for positive lookups */
public static final int DEFAULT_POSITIVE = 30;
/* The Java-level namelookup cache policy for successful lookups:
*
* -1: caching forever
* any positive value: the number of seconds to cache an address for
*
* default value is forever (FOREVER), as we let the platform do the
* caching. For security reasons, this caching is made forever when
* a security manager is set.
*/
private static int cachePolicy = FOREVER;
/* The Java-level namelookup cache policy for negative lookups:
*
* -1: caching forever
* any positive value: the number of seconds to cache an address for
*
* default value is 0. It can be set to some other value for
* performance reasons.
*/
private static int negativeCachePolicy = NEVER;
-
FOREVER
表示永久缓存,小于0的值都认为是永久缓存 -
NEVER
表示不缓存 -
DEFAULT_POSITIVE
表示默认缓存时间,这边是30秒 - 任意正整数(单位秒)
那java是如何设置缓存的Policy的呢,我们在InetAddressCachePolicy
这个类里面可以看出一些端倪
// Controls the cache policy for successful lookups only
private static final String cachePolicyProp = "networkaddress.cache.ttl";
private static final String cachePolicyPropFallback =
"sun.net.inetaddr.ttl";
// Controls the cache policy for negative lookups only
private static final String negativeCachePolicyProp =
"networkaddress.cache.negative.ttl";
private static final String negativeCachePolicyPropFallback =
"sun.net.inetaddr.negative.ttl";
这四个属性,是用来控制ttl时间,分别如下
-
networkaddress.cache.ttl
查询成功的缓存时间(第一优先级读取) -
sun.net.inetaddr.ttl
查询成功的缓存时间(第二优先级读取) -
networkaddress.cache.negative.ttl
查询失败的缓存时间(第一优先级读取) -
sun.net.inetaddr.negative.ttl
查询失败的缓存时间(第二优先级读取)
个人推断是为了做一些兼容,导致这个逻辑的产生,而且其获取的方式也不一样,第一优先级的属性用的是Security.getProperty(cachePolicyProp);
方式获取,第二优先级的是用的System.getProperty(cachePolicyPropFallback);
方式获取。我们从jdk附带的配置文件java.security
可以看出
#
# The Java-level namelookup cache policy for successful lookups:
#
# any negative value: caching forever
# any positive value: the number of seconds to cache an address for
# zero: do not cache
#
# default value is forever (FOREVER). For security reasons, this
# caching is made forever when a security manager is set. When a security
# manager is not set, the default behavior in this implementation
# is to cache for 30 seconds.
#
# NOTE: setting this to anything other than the default value can have
# serious security implications. Do not set it unless
# you are sure you are not exposed to DNS spoofing attack.
#
#networkaddress.cache.ttl=-1
# The Java-level namelookup cache policy for failed lookups:
#
# any negative value: cache forever
# any positive value: the number of seconds to cache negative lookup results
# zero: do not cache
#
# In some Microsoft Windows networking environments that employ
# the WINS name service in addition to DNS, name service lookups
# that fail may take a noticeably long time to return (approx. 5 seconds).
# For this reason the default caching policy is to maintain these
# results for 10 seconds.
#
#
networkaddress.cache.negative.ttl=10
所以有如下结论
- 成功查询的缓存security文件没有配置,java底层代码默认设置30s
- 失败查询security文件设置的是10s
如果我们需要自行设置该值,可以调用
Security.setProperty("networkaddress.cache.ttl","100");
Security.setProperty("networkaddress.cache.negative.ttl","100");
或者
System.setProperty("sun.net.inetaddr.ttl","100");
System.setProperty("sun.net.inetaddr.negative.ttl","100");
getAddressesFromNameService
关键代码行
addresses = nameService.lookupAllHostAddr(host);
而nameService的初始化如下
// create the impl
impl = InetAddressImplFactory.create();
// get name service if provided and requested
String provider = null;;
String propPrefix = "sun.net.spi.nameservice.provider.";
int n = 1;
nameServices = new ArrayList<NameService>();
provider = AccessController.doPrivileged(
new GetPropertyAction(propPrefix + n));
while (provider != null) {
NameService ns = createNSProvider(provider);
if (ns != null)
nameServices.add(ns);
n++;
provider = AccessController.doPrivileged(
new GetPropertyAction(propPrefix + n));
}
// if not designate any name services provider,
// create a default one
if (nameServices.size() == 0) {
NameService ns = createNSProvider("default");
nameServices.add(ns);
}
一般情况下,我们是不会自动以nameserver的,所以,其会落到这一步代码
NameService ns = createNSProvider("default");
nameService = new NameService() {
public InetAddress[] lookupAllHostAddr(String host)
throws UnknownHostException {
return impl.lookupAllHostAddr(host);
}
public String getHostByAddr(byte[] addr)
throws UnknownHostException {
return impl.getHostByAddr(addr);
}
};
impl
是一个接口InetAddressImpl
,也是挺诡异的,其有两个实现Inet4AddressImpl
和Inet6AddressImpl
,它们最终都是调用一个native方法
public native InetAddress[]
lookupAllHostAddr(String hostname) throws UnknownHostException
这是一个本地方法的调用,其会查询/etc/hosts和使用/etc/resolv.conf里面配置的nameserver来进行查询。而本地方法也会涉及一些DNS缓存的事情,这边就暂时不详细说明了。
结论
一般对Java应用程序而言,其DNS缓存分几层
- jdk里面的缓存,它会缓存成功查询和失败查询,成功查询默认缓存30s,失败查询默认缓存10s,所以一旦DNS查询失败,会有10s业务中断
- 平台有缓存,这里面的平台指的是linux或者windows平台,它们的nscd等服务点击参考会对查询域名进行一段时间的缓存,当然如果机器上面没有此类服务,则默认没有DNS缓存
- 域名服务器的缓存,由于存在不同的域名服务器,它们内部的缓存时间就不定了。