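On CentOS 7, when the system misbehaves, for example when the iSCSI software stack fails under sustained I/O, it can emit a large, continuous burst of log messages, like a brief but violent storm, and some of those messages may even be dropped.

During debugging all of these logs are needed, so the log drops have to be eliminated. The fix is to edit /etc/rsyslog.conf and add the following lines: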

$SystemLogRateLimitInterval 0

$SystemLogRateLimitBurst 0

$IMUXSockRateLimitInterval 0

$IMJournalRatelimitInterval 0
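
The edit above can also be scripted. The following is a minimal sketch, assuming a POSIX shell; ensure_directive and apply_rsyslog_limits are hypothetical helper names, not part of rsyslog. It appends each directive only when no active line for it exists yet, so running it twice does not duplicate lines.

```shell
#!/bin/sh
# Hypothetical helper: append "$<name> <value>" to a config file unless an
# active (uncommented) line for that directive is already present.
ensure_directive() {
    ed_conf=$1; ed_name=$2; ed_value=$3
    # \$ matches the literal dollar sign that starts a legacy directive.
    # -i ignores case so a differently-cased existing line is not duplicated.
    if grep -Eiq '^[[:space:]]*\$'"$ed_name"'[[:space:]]' "$ed_conf"; then
        return 0
    fi
    printf '$%s %s\n' "$ed_name" "$ed_value" >> "$ed_conf"
}

# Hypothetical helper: apply the four rate-limit directives from this note.
apply_rsyslog_limits() {
    ar_conf=$1
    for ar_name in SystemLogRateLimitInterval SystemLogRateLimitBurst \
                   IMUXSockRateLimitInterval IMJournalRatelimitInterval; do
        ensure_directive "$ar_conf" "$ar_name" 0
    done
}
```

As root this would be called as apply_rsyslog_limits /etc/rsyslog.conf. A directive that is already present is left untouched, so its value still has to be checked by hand. Running rsyslogd -N1 afterwards checks the configuration syntax without starting the daemon.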

We also want logs to reach disk promptly, so the journald configuration file /etc/systemd/journald.conf must be changed as well. Update the following lines:

Storage=persistent

RateLimitInterval=0

RateLimitBurst=0

SyncIntervalSec=2
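
The same change can be made non-interactively. This is a sketch under stated assumptions: set_journald_key and apply_journald_settings are hypothetical helpers, the file is assumed to contain only the [Journal] section (so appending at the end is safe), and the values contain no "|" or "&" characters.

```shell
#!/bin/sh
# Hypothetical helper: set key=value in a journald.conf-style file.
# An existing line for the key (commented out or active) is rewritten;
# otherwise the setting is appended at the end of the file.
set_journald_key() {
    sj_file=$1; sj_key=$2; sj_value=$3
    if grep -q "^[#[:space:]]*${sj_key}=" "$sj_file"; then
        sj_tmp=$(mktemp) || return 1
        # Write through a temp file, then copy back so the original
        # file keeps its owner and permissions.
        sed "s|^[#[:space:]]*${sj_key}=.*|${sj_key}=${sj_value}|" \
            "$sj_file" > "$sj_tmp" && cat "$sj_tmp" > "$sj_file"
        rm -f "$sj_tmp"
    else
        printf '%s=%s\n' "$sj_key" "$sj_value" >> "$sj_file"
    fi
}

# Hypothetical helper: apply the four settings from this note.
apply_journald_settings() {
    aj_file=$1
    set_journald_key "$aj_file" Storage persistent
    set_journald_key "$aj_file" RateLimitInterval 0
    set_journald_key "$aj_file" RateLimitBurst 0
    set_journald_key "$aj_file" SyncIntervalSec 2
}
```

As root this would be called as apply_journald_settings /etc/systemd/journald.conf, followed by the service restart described further down.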

In addition, disable the write cache on the disk that holds the logs:

[root@192.168.1.84:~]$ hdparm -W 0 /dev/sda

/dev/sda:

setting drive write-caching to 0 (off)

write-caching =  0 (off)
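
To confirm the setting later, hdparm -W with no value queries the current state. The sketch below checks that output in a script; write_cache_is_off is a hypothetical helper, and it assumes the output format shown above.

```shell
#!/bin/sh
# Hypothetical helper: read `hdparm -W <device>` output on stdin and
# succeed only if it reports write-caching as 0 (off).
write_cache_is_off() {
    grep -Eq 'write-caching[[:space:]]*=[[:space:]]*0[[:space:]]*\(off\)'
}
```

A typical call would be: hdparm -W /dev/sda | write_cache_is_off || echo "write cache still enabled". On some drives the setting may not survive a power cycle, so it is worth re-checking after a reboot.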

Finally, restart the services:

systemctl daemon-reload

[root@localhost etc]# systemctl restart systemd-journald.service

[root@localhost etc]# systemctl restart rsyslog.service

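Alternatively, reboot the machine for the changes to take effect. On my machine, the problem seen across several earlier reboots, where kernel error messages and many other log entries were lost, was fully resolved by the changes above.

References: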
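
One way to check that nothing is dropped any more is to send a numbered burst of messages and look for gaps. This is only a sketch: send_burst and missing_seqs are hypothetical helpers, and the tag droptest is an arbitrary name chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helper: emit <count> numbered messages under <tag>
# through logger(1).
send_burst() {
    sb_tag=$1; sb_count=$2; sb_i=1
    while [ "$sb_i" -le "$sb_count" ]; do
        logger -t "$sb_tag" "seq=$sb_i"
        sb_i=$((sb_i + 1))
    done
}

# Hypothetical helper: read log lines on stdin and print every sequence
# number in 1..<count> that does not appear in any "seq=<n>" field.
missing_seqs() {
    ms_count=$1
    awk -v n="$ms_count" '
        match($0, /seq=[0-9]+/) {
            # Skip the 4-character "seq=" prefix and keep the number.
            seen[substr($0, RSTART + 4, RLENGTH - 4) + 0] = 1
        }
        END { for (i = 1; i <= n; i++) if (!(i in seen)) print i }
    '
}
```

For example, run send_burst droptest 5000, wait a few seconds for the sync interval, then run journalctl SYSLOG_IDENTIFIER=droptest -o cat | missing_seqs 5000. The same check works against the rsyslog side with grep droptest /var/log/messages | missing_seqs 5000. Empty output means every message arrived.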

1. man journald.conf

RateLimitInterval=, RateLimitBurst=

Configures the rate limiting that is applied to all messages generated on the system. If, in the time interval defined by RateLimitInterval=, more messages than specified in RateLimitBurst= are logged by a service, all further messages within the interval are dropped until the interval is over. A message about the number of dropped messages is generated. This rate limiting is applied per-service, so that two services which log do not interfere with each other's limits. Defaults to 1000 messages in 30s. The time specification for RateLimitInterval= may be specified in the following units: "s", "min", "h", "ms", "us". To turn off any kind of rate limiting, set either value to 0.

SyncIntervalSec=

The timeout before synchronizing journal files to disk. After syncing, journal files are placed in the OFFLINE state. Note that syncing is unconditionally done immediately after a log message of priority CRIT, ALERT or EMERG has been logged. This setting hence applies only to messages of the levels ERR, WARNING, NOTICE, INFO, DEBUG. The default timeout is 5 minutes.

2. man journalctl

3.