# Too Many Open Files with fluent-logger

We use fluentd (td-agent) to persist the access logs of our service. Fluentd is one of the most reliable and flexible pieces of middleware for collecting all kinds of application logs, and the many plugins maintained by the community make it easy to broaden what we log.

As is often the case with middleware libraries, we ran into unexpected behavior in fluentd because we did not fully understand the library: our Rails application often loses its application logs!

This article illustrates why that problem happens and what to watch out for when using fluent-logger in a Rails application.

# Problem

Our Rails application has a controller that collects its access log.

```ruby
require 'fluent-logger'

class OurController < ApplicationController
  before_action :setup_fluent_logger
  before_action :access_log

  def setup_fluent_logger
    @log = Fluent::Logger::FluentLogger.new(
      nil,
      host: 'localhost',
      port: 24_224
    )
  end

  def access_log
    record = ...
    @log.post("test.access", record)
  end
end
```


Initially, I thought that recreating the fluent-logger instance on every request was a safe way to write the log. That turned out not to be true: this controller fails to post logs when it receives many requests from clients.
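
As an aside, fluent-logger's post returns false when it cannot write the record to td-agent, so checking the return value is a simple way to make the loss visible while debugging. A minimal sketch of that check (mine, not part of the original controller):

```ruby
require 'fluent-logger'

logger = Fluent::Logger::FluentLogger.new(nil, host: 'localhost', port: 24_224)

# post returns false when the record could not be written to td-agent,
# so a silently dropped record can at least be reported somewhere else.
unless logger.post('test.access', { path: '/example' })
  warn 'failed to post the access log record'
end
```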

# Cause

To track down the cause, I wrote a script that reproduces the issue.

```ruby
require 'singleton'
require 'fluent-logger'

class OurController
  include Singleton

  def initialize
    @counter = 0
    setup_logger
  end

  def setup_logger
    @logger = Fluent::Logger::FluentLogger.new(
      nil,
      host: 'localhost',
      port: 24_224
    )
  end

  def post(v)
    @logger.post('debug.test', {
      a: @counter,
      v: v
    })
    @counter += 1
  end
end

puts "Start..."

def post(i)
  controller = OurController.instance
  controller.setup_logger
  controller.post(i)
end

thread_num = 256
threads = []

thread_num.times do |i|
  threads << Thread.new {
    post(i)
  }
end

threads.each do |t|
  puts "Waiting for #{t}"
  t.join
end
```


When I ran the program with thread_num = 256, it failed to post logs due to a connection failure to the local td-agent, but it succeeded with thread_num = 200. There had to be an implicit threshold between 200 and 256, I thought. Here is what I found.

```
➜ launchctl limit
cpu         unlimited      unlimited
filesize    unlimited      unlimited
data        unlimited      unlimited
stack       8388608        67104768
core        0              unlimited
memlock     unlimited      unlimited
maxproc     2784           4176
maxfiles    256            unlimited
```
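
The same limit can also be read from inside a Ruby process with Process.getrlimit, which is handy for confirming the soft limit the reproduction script actually runs against (this check is my addition, not part of the original script):

```ruby
# Print the soft and hard limits on open file descriptors for this process.
soft, hard = Process.getrlimit(:NOFILE)
puts "soft limit: #{soft}" # 256 in a default macOS shell
puts "hard limit: #{hard}"
```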


The script had hit the maximum number of open file descriptors on the system (maxfiles = 256). Around the same time, td-agent threw the Too many open files error.

```
2020-11-11 12:06:17 +0900 [warn]: #0 [forward_input] thread exited by unexpected error plugin=Fluent::Plugin::ForwardInput title=:event_loop error_class=Errno::EMFILE error="Too many open files - accept(2)"
...
/Users/sasaki/.rbenv/versions/2.5.4/lib/ruby/2.5.0/socket.rb:1313:in `__accept_nonblock': Too many open files - accept(2) (Errno::EMFILE)
```


Initializing a fluent-logger object opens a new file descriptor each time, which is what caused the Too many open files error. It looks like the event library Cool.io creates a file descriptor when it starts its event loop:

```ruby
def start
  super

  # event loop does not run here, so mutex lock is not required
  thread_create :event_loop do
    begin
      default_watcher = DefaultWatcher.new
      event_loop_attach(default_watcher)
      @_event_loop_running = true
      @_event_loop.run(@_event_loop_run_timeout) # this method blocks
    ensure
      @_event_loop_running = false
    end
  end
end
```
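
To confirm that each logger instance really holds on to a descriptor, a rough check is to count the process's open file descriptors before and after creating a batch of loggers. This is my own sketch, assuming /dev/fd is available (it is on macOS and Linux) and a local td-agent is listening on port 24224:

```ruby
require 'fluent-logger'

# Count this process's open file descriptors via /dev/fd.
def open_fd_count
  Dir.children('/dev/fd').size
end

before = open_fd_count

# Keep references to the loggers so GC does not close their sockets.
loggers = 10.times.map do |i|
  logger = Fluent::Logger::FluentLogger.new(nil, host: 'localhost', port: 24_224)
  logger.post('debug.test', count: i) # posting ensures the TCP socket is opened
  logger
end

puts "open fds: #{before} -> #{open_fd_count}" # grows by roughly one per logger
```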


Since we have no way to modify the Cool.io codebase quickly, the most we can do is avoid recreating the fluent-logger instance. In this case, the setup_fluent_logger method looks as follows. Note that we use ||= to initialize @log instead of a plain = assignment.

```ruby
class OurController < ApplicationController
  def setup_fluent_logger
    @log ||= Fluent::Logger::FluentLogger.new(
      nil,
      host: 'localhost',
      port: 24_224
    )
  end
end
```


In general, we cannot guarantee the order in which before_action callbacks are called, and we should also have made `@log` thread-safe. I would stress that this problem is not attributable to fluentd or Rails itself; it was caused by our own programming practice. I have learned another lesson here.
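
For completeness, here is one way to make the shared logger explicit and thread-safe: memoize a single process-wide instance behind a Mutex instead of relying on a controller instance variable. This is my own sketch of the idea, not the code we actually shipped:

```ruby
require 'fluent-logger'

# A process-wide logger shared by all requests. The Mutex guards the
# lazy initialization so concurrent requests cannot race on it.
module AccessLogger
  MUTEX = Mutex.new

  def self.logger
    MUTEX.synchronize do
      @logger ||= Fluent::Logger::FluentLogger.new(
        nil,
        host: 'localhost',
        port: 24_224
      )
    end
  end
end

class OurController < ApplicationController
  before_action :access_log

  def access_log
    record = { path: request.path } # hypothetical record contents
    AccessLogger.logger.post('test.access', record)
  end
end
```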