Fuse crash with FuseWriteCache

Hello,

I would like to report a bug we hit on Citrine. Recently EOSD has been crashing occasionally, and the crash is difficult to reproduce. It still occurs after we updated to version 4.2.22. Today I checked the core dump, which shows:
#0 0x000000000050326e in typeinfo for filesystem ()
#1 0x0000000000482991 in WaitAsyncIO (this=0x7f5a4a249310) at /usr/src/debug/eos-4.2.22-1/fuse/FuseCache/LayoutWrapper.hh:209
#2 FileAbstraction::WaitFinishWrites (this=0x7f5a4a249310) at /usr/src/debug/eos-4.2.22-1/fuse/FuseCache/FileAbstraction.cc:222
#3 0x000000000047eb42 in FuseWriteCache::ForceAllWrites (this=0x7f5a64c96f00, fabst=0x7f5a4a249310, wait=Unhandled dwarf expression opcode 0xf3
)
at /usr/src/debug/eos-4.2.22-1/fuse/FuseCache/FuseWriteCache.cc:320
#4 0x0000000000459bd3 in filesystem::close (this=0x749be8, fildes=6, inode=14590874977566720, uid=61, gid=140, pid=30237)
at /usr/src/debug/eos-4.2.22-1/fuse/filesystem.cc:3588
#5 0x0000000000442b00 in EosFuse::release (req=0x7f5a4dc6f280, ino=14590874977566720, fi=0x7f5a4c3fec80)
at /usr/src/debug/eos-4.2.22-1/fuse/eosfuse.cc:1535
#6 0x0000003d46814ed2 in ?? () from /lib64/libfuse.so.2
#7 0x0000003d468120ef in ?? () from /lib64/libfuse.so.2
#8 0x00000038eea07aa1 in start_thread () from /lib64/libpthread.so.0
#9 0x00000038ee2e8bcd in clone () from /lib64/libc.so.6

Cheers, yaodong@IHEP

The bug still exists in version 4.2.24. Has anyone at another site run into this problem? Thanks.

#0 0x0000000000503d1e in typeinfo for filesystem ()
#1 0x0000000000482851 in WaitAsyncIO (this=0x7ff50ac9a210) at /usr/src/debug/eos-4.2.24-1/fuse/FuseCache/LayoutWrapper.hh:209
#2 FileAbstraction::WaitFinishWrites (this=0x7ff50ac9a210) at /usr/src/debug/eos-4.2.24-1/fuse/FuseCache/FileAbstraction.cc:222
#3 0x000000000047e942 in FuseWriteCache::ForceAllWrites (this=0x7ff527f9c200, fabst=0x7ff50ac9a210, wait=Unhandled dwarf expression opcode 0xf3) at /usr/src/debug/eos-4.2.24-1/fuse/FuseCache/FuseWriteCache.cc:320
#4 0x0000000000458178 in filesystem::truncate (this=0x749be8, fildes=Unhandled dwarf expression opcode 0xf3) at /usr/src/debug/eos-4.2.24-1/fuse/filesystem.cc:3721
#5 0x000000000046a1bc in filesystem::truncate2 (this=0x749be8, fullpath=0x7ff50ac7c528 <Address 0x7ff50ac7c528 out of bounds>, inode=Unhandled dwarf expression opcode 0xf3) at /usr/src/debug/eos-4.2.24-1/fuse/filesystem.cc:3767
#6 0x0000000000439e5d in EosFuse::setattr (req=0x7ff504bff900, ino=73406963509100544, attr=0x7ff4ea7febf0, to_set=8, fi=Unhandled dwarf expression opcode 0xf3) at /usr/src/debug/eos-4.2.24-1/fuse/eosfuse.cc:455

Pretty sure we’ve seen this too. We are currently running 4.2.24 and ended up just turning the fuse cache off to avoid it.

I have seen other people report the crash, though. Not sure whether it’s been resolved in later versions?

I tried to turn off the fuse cache by setting EOS_FUSE_CACHE=0, but it did not seem to work. Could you please tell me how you did it? Thank you very much.

Sorry!! I totally forgot to reply -

our eosd config has:

export EOS_FUSE_KERNELCACHE=0
export EOS_FUSE_CACHE=0

i hope that helps!
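
If the settings still seem to have no effect, it may be worth checking whether the running eosd process actually started with them. A minimal sketch, assuming Linux, where /proc/<pid>/environ holds the NUL-separated environment a process was started with. The helper name eos_fuse_env and the pgrep lookup in the comment are illustrative assumptions, not part of EOS:

```shell
#!/bin/sh
# Print the EOS_FUSE_* variables found in a NUL-separated environment file,
# such as /proc/<pid>/environ on Linux. (Helper name is hypothetical.)
eos_fuse_env() {
    # Turn NUL separators into newlines, keep only EOS_FUSE_* entries.
    tr '\0' '\n' < "$1" | grep '^EOS_FUSE_' | sort
}

# Possible usage (assumes the daemon's process name is exactly "eosd"):
#   eos_fuse_env "/proc/$(pgrep -o -x eosd)/environ"
```

Variables exported after the daemon was started will not show up in this file, so an empty result would suggest the exports were not in place when eosd was launched.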