I assume the problem is in the AMD OpenCL driver because I can duplicate the bug only on a system with AMD boards; it does not occur on NVIDIA systems. I looked into the problem when I found I was occasionally running out of disk space for no good reason. I went to /var/log and looked at syslog. The first thing I noticed was the huge log size. I grepped for "error" and found millions of errors, all the same:
Jan 23 23:07:17 jysdualxeon kernel: [ 381.590086] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
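For anyone who wants to check their own log for the same message, here is a minimal Python sketch that counts matching lines. The function name `count_parser_errors` is made up for illustration, and the pattern is taken only from the single line quoted above, so it may need adjusting if other kernel versions word the message differently.

```python
import re

# Pattern built from the kernel message quoted above. The numeric code is
# captured so that codes other than -125 would be counted separately.
PARSER_ERROR = re.compile(
    r"\[drm:amdgpu_cs_ioctl \[amdgpu\]\] \*ERROR\* "
    r"Failed to initialize parser (-?\d+)!"
)


def count_parser_errors(lines):
    """Return a dict mapping error code -> number of matching log lines."""
    counts = {}
    for line in lines:
        match = PARSER_ERROR.search(line)
        if match:
            code = int(match.group(1))
            counts[code] = counts.get(code, 0) + 1
    return counts
```

It accepts any iterable of lines, so an open file works: `count_parser_errors(open("/var/log/syslog", errors="replace"))`. Reading that file usually needs root or membership in the `adm` group.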
A reboot stops the errors from being written. Other than everything being slow, the system seems to be working OK.
I can duplicate the error as follows: with a program running an OpenCL app, I run the tool "clinfo". It may have to be run as many as 3 times before the bug kicks in. Normally clinfo takes 1-2 seconds to complete. With another app using OpenCL, clinfo hangs: it may take 2 minutes to complete and give a report. There are no problems in the report; it just takes too long to generate. Looking at syslog, I see it growing by huge amounts in only seconds:
-rw-r----- 1 syslog adm 15171032 Jan 23 23:08 syslog
-rw-r----- 1 syslog adm 22775421 Jan 23 23:13 syslog
-rw-r----- 1 syslog adm 29911745 Jan 23 23:14 syslog
-rw-r----- 1 syslog adm 30435769 Jan 23 23:14 syslog
-rw-r----- 1 syslog adm 30711297 Jan 23 23:15 syslog
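To put a rough number on that growth, the sizes and times from the listing above can be fed into a short Python sketch. The `SAMPLES` list and the `growth` helper are names invented for this example, and `ls` only shows minutes, so the rate is an approximation.

```python
from datetime import datetime

# The five `ls -l` samples from the listing above: (time, size in bytes).
SAMPLES = [
    ("23:08", 15171032),
    ("23:13", 22775421),
    ("23:14", 29911745),
    ("23:14", 30435769),
    ("23:15", 30711297),
]


def growth(samples):
    """Return (bytes_added, minutes_elapsed) between first and last sample."""
    first_time, first_size = samples[0]
    last_time, last_size = samples[-1]
    elapsed = (datetime.strptime(last_time, "%H:%M")
               - datetime.strptime(first_time, "%H:%M"))
    return last_size - first_size, elapsed.total_seconds() / 60


added, minutes = growth(SAMPLES)
# Prints: 15540265 bytes in 7 min, about 2.2 MB/min
print(f"{added} bytes in {minutes:.0f} min, about {added / minutes / 1e6:.1f} MB/min")
```

The growth is not steady: most of the jump between the 23:13 and 23:14 samples (about 7 MB) happens within roughly a minute.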
With my small 128 GB SSD, the system dies in a few days unless I delete the files.
[EDIT] Also, the app that was actually using OpenCL dies when this happens.
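The reproduction steps above (running clinfo up to 3 times while another OpenCL app is active) could be timed with a small Python sketch like the one below. `time_command` is a hypothetical helper, not part of any tool. It simply measures wall-clock time per run, so a jump from 1-2 seconds to around 2 minutes would show up directly.

```python
import subprocess
import time


def time_command(command, runs=3):
    """Run `command` `runs` times; return each wall-clock duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.monotonic()
        # Output is discarded; only the elapsed time matters here.
        subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
        durations.append(time.monotonic() - start)
    return durations
```

With an OpenCL app already running, `time_command(["clinfo"])` returns a list of three durations to compare against the normal 1-2 seconds.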