I have been using this inference server successfully, but it suddenly started serving from what seems to be a Unix socket instead of an HTTP server. I installed the repo and flash-attention manually, ...