* chore: add an LRU cache to api v3 to speed up inference when switching model weights
* chore: start api v3 from the Dockerfile
* chore: change api default host from 127.0.0.1 to 0.0.0.0
* chore: make gpu happy when doing tts
* chore: roll back Dockerfile
* chore: fix
* chore: fix
---------
Co-authored-by: kevin.zhang <kevin.zhang@cardinfolink.com>