diff --git a/python/tool-speechtotext/README.md b/python/tool-speechtotext/README.md
new file mode 100644
index 0000000..12bfa59
--- /dev/null
+++ b/python/tool-speechtotext/README.md
@@ -0,0 +1,19 @@
+
+# Purpose
+Command-line speech-to-text utility that leverages Ollama and a local speech-to-text model.
+
+## Setup
+
+```bash
+# Create the environment with Python 3.10 and CUDA toolkit
+mamba create -n whisper-ollama python=3.10 nvidia/label/cuda-12.2.0::cuda-toolkit cudnn -c nvidia -c conda-forge -y
+
+# Activate the environment
+mamba activate whisper-ollama
+
+# Install audio and logic dependencies
+# Note: PortAudio is required for sounddevice to work on Linux
+sudo apt-get update && sudo apt-get install libportaudio2 -y
+
+pip install faster-whisper sounddevice numpy pyperclip requests
+```
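
After running the setup steps, a quick check that the packages resolve inside the activated environment can catch a missed install early. The following is a minimal sketch and is not part of the README in this diff: `REQUIRED_MODULES` and `find_missing` are hypothetical names, and the import names are assumed to match the pip packages listed in the install command.

```python
# Post-install sanity check (illustrative sketch, not part of the original README).
import importlib.util

# Assumption: these import names correspond to the pip packages in the setup step
# (faster-whisper, sounddevice, numpy, pyperclip, requests).
REQUIRED_MODULES = ["faster_whisper", "sounddevice", "numpy", "pyperclip", "requests"]


def find_missing(modules):
    """Return the module names that cannot be located in the active environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

This only checks that each module can be located; it does not import anything. A located `sounddevice` may still fail at import time if the PortAudio system library is absent, so the `apt-get` step remains necessary on Linux.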