# Local whisper service

This project runs a [Faster Whisper](https://github.com/SYSTRAN/faster-whisper) model locally and exposes it through a local REST endpoint.
Launch the service by running `python faster-whisper.py`.

## Setup

### Prerequisites

- [Python 3](https://wiki.python.org/moin/BeginnersGuide/Download)
- [Pip](https://pip.pypa.io/en/stable/installation/)
- **Windows**
  - [FFmpeg](https://www.gyan.dev/ffmpeg/builds/) - run `winget install ffmpeg` to install the package
- **Linux**
  - `apt-get -y install python3-pyaudio portaudio19-dev` - needed to install `pyaudio`
  - `apt-get -y install ffmpeg libavcodec-extra` - needed for `pydub`
  - `apt-get -y install cudnn9-cuda-12` - first follow the instructions in [NVIDIA cuDNN on Linux](https://docs.nvidia.com/deeplearning/cudnn/latest/installation/linux.html) to enable the NVIDIA network repository
- **macOS**
  - `brew install portaudio` - needed to install `pyaudio`
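
The command-line prerequisites above can be checked before installing. The following is a minimal sketch, not part of the repository: the helper name `missing_tools` is hypothetical, and it only looks for executables on `PATH`. It does not detect system libraries such as `portaudio` or cuDNN.

```python
import shutil
import sys


def missing_tools(tools, which=shutil.which):
    """Return the subset of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    # Python 3 is listed as a prerequisite; fail early on anything older.
    if sys.version_info < (3,):
        sys.exit("Python 3 is required")

    # pip is needed on every platform; ffmpeg is listed for Windows and Linux only.
    tools = ["pip"]
    if sys.platform != "darwin":
        tools.append("ffmpeg")

    missing = missing_tools(tools)
    if missing:
        print("Missing from PATH: " + ", ".join(missing))
    else:
        print("Found on PATH: " + ", ".join(tools))
```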

### Install

Option 1: Setup script

- Windows:
  - Run [./setup.cmd](./setup.cmd)
- macOS/Linux:
  - Run `source setup.sh`

Option 2: Manual steps

- Create and activate a Python virtual environment.
- `pip config --site set global.extra-index-url https://download.pytorch.org/whl/cu121`
- `pip install -r requirements.txt`
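
As a rough illustration, the manual steps could be scripted as below. This is a sketch under assumptions, not the contents of `setup.cmd` or `setup.sh`: the `.venv` directory name and both function names are hypothetical. Instead of activating the environment, it calls the environment's own interpreter with `-m pip`, which should have the same effect for these commands.

```python
import os
import subprocess
import sys

# Index URL copied from the manual steps above.
EXTRA_INDEX_URL = "https://download.pytorch.org/whl/cu121"


def manual_install_commands(venv_dir=".venv", windows=(os.name == "nt")):
    """Build the manual install commands, in the order they should run."""
    # A virtual environment keeps its executables in Scripts (Windows) or bin.
    bin_dir = "Scripts" if windows else "bin"
    python = os.path.join(venv_dir, bin_dir, "python")
    return [
        # Create the virtual environment with the current interpreter.
        [sys.executable, "-m", "venv", venv_dir],
        # --site stores the setting in the environment's own pip config.
        [python, "-m", "pip", "config", "--site", "set",
         "global.extra-index-url", EXTRA_INDEX_URL],
        [python, "-m", "pip", "install", "-r", "requirements.txt"],
    ]


def run_manual_install(venv_dir=".venv"):
    """Run each command and stop at the first failure."""
    for command in manual_install_commands(venv_dir):
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch is safe to try.
    for command in manual_install_commands():
        print(" ".join(command))
```

Calling `run_manual_install()` from the folder that contains `requirements.txt` would execute the steps.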

### Verify Setup

Verify the installation by starting the backend: `python faster-whisper.py`.
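
Once the backend is running, one way to confirm it is reachable is to wait for its port to accept TCP connections. The sketch below is an assumption-laden helper, not part of the project: `wait_for_port` is a hypothetical name, and this README does not state the port, so it must be taken from the output or source of `faster-whisper.py`. A successful connection only shows that something is listening; it does not prove that transcription works.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once host:port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Open and immediately close a connection; success means a listener exists.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused or timed out: give up after the deadline, otherwise retry.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, `wait_for_port("127.0.0.1", port)` returns `True` as soon as the service listens on `port`, where `port` is whatever the service reports at startup.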

## Connecting to the running service

You can connect to the whisper service using the example "whisperClient" project in the `ts/examples` folder. To use it:

- Go to the repo's [ts/examples/whisperClient](../../../ts/examples/whisperClient/) folder
- Build the project using `pnpm run build`
- Start the web UI using `pnpm run start`

This web client captures audio from the microphone, sends it to the local service for transcription, and shows the result.