speechmatics.models¶
Data models and message types used by the library.
- class speechmatics.models._TranscriptionConfig(language=None, **kwargs)[source]¶
Base model for defining transcription parameters.
- additional_vocab: dict = None¶
Additional vocabulary that is not part of the standard language.
- diarization: str = None¶
Indicates type of diarization to use, if any.
- domain: str = None¶
Optionally request a language pack optimized for a specific domain, e.g. ‘finance’
- enable_entities: bool = None¶
Indicates if inverse text normalization entity output is enabled.
- language: str = 'en'¶
ISO 639-1 language code, e.g. en
- operating_point: str = None¶
Specifies which acoustic model to use.
- output_locale: str = None¶
RFC-5646 language code for transcript output. eg. en-AU
- punctuation_overrides: dict = None¶
Permitted punctuation marks for advanced punctuation.
- class speechmatics.models.AudioSettings(encoding: Optional[str] = None, sample_rate: int = 44100, chunk_size: int = 4096)[source]¶
Real-time: Defines audio parameters.
- chunk_size: int = 4096¶
Chunk size.
- encoding: str = None¶
Encoding format when raw audio is used. Allowed values are pcm_f32le, pcm_s16le and mulaw.
- sample_rate: int = 44100¶
Sampling rate in hertz.
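A quick sketch of how chunk_size and sample_rate relate when streaming raw audio. The arithmetic below assumes mono pcm_s16le (2 bytes per sample); it is an illustration of the parameters above, not part of the AudioSettings API.

```python
BYTES_PER_SAMPLE = 2  # pcm_s16le: signed 16-bit little-endian, mono assumed

def chunk_duration_seconds(chunk_size: int, sample_rate: int) -> float:
    """Approximate duration of audio carried by one raw pcm_s16le chunk."""
    samples_per_chunk = chunk_size / BYTES_PER_SAMPLE
    return samples_per_chunk / sample_rate

# With the defaults (chunk_size=4096, sample_rate=44100) each chunk holds
# roughly 46 ms of audio.
duration = chunk_duration_seconds(4096, 44100)
```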
- class speechmatics.models.BatchSpeakerDiarizationConfig(speaker_sensitivity: Optional[int] = None)[source]¶
Batch mode: Speaker diarization config.
- speaker_sensitivity: int = None¶
The sensitivity of the speaker detection.
- class speechmatics.models.BatchTranscriptionConfig(language=None, **kwargs)[source]¶
Batch: Defines transcription parameters for batch requests. The .as_config() method returns it wrapped in a Speechmatics JSON config.
- channel_diarization_labels: List[str] = None¶
Add your own speaker or channel labels to the transcript
- fetch_data: speechmatics.models.FetchData = None¶
Optional configuration for fetching file for transcription.
- notification_config: speechmatics.models.NotificationConfig = None¶
Optional configuration for callback notification.
- speaker_diarization_config: speechmatics.models.BatchSpeakerDiarizationConfig = None¶
Optional configuration for speaker diarization.
- srt_overrides: speechmatics.models.SRTOverrides = None¶
Optional configuration for SRT output.
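A hedged sketch of the JSON shape that .as_config() wraps a batch request into. The exact serialization is produced by the library; the keys below mirror the attributes documented above, and the values (URL, language) are placeholders.

```python
import json

# Illustrative only: a batch transcription config as a plain dict,
# mirroring the documented BatchTranscriptionConfig attributes.
config = {
    "type": "transcription",
    "transcription_config": {
        "language": "en",
        "diarization": "speaker",
    },
    # notification_config carries the callback settings documented below.
    "notification_config": [
        {"url": "https://example.com/callback", "method": "post"}
    ],
}

# The config is sent to the batch API as a JSON payload.
payload = json.dumps(config)
```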
- class speechmatics.models.ClientMessageType(value)[source]¶
Real-time: Defines various messages sent from client to server.
- AddAudio = 'AddAudio'¶
Adds more audio data to the recognition job. The server confirms receipt by sending a ServerMessageType.AudioAdded message.
- EndOfStream = 'EndOfStream'¶
Indicates that the client has no more audio to send.
- SetRecognitionConfig = 'SetRecognitionConfig'¶
Allows the client to re-configure the recognition session.
- StartRecognition = 'StartRecognition'¶
Initiates a recognition job based on configuration set previously.
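The message names above appear in the JSON frames a client sends over the websocket. Below is a hedged sketch of a StartRecognition frame; the message name comes from the enum above, while the payload shape (audio_format, transcription_config fields) is illustrative.

```python
import json

# Illustrative first message of a real-time session. The "message" value
# matches ClientMessageType.StartRecognition; the rest is an assumed shape.
start_message = {
    "message": "StartRecognition",
    "audio_format": {
        "type": "raw",
        "encoding": "pcm_s16le",
        "sample_rate": 16000,
    },
    "transcription_config": {"language": "en"},
}

# Frames are sent as UTF-8 encoded JSON over the websocket.
wire_bytes = json.dumps(start_message).encode("utf-8")
```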
- class speechmatics.models.ConnectionSettings(url: str, message_buffer_size: int = 512, ssl_context: ssl.SSLContext = <factory>, semaphore_timeout_seconds: float = 120, ping_timeout_seconds: float = 60, auth_token: typing.Optional[str] = None, generate_temp_token: typing.Optional[bool] = False)[source]¶
Defines connection parameters.
- auth_token: str = None¶
Auth token used to authenticate the customer. This token is only applicable for RT-SaaS.
- generate_temp_token: Optional[bool] = False¶
Automatically generate a temporary token for authentication. Non-enterprise customers must set this to True. Enterprise customers should set this to False.
- message_buffer_size: int = 512¶
Message buffer size in bytes.
- ping_timeout_seconds: float = 60¶
Ping-pong timeout in seconds.
- semaphore_timeout_seconds: float = 120¶
Semaphore timeout in seconds.
- ssl_context: ssl.SSLContext¶
SSL context.
- url: str¶
Websocket server endpoint.
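The ssl_context field defaults to a factory-produced context. A minimal sketch of an equivalent secure default for a wss:// endpoint, assuming the standard library's defaults are acceptable; the URL is a placeholder, not a real Speechmatics endpoint.

```python
import ssl

# Placeholder endpoint for illustration.
url = "wss://example.com/v2"

# ssl.create_default_context() verifies server certificates and hostnames,
# a sensible default for a websocket connection over TLS.
ssl_context = ssl.create_default_context()
```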
- class speechmatics.models.FetchData(url: str, auth_headers: Optional[str] = None)[source]¶
Batch: Optional configuration for fetching file for transcription.
- auth_headers: str = None¶
A list of additional headers to be added to the input fetch request when using http or https. This is intended to support authentication or authorization, for example by supplying an OAuth2 bearer token.
- url: str¶
URL to fetch
- class speechmatics.models.NotificationConfig(url: str, contents: Optional[str] = None, method: str = 'post', auth_headers: Optional[str] = None)[source]¶
Batch: Optional configuration for callback notification.
- auth_headers: str = None¶
A list of additional headers to be added to the notification request when using http or https. This is intended to support authentication or authorization, for example by supplying an OAuth2 bearer token.
- contents: str = None¶
Specifies a list of items to be attached to the notification message. When multiple items are requested, they are included as named file attachments.
- method: str = 'post'¶
The HTTP(S) method to be used. Only post and put are supported.
- url: str¶
URL for notification. The id and status query parameters will be added.
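As noted above, the id and status query parameters are appended to the notification URL. A stdlib sketch of that expansion; the job id and status values here are made up for illustration.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Base callback URL as configured in NotificationConfig.url (placeholder).
base_url = "https://example.com/callback"

# Hypothetical job id and status, appended as query parameters.
notified_url = f"{base_url}?{urlencode({'id': 'abc123', 'status': 'done'})}"

# A receiving endpoint can recover both values from the query string.
params = parse_qs(urlparse(notified_url).query)
```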
- class speechmatics.models.RTSpeakerDiarizationConfig(max_speakers: Optional[int] = None)[source]¶
Real-time mode: Speaker diarization config.
- max_speakers: int = None¶
This enforces the maximum number of speakers allowed in a single audio stream.
- class speechmatics.models.SRTOverrides(max_line_length: int = 37, max_lines: int = 2)[source]¶
Batch: Optional configuration for SRT output.
- max_line_length: int = 37¶
Maximum number of characters per subtitle line, including whitespace.
- max_lines: int = 2¶
Maximum number of lines in a subtitle section.
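A sketch of how the two SRT limits constrain a subtitle block: text is wrapped to max_line_length characters and at most max_lines are kept per section. This mirrors the documented constraints only, not the library's actual splitting logic.

```python
import textwrap

# Defaults documented for SRTOverrides.
MAX_LINE_LENGTH = 37
MAX_LINES = 2

def to_subtitle_lines(text: str) -> list:
    """Wrap text to the SRT line-length limit and cap the line count."""
    lines = textwrap.wrap(text, width=MAX_LINE_LENGTH)
    return lines[:MAX_LINES]

lines = to_subtitle_lines(
    "The quick brown fox jumps over the lazy dog near the riverbank"
)
```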
- class speechmatics.models.ServerMessageType(value)[source]¶
Real-time: Defines various message types sent from server to client.
- AddPartialTranscript = 'AddPartialTranscript'¶
Indicates a partial transcript, which is an incomplete transcript that is immediately produced and may change as more context becomes available.
- AddTranscript = 'AddTranscript'¶
Indicates the final transcript of a part of the audio.
- AudioAdded = 'AudioAdded'¶
Server response to ClientMessageType.AddAudio, indicating that audio has been added successfully.
- EndOfTranscript = 'EndOfTranscript'¶
Server response to ClientMessageType.EndOfStream, sent after the server has finished sending all AddTranscript messages.
- Error = 'Error'¶
Indicates a generic error message.
- Info = 'Info'¶
Indicates a generic info message.
- RecognitionStarted = 'RecognitionStarted'¶
Server response to ClientMessageType.StartRecognition, acknowledging that a recognition session has started.
- Warning = 'Warning'¶
Indicates a generic warning message.
- class speechmatics.models.TranscriptionConfig(language=None, **kwargs)[source]¶
Real-time: Defines transcription parameters.
- enable_partials: bool = None¶
Indicates if partial transcription, where words are produced immediately, is enabled.
- max_delay: float = None¶
Maximum acceptable delay.
- max_delay_mode: str = None¶
Determines whether the threshold specified in max_delay can be exceeded if a potential entity is detected. Flexible means that if a potential entity is detected, max_delay can be overridden until the end of that entity. Fixed means that max_delay is applied as specified, ignoring any potential entity that would not be completed within that threshold.
- speaker_change_sensitivity: float = None¶
Sensitivity level for speaker change.
- speaker_diarization_config: speechmatics.models.RTSpeakerDiarizationConfig = None¶
Configuration for speaker diarization.
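The real-time parameters above can be changed mid-session via ClientMessageType.SetRecognitionConfig. A hedged sketch of such a frame; the "message" name comes from the client message enum, the field names mirror the attributes documented here, and the values are illustrative.

```python
import json

# Illustrative mid-session reconfiguration frame.
set_config_message = {
    "message": "SetRecognitionConfig",
    "transcription_config": {
        "language": "en",
        "enable_partials": True,   # request AddPartialTranscript messages
        "max_delay": 2.0,          # maximum acceptable delay in seconds
        "max_delay_mode": "flexible",
    },
}

encoded = json.dumps(set_config_message)
```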