speechmatics.models¶
Data models and message types used by the library.
- class speechmatics.models._TranscriptionConfig(language=None, **kwargs)[source]¶
Base model for defining transcription parameters.
- additional_vocab: dict = None¶
Additional vocabulary that is not part of the standard language.
- diarization: str = None¶
Indicates type of diarization to use, if any.
- domain: str = None¶
Optionally request a language pack optimized for a specific domain, e.g. ‘finance’.
- enable_entities: bool = None¶
Indicates if inverse text normalization entity output is enabled.
- language: str = 'en'¶
ISO 639-1 language code, e.g. en.
- operating_point: str = None¶
Specifies which acoustic model to use.
- output_locale: str = None¶
RFC-5646 language code for transcript output, e.g. en-AU.
- punctuation_overrides: dict = None¶
Permitted punctuation marks for advanced punctuation.
- class speechmatics.models.AudioSettings(encoding: Optional[str] = None, sample_rate: int = 44100, chunk_size: int = 4096)[source]¶
Real-time: Defines audio parameters.
- chunk_size: int = 4096¶
Size of each audio chunk sent to the server.
- encoding: str = None¶
Encoding format when raw audio is used. Allowed values are pcm_f32le, pcm_s16le and mulaw.
- sample_rate: int = 44100¶
Sampling rate in hertz.
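The defaults above can be mirrored with a small stdlib dataclass to show how they combine into an audio-format description. This is an illustrative sketch, not the library's own serialization: the validation rule for encoding is inferred from the allowed values listed above, and the "raw"/"file" type distinction is an assumption for illustration.

```python
# Sketch mirroring AudioSettings with a stdlib dataclass.
# The encoding check and the "raw"/"file" distinction are assumptions.
from dataclasses import dataclass
from typing import Optional

ALLOWED_ENCODINGS = {"pcm_f32le", "pcm_s16le", "mulaw"}

@dataclass
class AudioSettingsSketch:
    encoding: Optional[str] = None   # None means the server detects the format
    sample_rate: int = 44100
    chunk_size: int = 4096

    def as_audio_format(self) -> dict:
        if self.encoding is not None and self.encoding not in ALLOWED_ENCODINGS:
            raise ValueError(f"unsupported encoding: {self.encoding}")
        if self.encoding is None:
            return {"type": "file"}
        return {"type": "raw", "encoding": self.encoding,
                "sample_rate": self.sample_rate}
```

With no encoding set, only the container type is reported; with a raw encoding, the sample rate becomes meaningful and is included.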
- class speechmatics.models.BatchConnectionSettings(url: str, message_buffer_size: int = 512, ssl_context: ssl.SSLContext = <factory>, semaphore_timeout_seconds: float = 120, ping_timeout_seconds: float = 60, auth_token: Optional[str] = None, generate_temp_token: Optional[bool] = False)[source]¶
Batch mode: Defines connection parameters for batch jobs.
- class speechmatics.models.BatchLanguageIdentificationConfig(expected_languages: Optional[List[str]] = None)[source]¶
Batch mode: Language identification config.
- expected_languages: List[str] = None¶
Expected languages for language identification.
- class speechmatics.models.BatchSpeakerDiarizationConfig(speaker_sensitivity: Optional[float] = None)[source]¶
Batch mode: Speaker diarization config.
- speaker_sensitivity: float = None¶
The sensitivity of the speaker detection. This is a number between 0 and 1, where 0 means least sensitive and 1 means most sensitive.
- class speechmatics.models.BatchTranscriptionConfig(language=None, **kwargs)[source]¶
Batch: Defines transcription parameters for batch requests. The .as_config() method returns them wrapped in a Speechmatics JSON config.
- channel_diarization_labels: List[str] = None¶
Add your own speaker or channel labels to the transcript.
- fetch_data: speechmatics.models.FetchData = None¶
Optional configuration for fetching file for transcription.
- language_identification_config: speechmatics.models.BatchLanguageIdentificationConfig = None¶
Optional configuration for language identification.
- notification_config: speechmatics.models.NotificationConfig = None¶
Optional configuration for callback notification.
- sentiment_analysis_config: Optional[speechmatics.models.SentimentAnalysisConfig] = None¶
Optional configuration for sentiment analysis of the transcript.
- speaker_diarization_config: speechmatics.models.BatchSpeakerDiarizationConfig = None¶
Optional configuration for speaker diarization, including the sensitivity of speaker detection.
- srt_overrides: speechmatics.models.SRTOverrides = None¶
Optional configuration for SRT output.
- summarization_config: speechmatics.models.SummarizationConfig = None¶
Optional configuration for transcript summarization.
- topic_detection_config: Optional[speechmatics.models.TopicDetectionConfig] = None¶
Optional configuration for detecting topics of the transcript.
- translation_config: speechmatics.models.TranslationConfig = None¶
Optional configuration for translation.
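As a sketch of the kind of JSON document .as_config() produces, the parameters above can be assembled into a job config where None-valued fields are simply omitted. The exact wrapper shape ("type" plus a "transcription_config" key) is an assumption for illustration, as is the small set of fields shown:

```python
# Illustrative sketch of a batch job config; the wrapper keys and the
# omit-when-None convention are assumptions, not the library's own code.
import json

def as_config_sketch(language="en", diarization=None, translation_targets=None):
    transcription_config = {"language": language}
    if diarization is not None:
        transcription_config["diarization"] = diarization
    config = {"type": "transcription",
              "transcription_config": transcription_config}
    if translation_targets:
        config["translation_config"] = {"target_languages": translation_targets}
    return json.dumps(config)
```

Keeping unset fields out of the payload lets the server apply its own defaults instead of receiving explicit nulls.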
- class speechmatics.models.BatchTranslationConfig(target_languages: Optional[List[str]] = None)[source]¶
Batch mode: Translation config.
- class speechmatics.models.ClientMessageType(value)[source]¶
Real-time: Defines various messages sent from client to server.
- AddAudio = 'AddAudio'¶
Adds more audio data to the recognition job. The server confirms receipt by sending a ServerMessageType.AudioAdded message.
- EndOfStream = 'EndOfStream'¶
Indicates that the client has no more audio to send.
- SetRecognitionConfig = 'SetRecognitionConfig'¶
Allows the client to re-configure the recognition session.
- StartRecognition = 'StartRecognition'¶
Initiates a recognition job based on configuration set previously.
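The client messages above imply a simple session sequence: StartRecognition, a stream of AddAudio payloads, then EndOfStream. A minimal sketch of the first and last JSON messages might look like the following; the payload field names (audio_format, transcription_config, last_seq_no) are assumptions for illustration:

```python
# Sketch of the first and last client messages in a session.
# Payload field names are illustrative assumptions.
import json

def start_recognition_message(language="en", sample_rate=44100):
    return json.dumps({
        "message": "StartRecognition",
        "audio_format": {"type": "raw", "encoding": "pcm_s16le",
                         "sample_rate": sample_rate},
        "transcription_config": {"language": language},
    })

def end_of_stream_message(last_seq_no):
    # last_seq_no would count the AddAudio chunks already sent.
    return json.dumps({"message": "EndOfStream", "last_seq_no": last_seq_no})
```

Between these two, the raw audio bytes themselves would be sent as AddAudio binary frames rather than JSON.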
- class speechmatics.models.ConnectionSettings(url: str, message_buffer_size: int = 512, ssl_context: ssl.SSLContext = <factory>, semaphore_timeout_seconds: float = 120, ping_timeout_seconds: float = 60, auth_token: typing.Optional[str] = None, generate_temp_token: typing.Optional[bool] = False)[source]¶
Defines connection parameters.
- auth_token: Optional[str] = None¶
Auth token used to authenticate the customer.
- generate_temp_token: Optional[bool] = False¶
Automatically generate a temporary token for authentication. Non-enterprise customers must set this to True. Enterprise customers should set this to False.
- message_buffer_size: int = 512¶
Message buffer size in bytes.
- ping_timeout_seconds: float = 60¶
Ping-pong timeout in seconds.
- semaphore_timeout_seconds: float = 120¶
Semaphore timeout in seconds.
- ssl_context: ssl.SSLContext¶
SSL context.
- url: str¶
Websocket server endpoint.
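The <factory> default for ssl_context can be produced with the standard library's ssl.create_default_context(). A sketch of assembling the documented defaults around it, using a plain dict rather than the library's dataclass:

```python
# Sketch of building connection settings with the documented defaults.
# The dict layout is illustrative; only ssl.create_default_context() is
# a real stdlib call.
import ssl

def default_connection_settings(url, auth_token=None):
    return {
        "url": url,                       # websocket server endpoint
        "message_buffer_size": 512,       # bytes
        "ssl_context": ssl.create_default_context(),
        "semaphore_timeout_seconds": 120.0,
        "ping_timeout_seconds": 60.0,
        "auth_token": auth_token,
        "generate_temp_token": False,     # True for non-enterprise customers
    }
```

create_default_context() enables certificate verification and hostname checking, which is the sensible default for a wss:// endpoint.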
- class speechmatics.models.FetchData(url: str, auth_headers: Optional[str] = None)[source]¶
Batch: Optional configuration for fetching file for transcription.
- auth_headers: str = None¶
A list of additional headers to be added to the input fetch request when using http or https. This is intended to support authentication or authorization, for example by supplying an OAuth2 bearer token.
- url: str¶
URL to fetch.
- class speechmatics.models.NotificationConfig(url: str, contents: Optional[List[str]] = None, method: str = 'post', auth_headers: Optional[List[str]] = None)[source]¶
Batch: Optional configuration for callback notification.
- auth_headers: List[str] = None¶
A list of additional headers to be added to the notification request when using http or https. This is intended to support authentication or authorization, for example by supplying an OAuth2 bearer token.
- contents: List[str] = None¶
Specifies a list of items to be attached to the notification message. When multiple items are requested, they are included as named file attachments.
- method: str = 'post'¶
The HTTP(S) method to be used. Only post and put are supported.
- url: str¶
URL for notification. The id and status query parameters will be added.
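Since the id and status query parameters are appended to the callback URL, the resulting request target can be sketched with stdlib URL handling. The helper name and parameter names below are illustrative, not part of the library:

```python
# Sketch of the callback URL the server would hit for a NotificationConfig,
# appending the documented id and status query parameters.
from urllib.parse import urlencode, urlsplit, urlunsplit

def notification_url(base_url, job_id, status):
    parts = urlsplit(base_url)
    extra = urlencode({"id": job_id, "status": status})
    query = f"{parts.query}&{extra}" if parts.query else extra
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       query, parts.fragment))
```

Any query parameters already present on the configured URL are preserved, with id and status appended after them.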
- class speechmatics.models.RTConnectionSettings(url: str, message_buffer_size: int = 512, ssl_context: ssl.SSLContext = <factory>, semaphore_timeout_seconds: float = 120, ping_timeout_seconds: float = 60, auth_token: Optional[str] = None, generate_temp_token: Optional[bool] = False)[source]¶
Real-time mode: Defines connection parameters for real-time sessions.
- class speechmatics.models.RTSpeakerDiarizationConfig(max_speakers: Optional[int] = None)[source]¶
Real-time mode: Speaker diarization config.
- max_speakers: int = None¶
This enforces the maximum number of speakers allowed in a single audio stream.
- class speechmatics.models.RTTranslationConfig(target_languages: Optional[List[str]] = None, enable_partials: bool = False)[source]¶
Real-time mode: Translation config.
- enable_partials: bool = False¶
Indicates if partial translation, where sentences are produced immediately, is enabled.
- class speechmatics.models.SRTOverrides(max_line_length: int = 37, max_lines: int = 2)[source]¶
Batch: Optional configuration for SRT output.
- max_line_length: int = 37¶
Maximum number of characters per subtitle line, including whitespace.
- max_lines: int = 2¶
Maximum number of lines in a subtitle section.
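The effect of these two overrides on a subtitle block can be sketched with stdlib text wrapping; this is an illustration of the constraints, not the service's actual line-breaking algorithm:

```python
# Sketch of how the SRT overrides constrain one subtitle block:
# wrap to max_line_length characters, keep at most max_lines lines.
import textwrap

def format_subtitle(text, max_line_length=37, max_lines=2):
    lines = textwrap.wrap(text, width=max_line_length)
    return "\n".join(lines[:max_lines])
```

With the defaults, a long sentence is wrapped at 37 characters and truncated to two lines per subtitle section.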
- class speechmatics.models.ServerMessageType(value)[source]¶
Real-time: Defines various message types sent from server to client.
- AddPartialTranscript = 'AddPartialTranscript'¶
Indicates a partial transcript, which is an incomplete transcript that is immediately produced and may change as more context becomes available.
- AddPartialTranslation = 'AddPartialTranslation'¶
Indicates a partial translation, which is an incomplete translation that is immediately produced and may change as more context becomes available.
- AddTranscript = 'AddTranscript'¶
Indicates the final transcript of a part of the audio.
- AddTranslation = 'AddTranslation'¶
Indicates the final translation of a part of the audio.
- AudioAdded = 'AudioAdded'¶
Server response to ClientMessageType.AddAudio, indicating that audio has been added successfully.
- EndOfTranscript = 'EndOfTranscript'¶
Server response to ClientMessageType.EndOfStream, sent after the server has finished sending all AddTranscript messages.
- Error = 'Error'¶
Indicates a generic error message.
- Info = 'Info'¶
Indicates a generic info message.
- RecognitionStarted = 'RecognitionStarted'¶
Server response to ClientMessageType.StartRecognition, acknowledging that a recognition session has started.
- Warning = 'Warning'¶
Indicates a generic warning message.
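A client typically dispatches on the "message" field of each incoming JSON payload, routing to one handler per server message type. A minimal sketch, in which the handler wiring and the metadata.transcript field are assumptions for illustration:

```python
# Sketch of dispatching incoming server messages by their "message" field.
# The metadata/transcript payload shape is an illustrative assumption.
import json

def dispatch(raw, handlers):
    """Route one raw server message to the handler for its type, if any."""
    msg = json.loads(raw)
    handler = handlers.get(msg.get("message"))
    return handler(msg) if handler is not None else None

def on_error(msg):
    # Error messages end the session, so surface them as exceptions.
    raise RuntimeError(msg.get("reason", "unknown error"))

transcripts = []
handlers = {
    "AddTranscript": lambda m: transcripts.append(
        m.get("metadata", {}).get("transcript", "")),
    "EndOfTranscript": lambda m: None,
    "Error": on_error,
}
```

Unhandled types (Info, Warning, partials when not needed) simply fall through, which keeps the client tolerant of new message types.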
- class speechmatics.models.SummarizationConfig(content_type: Literal['informative', 'conversational', 'auto'] = 'auto', summary_length: Literal['brief', 'detailed'] = 'brief', summary_type: Literal['paragraphs', 'bullets'] = 'bullets')[source]¶
Defines summarization parameters.
- content_type: Literal['informative', 'conversational', 'auto'] = 'auto'¶
Optional summarization content_type parameter.
- summary_length: Literal['brief', 'detailed'] = 'brief'¶
Optional summarization summary_length parameter.
- summary_type: Literal['paragraphs', 'bullets'] = 'bullets'¶
Optional summarization summary_type parameter.
- class speechmatics.models.TopicDetectionConfig(topics: Optional[List[str]] = None)[source]¶
Defines topic detection parameters.
- topics: List[str] = None¶
Optional list of topics for topic detection.
- class speechmatics.models.TranscriptionConfig(language=None, **kwargs)[source]¶
Real-time: Defines transcription parameters. The .as_config() method removes translation_config and returns the rest wrapped in a Speechmatics JSON config.
- ctrl: dict = None¶
Internal Speechmatics flag that allows special commands to be passed to the engine.
- enable_partials: bool = None¶
Indicates if partials are enabled for both transcripts and translations, where words are produced immediately.
- enable_transcription_partials: bool = None¶
Indicates if partial transcripts, where words are produced immediately, are enabled.
- enable_translation_partials: bool = None¶
Indicates if partial translations, where words are produced immediately, are enabled.
- max_delay: float = None¶
Maximum acceptable delay.
- max_delay_mode: str = None¶
Determines whether the threshold specified in max_delay can be exceeded when a potential entity is detected. Flexible means that if a potential entity is detected, max_delay can be overridden until the end of that entity. Fixed means that max_delay is enforced regardless, ignoring any potential entity that would not be completed within that threshold.
- speaker_change_sensitivity: float = None¶
Sensitivity level for speaker change.
- speaker_diarization_config: speechmatics.models.RTSpeakerDiarizationConfig = None¶
Configuration for speaker diarization.
- streaming_mode: bool = None¶
Indicates whether the engine runs in streaming mode or regular real-time mode.
- translation_config: speechmatics.models.TranslationConfig = None¶
Optional configuration for translation.
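The real-time fields above can be combined into a transcription_config payload the same way as in batch mode, omitting anything left unset. A sketch under that assumption, with a hypothetical helper covering a few of the documented fields:

```python
# Sketch of a real-time transcription_config built from the fields above.
# The omit-when-None rule and nested diarization shape are assumptions.
def rt_transcription_config(language="en", max_delay=None, max_delay_mode=None,
                            enable_partials=None, max_speakers=None):
    config = {"language": language}
    if max_delay is not None:
        config["max_delay"] = max_delay
    if max_delay_mode is not None:
        config["max_delay_mode"] = max_delay_mode  # "flexible" or "fixed"
    if enable_partials is not None:
        config["enable_partials"] = enable_partials
    if max_speakers is not None:
        # mirrors RTSpeakerDiarizationConfig nesting
        config["speaker_diarization_config"] = {"max_speakers": max_speakers}
    return config
```

This would be the dict sent inside a StartRecognition message (or a later SetRecognitionConfig) for a real-time session.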