Subtitle group
Pgs
dataclass
¶
An instance of Pgs represents a PGS file mapped to Python objects. The Pgs.items property contains every PgsSubtitleItem found in the specified PGS file.
Parameters¶
tmp_location: str Location of a previously extracted .sub PGS file.
temp_folder: str Only necessary for debugging. Directory in which to dump the metadata. Defaults to "tmp".
items
property
¶
Returns the PgsSubtitleItems, each of which holds the image metadata as a PgsImage converted from raw bytes.
Returns¶
list List containing PgsSubtitleItems which will eventually contain the text extracted from the PGS image.
__decode(data)
¶
Decodes the PGS file provided as raw bytes and groups the contained DisplaySets into unique PgsSubtitleItems.
Returns¶
list List containing PgsSubtitleItems which will eventually contain the text extracted from the PGS image.
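The grouping step can be illustrated with a minimal, standalone sketch. It is not this class's implementation: it assumes only the commonly documented PGS segment layout (a 13-byte header made of the magic bytes "PG", a 32-bit PTS, a 32-bit DTS, a 1-byte segment type and a 16-bit payload size) and closes a display set on every END segment (type 0x80). The names Segment and split_display_sets are hypothetical.

```python
import struct
from dataclasses import dataclass

# Segment type codes used by the PGS format.
PDS, ODS, PCS, WDS, END = 0x14, 0x15, 0x16, 0x17, 0x80


@dataclass
class Segment:
    """Hypothetical container for one parsed segment."""
    pts_ms: float   # presentation timestamp in milliseconds
    seg_type: int   # one of the codes above
    payload: bytes  # raw segment body


def split_display_sets(data: bytes) -> list[list[Segment]]:
    """Split raw PGS bytes into display sets, each closed by an END segment."""
    display_sets, current, offset = [], [], 0
    while offset + 13 <= len(data):
        magic, pts, _dts, seg_type, size = struct.unpack_from(">2sIIBH", data, offset)
        if magic != b"PG":
            raise ValueError(f"bad magic at offset {offset}")
        body = data[offset + 13 : offset + 13 + size]
        # The PTS counts ticks of a 90 kHz clock, so 90 ticks are one millisecond.
        current.append(Segment(pts / 90, seg_type, body))
        offset += 13 + size
        if seg_type == END:
            display_sets.append(current)
            current = []
    return display_sets
```

Turning the resulting display sets into PgsSubtitleItems (palette lookup, image decoding) is deliberately left out of the sketch.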
dump_display_sets(display_sets, path='')
¶
Dumps the DisplaySets contained in the PGS file as .txt and .json files.
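A debugging dump of this kind might look like the sketch below. It is an assumption-laden illustration, not the method itself: the output file names display_sets.txt and display_sets.json are hypothetical, and the display sets are assumed to be plain, mostly JSON-friendly objects.

```python
import json
from pathlib import Path


def dump_display_sets(display_sets, path=""):
    """Write display_sets to display_sets.txt and display_sets.json under path."""
    out_dir = Path(path or ".")
    out_dir.mkdir(parents=True, exist_ok=True)
    txt_file = out_dir / "display_sets.txt"    # hypothetical file name
    json_file = out_dir / "display_sets.json"  # hypothetical file name
    # One repr per line keeps the text dump easy to diff.
    txt_file.write_text("\n".join(repr(ds) for ds in display_sets), encoding="utf-8")
    # default=str lets values JSON cannot encode (e.g. bytes) fall back to text.
    json_file.write_text(json.dumps(display_sets, indent=2, default=str), encoding="utf-8")
    return txt_file, json_file
```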
SubtitleGroup
dataclass
¶
Defines an instance of a SubtitleGroup. A SubtitleGroup wraps N DisplaySets, usually beginning with a START segment, followed by segments that define more objects to display, and closing with an END segment.
Within a group, subtitles can overlap because PGS supports two windows, each displaying one image at a time. The current image stays on screen until the window is updated with another image.
Parameters¶
members: list List of DisplaySets the SubtitleGroup wraps around.
__find_global_palettes(members)
¶
Grabs all Palettes defined at EPOCH_START, at ACQUISITION_POINT, or at intermediate points with varying IDs.
Returns¶
list Contains all Palettes found in the global Palette definition at ACQUISITION_POINT or intermediate with varying IDs.
__find_overlap(members)
¶
Checks whether the current set of DisplaySets between an EPOCH_START or ACQUISITION_POINT and an EndSegment has overlapping Windows. In PGS, at most two Windows can overlap at a time.
Returns¶
bool Returns True as soon as any overlap is found.
__find_overlapping(reset_positions, redef_positions, members)
¶
Finds the DisplaySets that actually overlap, starting from each REDEF segment position and continuing until the immediately following RESET segment is reached.
Returns¶
dict Contains all overlapping DisplaySets, grouped by the REDEF segment's position.
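The slicing described above can be sketched as follows. This is a simplified illustration rather than the method's actual code, and two choices are assumptions: the RESET member itself is excluded from each slice, and a REDEF with no later RESET runs to the end of the group.

```python
from bisect import bisect_right


def find_overlapping(reset_positions, redef_positions, members):
    """Map each REDEF index to the members up to (not including) the next RESET."""
    resets = sorted(reset_positions)
    overlapping = {}
    for redef in redef_positions:
        # First RESET strictly after this REDEF; fall back to the end of the group.
        i = bisect_right(resets, redef)
        stop = resets[i] if i < len(resets) else len(members)
        overlapping[redef] = members[redef:stop]
    return overlapping
```

Keying the result by the REDEF index mirrors the documented return value, a dict grouped by the REDEF segment's position.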
__find_redefinition_positions(members, reset_pos)
¶
In PGS, REDEF segments usually define a new set of Palettes, Windows and CompositionObjects, as well as the number of Windows currently active. They usually follow RESET segments, because they define the new content, and its position, that is about to come.
This method finds these positions.
A START segment is also a valid REDEF segment.
Returns¶
list Contains indices of REDEF segments.
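Read literally, the description suggests a very small search, sketched below under stated assumptions: the START segment is taken to sit at index 0 of members, and every member directly after a RESET is treated as a REDEF. The real method may inspect the segments themselves; this sketch only uses positions.

```python
def find_redefinition_positions(members, reset_pos):
    """Return index 0 (assumed START) plus the index right after every RESET."""
    positions = [0] if members else []
    for reset in sorted(reset_pos):
        following = reset + 1
        # Skip a RESET that is the last member, and avoid duplicate entries.
        if following < len(members) and following not in positions:
            positions.append(following)
    return positions
```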
__find_reset_positions(members)
¶
In PGS files, END segments are usually 11 bytes in size, contain no objects, and are placed at the end of the group. However, if elements overlap, an additional intermediate RESET segment is inserted. It reduces the number of objects and marks the position at which a Palette update can happen and at which either new elements can overlap or the overlap ends.
PGS subtitles can show only 2 objects on screen at once.
This method finds these positions.
RESET segments appear to always have a size of 19 bytes and to reduce the number of objects, which is then always 1.
Returns¶
list Contains the indices of the RESET segments.
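The heuristic above (19 bytes, object count dropping to 1) can be sketched as a simple scan. CompositionInfo is a hypothetical stand-in for whatever fields the real DisplaySet exposes, and requiring the previous member to list more than one object is one possible reading of "dropping".

```python
from dataclasses import dataclass


@dataclass
class CompositionInfo:
    """Hypothetical stand-in for the fields read from a DisplaySet."""
    size: int         # composition segment size in bytes
    num_objects: int  # number of objects the segment lists


def find_reset_positions(members):
    """Return indices of 19-byte, single-object members that follow a fuller one."""
    positions = []
    for i in range(1, len(members)):
        prev, cur = members[i - 1], members[i]
        if cur.size == 19 and cur.num_objects == 1 and prev.num_objects > 1:
            positions.append(i)
    return positions
```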
__fix_endpoints(fixables, reset_statements, end)
¶
Reprocesses the dictionary containing the TimelineItems displayed in either the Top or Bottom window. Since END and RESET segments do not define images, they are not correlated with a specific TimelineItem.
However, they define the true end timestamp of the preceding TimelineItem, so that item's end needs to be extended to match the display timestamp of the END / RESET segment.
Returns¶
dict Dictionary containing TimelineItems displayed in either the Top or Bottom window.
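One way to picture the endpoint fix is the sketch below. It is an illustration under assumptions, not the method's code: Item is a hypothetical stand-in for TimelineItem, timestamps are plain milliseconds instead of SubRipTime, and each item is extended to the first RESET or END timestamp at or after its current end.

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass
class Item:
    """Hypothetical stand-in for a TimelineItem; times are milliseconds."""
    start: int
    end: int


def fix_endpoints(fixables, reset_times, end_time):
    """Extend each item's end to the first RESET/END timestamp at or after it."""
    boundaries = sorted([*reset_times, end_time])
    for items in fixables.values():
        for item in items:
            i = bisect_left(boundaries, item.end)
            if i < len(boundaries):
                item.end = boundaries[i]
    return fixables
```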
__gen_pgs_subtitle_items(timelines)
¶
Generates PgsSubtitleItems, which hold metadata on the image as a PgsImage converted from raw bytes, together with the matching Palette.
Returns¶
list List containing PgsSubtitleItems which will eventually contain the text extracted from the PGS image.
__gen_timelines(members, global_palettes)
¶
Generates timelines. Timelines consist of TimelineItems and describe the changes in either the Top or Bottom window of a PGS file. Items are grouped as one if they display the same image at the same position and are treated as new items if a new image is defined.
Returns¶
dict Dictionary containing TimelineItems displayed in either the Top or Bottom window.
__process_timeline_item(new_timeline, timelines, ds, global_palettes)
¶
TimelineItems extracted from PGS subtitles carry no link to the items that come before or after them.
Processes each item and extracts the ID of the Window it is displayed in. If a prior item already exists in the Timelines dict, checks by ID whether the two are the same item.
If it is a new item, it is simply added to the Timelines dict; otherwise the prior item's data is updated with the current item's data where required.
Returns¶
dict The Timelines dict after the new item has been processed.
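The add-or-update decision can be sketched in a few lines. This is a reduced illustration with a shorter signature than the documented one (no ds or global_palettes), and Item with its object_id field is a hypothetical stand-in for TimelineItem; updating only the end time is likewise an assumption about what "where required" covers.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """Hypothetical stand-in for a TimelineItem; times are milliseconds."""
    object_id: int
    window_id: int
    start: int
    end: int


def process_timeline_item(new_item, timelines):
    """Merge new_item into the last entry of its window, or append it."""
    window = timelines.setdefault(new_item.window_id, [])
    if window and window[-1].object_id == new_item.object_id:
        # Same object as before: only extend how long it stays on screen.
        window[-1].end = max(window[-1].end, new_item.end)
    else:
        window.append(new_item)
    return timelines
```

Keeping one list per window ID means the Top and Bottom timelines never interfere with each other.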
TimelineItem
dataclass
¶
An instance of TimelineItem describes an object being displayed in either the Top or Bottom timeline within a PGS file.
A TimelineItem is effectively the text block being displayed on screen for a set duration.
Parameters¶
start: SubRipTime When the item starts being displayed.
ds: DisplaySet DisplaySet associated with this item.
end: SubRipTime When the item stops being displayed.
window_id: int The Window the item is being displayed in within a PGS file.
duration
property
¶
Provides the duration for which a given TimelineItem is displayed.
Returns¶
SubRipTime Duration for which a given TimelineItem is displayed.
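Conceptually the duration is just the end time minus the start time. The sketch below shows that with the standard library's timedelta standing in for SubRipTime; Slot is a hypothetical class, not the real TimelineItem.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class Slot:
    """Hypothetical stand-in for a TimelineItem using timedelta timestamps."""
    start: timedelta
    end: timedelta

    @property
    def duration(self) -> timedelta:
        # How long the item stays on screen.
        return self.end - self.start
```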
lang_estimate
property
¶
Contains a list of languages and their probabilities matching the text within a PgsSubtitleItem.
Returns¶
list Language estimation of the text.
text
property
¶
Returns text displayed within this timeline slot in a PGS file.
Returns¶
str Text displayed within this timeline slot in a PGS file.
gen_pgs_subtitle_item()
¶
Generates the PgsSubtitleItem described by this TimelineItem's entry. It contains the image and, later, the text and the language estimation of that text.
Returns¶
PgsSubtitleItem The PgsSubtitleItem which is displayed within this timeline slot.
set_text(text)
¶
Sets the text displayed within this timeline slot in a PGS file.