Ticket #172 (closed defect: fixed)

Opened 3 years ago

Last modified 4 months ago

SRT subtitles too small on AppleTV + wrong character set used for non-English text

Reported by: cynick
Owned by: astrange
Priority: normal
Milestone: 1.1
Component: Subtitles
Version: 1.0b2
Severity: normal
Keywords:
Cc:

Description

While playing a movie with the attached subtitles, Russian characters are shown in place of the non-English Czech/Slovak letters. When the movie was played in MovieTime with the "Central European (Windows Latin 2)" encoding, the characters were correct.

The second issue is the font size of the subtitles on AppleTV (connected to a 32" TV): it is too small compared to the font size on a 24" iMac (played using QuickTime Player).

Attachments

subtitles.srt (40.2 KB) - added by cynick 3 years ago.
lnraphc.srt (106.7 KB) - added by anonymous 3 years ago.
this file is not displayed with perian beta2
csdet (13.1 KB) - added by astrange 3 years ago.
Experimental winlatin1/winlatin2 detector
csdet.c (2.2 KB) - added by astrange 3 years ago.
actual source
subs-win-1250.srt (48.9 KB) - added by cynick 3 years ago.

Change History

Changed 3 years ago by cynick

Changed 3 years ago by nemeseri

Hungarian subtitles (ISO-8859-2, Latin 2) are still not displayed correctly. The special "ő" and "ű" characters are displayed as "õ" and "û" (ISO-8859-1, Latin 1).
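The substitution described above is what one would expect when Latin-2 bytes are decoded as Latin-1. A minimal Python sketch of that effect follows; it only illustrates the two code pages and is not taken from Perian's code.

```python
# Bytes 0xF5 and 0xFB, as they would appear in an ISO-8859-2 encoded file.
raw = bytes([0xF5, 0xFB])

# Intended interpretation: ISO-8859-2 (Latin 2).
as_latin2 = raw.decode("iso8859_2")

# What a wrong Latin-1 guess produces from the same bytes.
as_latin1 = raw.decode("iso8859_1")

print(as_latin2, as_latin1)
```

The same two bytes give "őű" under Latin 2 and "õû" under Latin 1, so the file itself is intact and only the encoding guess is wrong.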

Changed 3 years ago by astrange

SBCS Group Prober --------begin status

SBCS: 0.140 [windows-1251]
SBCS: 0.000 [KOI8-R]
SBCS: 0.000 [ISO-8859-5]
SBCS: 0.108 [x-mac-cyrillic]
SBCS: 0.000 [IBM866]
SBCS: 0.000 [IBM855]
SBCS: 0.006 [ISO-8859-7]
SBCS: 0.006 [windows-1253]
SBCS: 0.000 [ISO-8859-5]
SBCS: 0.074 [windows-1251]
HEB: 144 - 0 [Logical-Visual score]
inactive: [windows-1255] (i.e. confidence is too low).
inactive: [windows-1255] (i.e. confidence is too low).

SBCS Group found best match [windows-1251] confidence 0.139692.
Latin1Prober: 0.010 [windows-1252]

I think I'll probably file it as a bug at Mozilla.

Changed 3 years ago by cynick

The font size problem was solved after copying the /System/Library/Fonts/Helv* files to the same location on the AppleTV.

Changed 3 years ago by astrange

Is this file valid?

"281

00:20:20,780 --> 00:20:22,740

a se nezhroutím. "

There's an illegal control character after the 'a'.

iconv: subtitles.srt:1252:1: cannot convert

Changed 3 years ago by cynick

The file should be OK.
Enca output:
$> enca -L cs subtitles.srt
MS-Windows code page 1250

Mixed line terminators

The .srt snippet from above (with the correct encoding) should look like:
281
00:20:20,780 --> 00:20:22,740
ať se nezhroutím.
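As a rough illustration of the exchange above: in windows-1250 the letter "ť" is the single byte 0x9D, which is a control character in Latin-1 and unassigned in Python's windows-1252 codec. The sketch below uses a hypothetical helper, try_decode, and only demonstrates the code pages; it does not reproduce what iconv or Perian actually did.

```python
# The corrected subtitle line, encoded the way enca reports the file (CP1250).
raw = "ať se nezhroutím.".encode("cp1250")


def try_decode(data, encoding):
    """Return the decoded text, or None if the bytes are invalid in `encoding`.

    Hypothetical helper for illustration only.
    """
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


print(hex(raw[1]))                      # the byte that stands for "ť"
print(repr(try_decode(raw, "latin-1")))  # control character after the "a"
print(try_decode(raw, "cp1252"))         # not convertible
print(try_decode(raw, "cp1250"))         # the intended text
```

Under these assumptions, a Latin-1 reading yields a control character right after the "a", and a windows-1252 reading fails outright, while windows-1250 recovers the intended line.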

Changed 3 years ago by tick

  • milestone set to 1.1

The font size should be fixed in b3. Moving this to 1.1 for the character set issue.

Changed 3 years ago by astrange

Oh, well, it doesn't have a 1250 detector at all... I guess I need to find a larger text corpus.

Changed 3 years ago by anonymous

this file is not displayed with perian beta2

Changed 3 years ago by astrange

(In [495]) Hack to possibly recognize some Latin-2 subtitles better. Refs #172

Changed 3 years ago by dconrad

(In [497]) Use double newline as the separator between subtitle lines in .srt rather than looking for the next increasing number. Refs #172
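The idea in [497] can be sketched in Python as follows. This is an illustrative approximation, not the actual changeset: split_srt_cues is a hypothetical name, and the line-ending normalization is an assumption prompted by the "Mixed line terminators" note from enca.

```python
import re


def split_srt_cues(text):
    """Split SRT text into cues on blank lines (hypothetical helper).

    Each cue is returned as a list of its lines: index, timing, then text.
    """
    # Normalize mixed line terminators before looking for blank lines.
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    # One or more blank (or whitespace-only) lines separate cues.
    blocks = re.split(r"\n\s*\n", normalized.strip())
    return [block.split("\n") for block in blocks if block.strip()]


sample = (
    "1\r\n00:00:01,000 --> 00:00:02,000\r\n2\r\n\r\n"
    "2\n00:00:03,000 --> 00:00:04,000\nHello\n"
)
cues = split_srt_cues(sample)
print(len(cues), cues[0][2])
```

In the sample, the text of the first cue is the bare number "2". A parser that searches for the next increasing number could mistake that line for the start of cue 2, whereas the blank-line separator keeps it inside cue 1.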

Changed 3 years ago by astrange

Experimental winlatin1/winlatin2 detector

Changed 3 years ago by astrange

  • owner set to astrange

Can you test that against some subtitle files and see if it works? If something isn't detected, post it here.

I used French and German for latin1, so other languages might not work well.

Changed 3 years ago by astrange

actual source

Changed 3 years ago by cynick

Hi,
I tested csdet.c with about 250 win-1250 encoded subtitle files; all but one were identified correctly (see the attached subs-win-1250.srt).

$ enca -L cs subs-win-1250.srt
MS-Windows code page 1250

Mixed line terminators

$ ./csdet subs-win-1250.srt
subs-win-1250.srt : latin-1 @ -1921446

Changed 3 years ago by cynick

Changed 3 years ago by rickvug

I can confirm that the exact file that is needed is "Helvetica.otf".

Changed 3 years ago by anonymous

I can't find Helvetica.otf on my Mac, only Helvetica.dfont.

Changed 3 years ago by loyalty_anchored

I tested moving /System/Library/Fonts/Helvetica.dfont from my Intel Mac to the same path on my AppleTV, and the subtitles are now displayed at a reasonable size.

Changed 3 years ago by astrange

  • status changed from new to closed
  • resolution set to fixed

This bug is too general; I'm closing it. I'll work on the failing sample, though.

Changed 3 years ago by astrange

(In [699]) If we guess one of latin1/2 for a file and it's invalid, try the other one. Refs #172.
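The fallback described in [699] might look roughly like this in Python. It is a sketch under assumptions: decode_with_fallback is a hypothetical name, and cp1252/cp1250 stand in for "latin1/2" because the changeset message does not say which exact code pages are tried.

```python
def decode_with_fallback(data, guess, other):
    """Decode with the guessed encoding; if the bytes are invalid, try the other.

    Hypothetical helper. Returns (text, encoding_used). If both encodings
    reject the data, the second UnicodeDecodeError propagates to the caller.
    """
    try:
        return data.decode(guess), guess
    except UnicodeDecodeError:
        return data.decode(other), other


# A Czech word containing byte 0x9D, wrongly guessed as windows-1252.
text, used = decode_with_fallback("ať".encode("cp1250"), "cp1252", "cp1250")
print(text, used)
```

This only helps when the wrong guess is actually invalid for the file. Bytes that are valid in both code pages still decode without error under the wrong guess, so the detector remains the primary mechanism.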
