Ticket #172 (closed defect: fixed)

Opened 3 years ago

Last modified 3 years ago

SRT Subtitles too small on AppleTV + Wrong character set used for non-English text

Reported by: cynick Assigned to: astrange
Priority: normal Milestone: 1.1
Component: Subtitles Version: 1.0b2
Severity: normal Keywords:
Cc:

Description

While playing a movie with the attached subtitles, Russian characters are used for non-English letters instead of Czech/Slovak ones. When played in MovieTime with the "Central European (Windows Latin 2)" encoding, the characters were correct.

The second issue is the font size of the subtitles on AppleTV (connected to a 32" TV): it is too small compared to the font size on an iMac 24" (played using QuickTime Player).

Attachments

subtitles.srt (40.2 kB) - added by cynick on 04/28/07 11:25:20.
lnraphc.srt (106.7 kB) - added by anonymous on 05/03/07 12:12:38.
this file is not displayed with perian beta2
csdet (13.1 kB) - added by astrange on 05/08/07 23:16:07.
Experimental winlatin1/winlatin2 detector
csdet.c (2.2 kB) - added by astrange on 05/08/07 23:18:01.
actual source
subs-win-1250.srt (48.9 kB) - added by cynick on 05/14/07 10:25:29.

Change History

04/28/07 11:25:20 changed by cynick

  • attachment subtitles.srt added.

04/28/07 14:30:37 changed by nemeseri

The Hungarian subtitles (iso-8859-2, Latin 2) are still not displayed correctly. The special "ő" and "ű" characters are displayed as "õ" and "û" (iso-8859-1, Latin 1).
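
The symptom above is what one would expect if ISO-8859-2 bytes were decoded as ISO-8859-1, since the two encodings assign different letters to the same byte values. A minimal Python sketch of that assumption (the byte values 0xF5 and 0xFB come from the published code tables, not from any file attached to this ticket):

```python
# Bytes 0xF5 and 0xFB map to different letters in Latin-2 and Latin-1.
raw = bytes([0xF5, 0xFB])

as_latin2 = raw.decode("iso-8859-2")  # the intended reading: "ő" and "ű"
as_latin1 = raw.decode("iso-8859-1")  # a Latin-1 guess yields "õ" and "û"

print(ascii(as_latin2), ascii(as_latin1))
```

Both decodes succeed, so a wrong guess between these two encodings cannot be caught by validity checks alone; it needs some statistical or language-based detection.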

04/28/07 16:10:07 changed by astrange

SBCS Group Prober --------begin status
SBCS: 0.140 [windows-1251]
SBCS: 0.000 [KOI8-R]
SBCS: 0.000 [ISO-8859-5]
SBCS: 0.108 [x-mac-cyrillic]
SBCS: 0.000 [IBM866]
SBCS: 0.000 [IBM855]
SBCS: 0.006 [ISO-8859-7]
SBCS: 0.006 [windows-1253]
SBCS: 0.000 [ISO-8859-5]
SBCS: 0.074 [windows-1251]
HEB: 144 - 0 [Logical-Visual score]
inactive: [windows-1255] (i.e. confidence is too low).
inactive: [windows-1255] (i.e. confidence is too low).
SBCS Group found best match [windows-1251] confidence 0.139692.
Latin1Prober: 0.010 [windows-1252]

I think I'll probably file it as a bug at Mozilla.

04/29/07 11:56:28 changed by cynick

Font size problem solved after copying /System/Library/Fonts/Helv* files to the same location on ATV

04/29/07 17:13:55 changed by astrange

Is this file valid?

"281

00:20:20,780 --> 00:20:22,740

a se nezhroutím. "

There's an illegal control character after the 'a'.

iconv: subtitles.srt:1252:1: cannot convert

04/30/07 06:37:01 changed by cynick

The file should be OK.
Enca output:
$> enca -L cs subtitles.srt
MS-Windows code page 1250

Mixed line terminators

The .srt snippet from above (with the correct encoding) should look like:
281
00:20:20,780 --> 00:20:22,740
ať se nezhroutím.
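
One possible explanation for the "illegal control character", assuming the file really is windows-1250 as enca reports: in that code page "ť" is the byte 0x9D, which is a C1 control character in ISO-8859-1 and is unassigned in windows-1252. A short Python sketch (the byte string is rebuilt from the corrected text above, not read from the attachment):

```python
import unicodedata

# Encode the corrected line as windows-1250, the encoding enca reported.
line = "ať se nezhroutím.".encode("cp1250")
second_byte = line[1]  # the byte right after the 'a'

# Read as ISO-8859-1, that byte becomes a C1 control character.
as_latin1 = line.decode("iso-8859-1")
category = unicodedata.category(as_latin1[1])

# Read as windows-1252, the byte is unassigned and decoding fails.
try:
    line.decode("cp1252")
    cp1252_ok = True
except UnicodeDecodeError:
    cp1252_ok = False

print(hex(second_byte), category, cp1252_ok)
```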

05/02/07 09:42:37 changed by tick

  • milestone set to 1.1.

The font size should be fixed in b3. Moving this to 1.1 for the character set issue.

05/02/07 23:13:57 changed by astrange

Oh, well, the Mozilla detector doesn't have a 1250 detector at all... I guess I need to find a larger text corpus.

05/03/07 12:12:38 changed by anonymous

  • attachment lnraphc.srt added.

this file is not displayed with perian beta2

05/05/07 13:52:51 changed by astrange

(In [495]) Hack to possibly recognize some Latin-2 subtitles better. Refs #172

05/05/07 14:41:56 changed by astrange

05/06/07 01:19:08 changed by dconrad

(In [497]) Use double newline as the separator between subtitle lines in .srt rather than looking for the next increasing number. Refs #172
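
A rough sketch of the blank-line approach, written in Python rather than taken from Perian's code. `split_srt_blocks` is a hypothetical helper, and normalizing line endings first is an assumption prompted by the "Mixed line terminators" note from enca earlier in this ticket. The sample includes a cue whose text is just a number, as an illustration of a case where scanning for the next index could misfire.

```python
import re

def split_srt_blocks(text):
    """Split SRT text into cue blocks separated by one or more blank lines."""
    # Normalize CRLF and bare CR so mixed line terminators do not hide blank lines.
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    # A separator is a newline followed by at least one empty or whitespace-only line.
    blocks = re.split(r"\n(?:[ \t]*\n)+", normalized.strip())
    return [block.split("\n") for block in blocks if block.strip()]

# Hypothetical sample: CRLF in the first cue, LF in the second.
sample = (
    "1\r\n00:00:01,000 --> 00:00:02,000\r\n2\r\n\r\n"
    "2\n00:00:03,000 --> 00:00:04,000\nsecond cue\n"
)
blocks = split_srt_blocks(sample)
for block in blocks:
    print(block)
```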

05/08/07 23:16:07 changed by astrange

  • attachment csdet added.

Experimental winlatin1/winlatin2 detector

05/08/07 23:17:25 changed by astrange

  • owner set to astrange.

Can you test that against some subtitle files and see if it works? If something isn't detected, post it here.

I used French and German for latin1, so other languages might not work well.

05/08/07 23:18:01 changed by astrange

  • attachment csdet.c added.

actual source

05/14/07 10:25:02 changed by cynick

Hi
I tested csdet.c with about 250 win-1250 encoded subtitle files; all but one were identified correctly (see the attached subs-win-1250.srt).

$ enca -L cs subs-win-1250.srt
MS-Windows code page 1250

Mixed line terminators

$ ./csdet subs-win-1250.srt
subs-win-1250.srt : latin-1 @ -1921446

05/14/07 10:25:29 changed by cynick

  • attachment subs-win-1250.srt added.

05/14/07 20:49:42 changed by rickvug

I can confirm that the exact file that is needed is "Helvetica.otf".

05/29/07 10:22:39 changed by anonymous

I can't find Helvetica.otf on my Mac, only Helvetica.dfont.

07/04/07 11:41:06 changed by loyalty_anchored

I tested copying /System/Library/Fonts/Helvetica.dfont from my Intel Mac to the same path on my AppleTV, and the subtitles are now displayed at a reasonable size.

07/06/07 01:34:23 changed by astrange

  • status changed from new to closed.
  • resolution set to fixed.

This bug is too general; I'm closing it. I'll work on the failing sample, though.

09/06/07 14:23:08 changed by astrange

(In [699]) If we guess one of latin1/2 for a file and it's invalid, try the other one. Refs #172.
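
The fallback described in [699] could look roughly like this in Python. It is only a sketch: `decode_with_fallback` is a hypothetical name, and windows-1252/windows-1250 stand in for "latin1/2" on the assumption that the files use the Windows code pages, as the samples in this ticket do.

```python
def decode_with_fallback(data, guess):
    """Decode with the guessed code page; if that is invalid, try the other one."""
    other = {"cp1252": "cp1250", "cp1250": "cp1252"}[guess]
    try:
        return data.decode(guess), guess
    except UnicodeDecodeError:
        # The guess hit a byte that is unassigned in that code page.
        # If the other code page is invalid too, this second decode raises.
        return data.decode(other), other

# 0x9D is unassigned in windows-1252 but is "ť" in windows-1250.
text, used = decode_with_fallback(b"a\x9d se nezhrout\xedm.", "cp1252")
print(ascii(text), used)
```

This only helps when the wrong guess produces an invalid byte; most bytes are assigned in both code pages, so the initial guess still carries most of the weight.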