[Windows] Line endings in subtitle files

hlhl
Posts: 7
Joined: 15 May 2016, 14:39

[Windows] Line endings in subtitle files

Post by hlhl »

Hello,

First of all, I just want to say what an amazing program FileBot is. I had been looking for this kind of program for a while.

Now to my small problem: I use the CLI version of the program only to download subtitles for series I download via qBittorrent.
The command I use for this is as follows:

Code:

filebot -script fn:suball /Users/mitch/Downloads --output srt -non-strict --def maxAgeDays=7 --format "Match Video"

It works great and also matches the video filename, which my TV needs.
However, when I use the subtitles, some words are merged together. This happens on both my TV and my PC, because the problem is in the SRT file itself.
I searched forums for a solution but haven't found anything yet. I also tried changing the format of the SRT file, but that made no difference.

For example:
When I download an SRT file via FileBot, I get this:

Code:


11
00:02:32,532 --> 00:02:34,749
I was with youat Hardhome.

12
00:02:36,035 --> 00:02:38,369
We sawwhat's out there.

But when I download directly from OpenSubtitles, I get this:

Code:

7
00:02:32,532 --> 00:02:34,749
I was with you
at Hardhome.

8
00:02:36,035 --> 00:02:38,369
We saw
what's out there.

I don't know why this happens, but it happens with every SRT file I download via FileBot. Can someone help me with this?

Thanks, Mitch

Extra information:

Code:

C:\Users\mitch>filebot -script fn:sysinfo
FileBot 4.7 (r3923)
JNA Native: 4.0.1
MediaInfo: MediaInfoLib - v0.7.78
7-Zip-JBinding: 9.20
Chromaprint: fpcalc version 1.1.0 (C:\Program Files\FileBot\fpcalc.exe)
Extended Attributes: OK
Groovy Engine: 2.4.6
JRE: Java(TM) SE Runtime Environment 1.8.0_91
JVM: 64-bit Java HotSpot(TM) 64-Bit Server VM
CPU/MEM: 4 Core / 1 GB Max Memory / 20 MB Used Memory
OS: Windows 10 (amd64)
Package: MSI
Data: C:\Users\mitch\AppData\Roaming\FileBot
Done ?(?????)?
rednoah
The Source
Posts: 23930
Joined: 16 Nov 2011, 08:59
Location: Taipei

Re: Different subtitle structure via script

Post by rednoah »

Try removing --output srt and see if that makes a difference. Maybe there's something wrong with transcoding subtitle formats.

An OpenSubtitles link (or subtitle id) for the subtitle in question would be helpful.
:idea: Please read the FAQ and How to Request Help.
hlhl
Posts: 7
Joined: 15 May 2016, 14:39

Re: Different subtitle structure via script

Post by hlhl »

Thank you for your quick reply.

I tried removing --output srt, but the result remained the same. I also tried PowerShell instead of CMD, but that made no difference either.
Secondly, I have this problem with every TV series or movie I download, so a single OpenSubtitles ID/link wouldn't matter, I guess?

One thing I did notice was a difference between opening the SRT file in WordPad and opening it in Notepad.

Code:

(Latest Game of Thrones episode)
In Notepad:
00:02:42,749 --> 00:02:46,372
When I heard you had escapedWinterfell, I feared the worst.

In Wordpad:
00:02:42,749 --> 00:02:46,372
When I heard you had escaped
Winterfell, I feared the worst.

So apparently there is a space (or "Enter") between "escaped" and "Winterfell" that Notepad and my TV do not recognize, but WordPad does.
rednoah
The Source
Posts: 23930
Joined: 16 Nov 2011, 08:59
Location: Taipei

Re: Different subtitle structure via script

Post by rednoah »

I'd start by looking at it with a hex editor to see what exact bytes / characters are there.
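
If no hex editor is at hand, a few lines of Python can show the raw bytes too. This is only a rough sketch: the helper name `hex_dump` and the sample bytes are made up for illustration and have nothing to do with FileBot itself.

```python
def hex_dump(data: bytes, width: int = 16) -> str:
    """Return a hex-editor style dump: offset, hex bytes, printable ASCII."""
    pad = width * 3 - 1  # width of the hex column when a row is full
    rows = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02X}" for b in chunk)
        # Non-printable bytes (such as 0x0A and 0x0D) are shown as "."
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        rows.append(f"{offset:08X}  {hex_part:<{pad}}  {text_part}")
    return "\n".join(rows)


# Hypothetical sample: a subtitle line break stored as a bare LF (0x0A).
sample = b"We saw\nwhat's out there.\n"
print(hex_dump(sample))
```

For a real subtitle file, the bytes could be read with `pathlib.Path(...).read_bytes()` and passed to the same helper. A `0D 0A` pair means CR+LF, while a lone `0A` means LF only.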
hlhl
Posts: 7
Joined: 15 May 2016, 14:39

Re: Different subtitle structure via script

Post by hlhl »

[Image: hex editor screenshot]

So apparently "0A" causes the problem and a normal space "20" does not.
I don't know why, because I have never worked with a hex editor before.
rednoah
The Source
Posts: 23930
Joined: 16 Nov 2011, 08:59
Location: Taipei

Re: Different subtitle structure via script

Post by rednoah »

0x0A is the NEWLINE or \n character:
http://unicode-table.com/en/000A/

0x20 is the SPACE character:
http://unicode-table.com/en/0020/

\n (LF) is called Unix-style line endings. On Windows it's traditionally \r\n (CR+LF) but any Windows program that doesn't accept both line ending styles must be pretty shitty. Use a different editor like Notepad++ and a different media player like VLC.

@see https://en.wikipedia.org/wiki/Newline
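
To check which style a given file uses, something like the following would do. It is a minimal sketch, assuming the file content is available as bytes; `count_line_endings` is a hypothetical helper name, and the sample data is invented.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CR+LF pairs, bare LF, and bare CR in raw file content."""
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n") - crlf  # LF not preceded by CR
    cr = data.count(b"\r") - crlf  # CR not followed by LF
    return {"CRLF": crlf, "LF": lf, "CR": cr}


# Hypothetical samples: the same cue with Unix and Windows line endings.
unix_style = b"11\n00:02:32,532 --> 00:02:34,749\nI was with you\nat Hardhome.\n"
windows_style = unix_style.replace(b"\n", b"\r\n")

print(count_line_endings(unix_style))
print(count_line_endings(windows_style))
```

A file that reports only LF is the kind that older Notepad versions and some TVs display as one merged line.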
hlhl
Posts: 7
Joined: 15 May 2016, 14:39

Re: [Windows] Line endings in subtitle files

Post by hlhl »

It does indeed work in VLC. Unfortunately, I am not going to buy a new TV in the near future.

Is it possible to change the encoding? I tried --encoding Windows-1252, but the file is still UTF-8.

Thanks
rednoah
The Source
Posts: 23930
Joined: 16 Nov 2011, 08:59
Location: Taipei

Re: [Windows] Line endings in subtitle files

Post by rednoah »

Character encoding has no effect on Windows vs. Unix line separators. Replacing all \r\n with \n, or vice versa, takes about one line of code, though, and I'm sure there are many utilities that can take care of this.

@see http://stackoverflow.com/questions/1757 ... ne-endings
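
As a rough illustration of both points, here is a sketch in Python. The function name `to_crlf` and the sample text are hypothetical, and this is just one possible way to do the replacement, not what any particular utility does internally.

```python
import re


def to_crlf(data: bytes) -> bytes:
    """Turn every bare LF into CR+LF; existing CR+LF pairs are left alone."""
    return re.sub(rb"(?<!\r)\n", b"\r\n", data)


# Hypothetical sample text. The newline is byte 0x0A in both encodings,
# so switching between UTF-8 and Windows-1252 does not change line endings.
text = "We saw\nwhat's out there.\n"
utf8 = text.encode("utf-8")
cp1252 = text.encode("cp1252")
print(utf8 == cp1252)

converted = to_crlf(utf8)
print(converted)
```

The negative lookbehind keeps the conversion safe to run twice, so a file that already has Windows line endings is not turned into `\r\r\n`.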
hlhl
Posts: 7
Joined: 15 May 2016, 14:39

Re: [Windows] Line endings in subtitle files

Post by hlhl »

Yes, the script now works!! :D

For anyone interested, this is the batch file I use:

Code:

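@echo off
rem Fetch subtitles with FileBot
filebot -script fn:suball /Users/mitch/Downloads --output srt -non-strict --def maxAgeDays=7 --format "Match Video"
rem Convert every .srt file below Downloads to Windows (CR+LF) line endings
for /R C:\Users\mitch\Downloads %%G IN (*.srt) do C:\Subs\unix2dos.exe "%%G"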


Again, thanks for your time, rednoah.
I will donate a few bucks for your support.

Mitch