Quantcast
Channel: filetypes – multifarious
Viewing all articles
Browse latest Browse all 49

Getting a filetype preview…

$
0
0

001One of my favourite features in Studio 2017 is the filetype preview.  The time it can save when you are creating custom filetypes comes from the fun in using it.  I can fill out all the rules and switch between the preview and the rules editor without having to continually close the options, open the file, see if it worked and then close the file and go back to the options again… then repeat from the start… again… and again…   I guess it’s the little things that keep us happy!

I decided to look at this using a YAML file as this seems to be coming up quite a bit recently.  YAML, pronounced “Camel”, stands for “YAML Ain’t Markup Language” and I believe it’s a superset of the JSON format, but with the goal of making it more human readable.  The specification for YAML is here, YAML Specification, and to do a really thorough job I guess I could try and follow the rules set out.  But in practice I’ve found that creating a simple Regular Expression Delimited Text filetype based on the sample files I’ve seen has been the key to handling this format.  Looking ahead I think it would be useful to see a filetype created either as a plugin through the SDL AppStore, or within the core product just to make it easier for users not comfortable with creating their own filetypes.  But I digress…

Example YAML file

I’ve seen a few variants on YAML already but the basic principle for our needs (translatable text extraction) is very similar to JSON in that the text is held in constructs known as “scalars”.

 blog_title: "Bridging the divide, merging segments"
 blog_keywords: merging, paragraph breaks, SDL Trados Studio
 blog_ref: 'For more information <a href=%{info_link}>[Click Here]</a>'

It’s apparently acceptable for the scalers to be contained within single quotes, double quotes or no quotes at all and so far I have examples of each of these.  In fact the sample above uses all three variants.  Reading the specification tells me that it’s also possible to have them in a folded form denoted with >, but I have not come across an example like this yet.  So typically supporting YAML using a custom regex based filetype to suit the examples provided by customers has been trivial.  I can get at the translatable text within the scaler using document structure rules like this (opening pattern followed by closing pattern):

^.*?:\s"
"$

Or this:

^.*?:\s
$

Or this

^.*?:\s'
'$

But then I occasionally came across a file where both single quotes and double quotes were used in the same file… so I added a non-capturing group, “(?:)“, offering the alternatives through the use of the pipe symbol “|“:

^.*?:\s(?:'|")
(?:'|")$

I didn’t come across a file that used no quotes at all, and also combination of quotes… but here’s how I could handle that eventuality:

^.*?:\s(?:'|"|)
(?:'|"|)$

So for the sort of files I’ve come across so far these last pair of opening and closing document structure rules would do the job.  I guess I could have gone straight to this final set, but I thought it might be interesting for anyone playing around with regex for the first time see the iterations.  It also gives you an idea of the sort of  testing you might go through in getting the filetype right… it can be a lot of to-ing and fro-ing.

The other interesting thing about YAML is that it can contain complete html files, or just text marked up with html, or even marked up with script.  The last scalar in my example contains translatable text containing html markup and script.  So I can handle these using Inline tags in Studio and just convert any markup to protected tags.  This is where the purpose of this article really comes in… using the new preview capability.

Preview File

When I go to my File Types options in Studio now I see this at the bottom of the screen:

002

This little addition means I can browse for my test file, in this case multifarious.yml, and with one click see whether the rules I’m creating are extracting the correct text, and also converting inline code/markup to tags.  This replaces this sequence of events:

  1. close the options screen
  2. open the test file for translation
  3. review the content
  4. close the test file
  5. open the options again and apply changes
  6. repeat as needed

In fact the process I outlined there is not even the way many translators/engineers did this in the past.  I have seen people not familiar with the single document process creating a new project each time just to test the filetype settings.  So having this one click preview is a serious timesaver if you are responsible for creating filetypes in your organisation, or even if you do the occasional one and find it requires a lot of to-ing and fro-ing to get it right.  The preview itself is very neat and concise… in my example it looks a little like this:

003

An important point to note is that you can use this feature to check the effects of the settings for any file supported by Studio.  So this is not just a tool for geeky regex loving translators/project managers… it’s really good for preparing files of any kind.

Video

But, I thought the best way to demonstrate this would be by a video as this really shows the benefit, and also how to work through my fabricated YAML filetype in detail.

Video: Length is approx. 8 minutes

For me this is one of the best improvements in this release.  It’s a small thing, but as I do create quite a lot of custom filetypes for various types of files and this is a real timesaver.  In fact it’s also an absolute pleasure to use it!!



Viewing all articles
Browse latest Browse all 49

Trending Articles