The LANG Attribute


In my previous post I talked about Baybayin – the Forgotten Pre-Hispanic Writing of the Filipino. It was added in version 3.2 of the Unicode Standard together with Buhid, Hanunoo, and Tagbanwa under the “Philippine Scripts” group. But how should we properly write or mark up content written in another language and script?

For this post, I will talk about how to correctly declare the language of your content. Doing so makes your pages friendly to translation software, helper applications, and other technologies that rely on this often taken-for-granted HTML attribute. As shown in our image, everyone can see the writing script used, but in the digital world there are people who do not have the fonts you are using. There are also people who do not use the same browser as you and I do: it could be a text-only browser, a speech browser, or a Braille browser.

It is then only appropriate that we properly and correctly tag our content with the language and script we are using. Get ready to use the LANG attribute a lot.

When creating websites, it is important to properly declare the language being used on the webpage. For example, I use the following for all my sites:

<html lang="en-PH">

It is also important to declare the character set, especially when you are going to use any characters beyond the scope of ASCII. I use this for all my sites:

<meta charset="UTF-8" />

Putting it all together, our basic HTML should be:

<!DOCTYPE html>
<html lang="en-PH">
<head>
    <meta charset="UTF-8" />
    <meta name="description" content="My Website" />
    <meta name="keywords" content="Philippines, Baybayin" />
    <title>My Baybayin Website</title>
</head>

Now let’s dig in…

The lang attribute

The HTML lang attribute defines the language of the content enclosed within the element on which it is declared. Its value is a language tag built from codes called subtags, and for my Filipino readers, there are only three subtag types you should worry about: language-Script-REGION. The full format is language-extended_language-Script-REGION-variant-extension-privateuse.

See the table below:

| Code | Language | Subtag Placement |
|---|---|---|
| en | (Generic) English | language |
| en-PH | Philippine English | language+REGION |
| fil-Tglg | Filipino in Baybayin | language+Script |
| bik-cts-Tglg | Bikolano of the Pandan (Northern Catanduanes) dialect in Baybayin script | language+extended_language+Script |
| phi-Tglg-tsg | Tausug Philippine language written in Baybayin script | language+Script+variant |

If you wanted to find the subtags for a particular language, you previously had to check different websites and plenty of official code lists. A time-consuming task (although normally you only have to do this once), right? Well, the latest official subtags can now be found in the IANA Language Subtag Registry. It is now the universal source for all valid subtags.

So, according to the latest list (as of this writing), the subtags that are related to the Philippines are the following (if I missed anything, please leave a comment below):


  • Tagalog
    • Type: language
    • Subtag: tl
    • Description: Tagalog
    • Added: 2005-10-16
    • Suppress-Script: Latn
  • Bikol
    • Type: language
    • Subtag: bik
    • Description: Bikol
    • Added: 2005-10-16
  • Cebuano
    • Type: language
    • Subtag: ceb
    • Description: Cebuano
    • Added: 2005-10-16
  • Filipino/Pilipino
    • Type: language
    • Subtag: fil
    • Description: Filipino
    • Description: Pilipino
    • Added: 2005-10-16
  • Hiligaynon
    • Type: language
    • Subtag: hil
    • Description: Hiligaynon
    • Added: 2005-10-16
  • Iloko
    • Type: language
    • Subtag: ilo
    • Description: Iloko
    • Added: 2005-10-16
  • Pangasinan
    • Type: language
    • Subtag: pag
    • Description: Pangasinan
    • Added: 2005-10-16
  • Pampanga/Kapampangan
    • Type: language
    • Subtag: pam
    • Description: Pampanga
    • Description: Kapampangan
    • Added: 2005-10-16
  • Philippine languages
    • Type: language
    • Subtag: phi
    • Description: Philippine languages
    • Added: 2005-10-16
  • Waray
    • Type: language
    • Subtag: war
    • Description: Waray
    • Added: 2005-10-16


  • Philippines
    • Type: region
    • Subtag: PH
    • Description: Philippines
    • Added: 2005-10-16


  • Buhid
    • Type: script
    • Subtag: Buhd
    • Description: Buhid
    • Added: 2005-10-16
  • Hanunoo (Hanunóo)
    • Type: script
    • Subtag: Hano
    • Description: Hanunoo (Hanunóo)
    • Added: 2005-10-16
  • Tagbanwa
    • Type: script
    • Subtag: Tagb
    • Description: Tagbanwa
    • Added: 2005-10-16
  • Tagalog
    • Type: script
    • Subtag: Tglg
    • Description: Tagalog
    • Description: Baybayin
    • Description: Alibata
    • Added: 2005-10-16

Now that we have the subtags that we need, we can start coding the correct lang value for any Philippine language and script. See these examples:

  • If you grew up and learned English in the Philippines, more likely than not you are using English words that are exclusive to the Philippines, as well as following strict language rules that are taught only in Philippine English. This is the correct lang value for your website:
  • If you are writing in Filipino, not Tagalog, use this:
  • If you are writing in Tagalog, not Filipino, use this:
  • For Bikolano, use this:
  • In Cebuano, use this:
  • Hiligaynon, use this:
  • In Iloko, use this:
  • In Pangasinense:
  • Kapampangan:
  • Waray:
  • For Philippine languages with no corresponding ISO-639-2 code, you have to use the generic subtag:
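
As a quick reference, the items in this list map to the lang values below. This is only a sketch that reuses the language and region subtags from the registry list earlier; the paragraph elements and the "..." placeholders are stand-ins for your own content.

```html
<!-- One element per language. The same values work on <html> for a whole page. -->
<p lang="en-PH">...</p> <!-- Philippine English -->
<p lang="fil">...</p>   <!-- Filipino -->
<p lang="tl">...</p>    <!-- Tagalog -->
<p lang="bik">...</p>   <!-- Bikol -->
<p lang="ceb">...</p>   <!-- Cebuano -->
<p lang="hil">...</p>   <!-- Hiligaynon -->
<p lang="ilo">...</p>   <!-- Iloko -->
<p lang="pag">...</p>   <!-- Pangasinan -->
<p lang="pam">...</p>   <!-- Kapampangan -->
<p lang="war">...</p>   <!-- Waray -->
<p lang="phi">...</p>   <!-- Generic: other Philippine languages -->
```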

Then if you want to write something in Baybayin script, for example, you have to enclose it correctly with the script subtag “Tglg”. Remember, the format is: language-Script, like so:

  • If writing in Filipino and Tagalog using Baybayin script, use:
  • In Bikolano but Baybayin script:
  • Cebuano using Baybayin script:
  • All other Philippine languages without an ISO-639-2 subtag should use:
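
The language-Script combinations in this list can be sketched the same way. The values simply join the language subtags above with the Tglg script subtag from the registry list; the span elements and "..." placeholders are illustrative only.

```html
<!-- language-Script: the script subtag is written with an initial capital. -->
<span lang="fil-Tglg">...</span> <!-- Filipino in Baybayin -->
<span lang="bik-Tglg">...</span> <!-- Bikol in Baybayin -->
<span lang="ceb-Tglg">...</span> <!-- Cebuano in Baybayin -->
<span lang="phi-Tglg">...</span> <!-- Other Philippine languages in Baybayin -->
```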

What about lang="tl-Tglg"? The Tagalog record in the IANA Language Subtag Registry carries Suppress-Script: Latn, as was shown earlier. That field means Tagalog is assumed to be written in the Latin script, so the Latn script subtag should be left out: write lang="tl", not lang="tl-Latn". It does not forbid other scripts, so lang="tl-Tglg" is still a valid tag for Tagalog written in Baybayin. If your content is in Filipino, use lang="fil-Tglg".
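
In markup, the practical effect of Suppress-Script: Latn on Latin-script Tagalog can be sketched like this. The greeting is my own placeholder text, not something from the registry.

```html
<!-- Preferred: Latin script is implied for Tagalog, so no script subtag is needed. -->
<p lang="tl">Magandang umaga.</p>

<!-- Redundant: Suppress-Script says the "Latn" subtag should be left out. -->
<!-- <p lang="tl-Latn">Magandang umaga.</p> -->
```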

ISO-639-3 Languages

There is another subtag that you should learn if you want to target dialects and macrolanguages. You can find the list of codes in the ISO standard ISO-639-3. Let’s use Bikolano as an example. The format to use is: language-extended_language-Script.

  • If you are writing in Central Bikolano, use:
  • If writing in Albay Bikolano / Buhi-Daraga:
  • If in Iriga Bikolano using the Baybayin script:
  • If in Pandan (Northern Catanduanes) using the Baybayin script:

The language-extended_language-Script format is, as of this article, not yet implemented. The basis for the lang attribute is always the IANA Language Subtag Registry, so we can only start using extended_language subtags where needed once the registry has been updated to include them.
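
There is an alternative that does not depend on extended_language subtags. The current IANA registry lists ISO-639-3 codes such as bcl, bto, and cts as language subtags in their own right, each marked with Macrolanguage: bik, so they can be used without the bik prefix. The sketch below assumes those registry entries; check the registry yourself before relying on them.

```html
<!-- Assumption: bcl, bto and cts are registered as primary language subtags. -->
<p lang="bcl">...</p>            <!-- Central Bikol -->
<span lang="bto-Tglg">...</span> <!-- Iriga Bikolano in Baybayin -->
<span lang="cts-Tglg">...</span> <!-- Pandan (Northern Catanduanes) in Baybayin -->
```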

The phi language subtag

Next, if your language has an ISO-639-3 code and falls under the language code “phi” in ISO-639-2, then the phi subtag is the one to use. This subtag code is considered a collective language. Good examples are:

  • Kinaray-a language:
  • Maguindanao language:
  • Maranao language written in Baybayin script:
  • Tausug language written in Baybayin script:

As you have probably noticed, the format I used was language-Script-variant and not language-extended_language-Script. My reasoning is simple: the phi language code is not really a language. It is more accurately a “collective” language entry in ISO-639-2, covering all other Philippine languages not found in that version of the ISO language standard. Compare that to the bik language code, which is clearly marked as a “macrolanguage” in both ISO-639-2 and ISO-639-3.

Additionally, according to the World Wide Web Consortium (W3C), dialects of macrolanguages should be written immediately after the language subtag. In other words, if your ISO-639-2 code is a macrolanguage, use the extended_language subtag position, as in lang="bik-cts-Tglg". If it is not defined as a macrolanguage, use the variant subtag position, as in lang="phi-Tglg-tsg".
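
One caveat: variant subtags have to be registered too, so a validator may not accept an ISO-639-3 code in the variant position. A more conservative sketch, assuming the registry lists krj, mdh, mrw, and tsg as language subtags of their own, is to put the ISO-639-3 code first and add the script after it:

```html
<!-- Assumption: these ISO-639-3 codes are registered as primary language subtags. -->
<p lang="krj">...</p>            <!-- Kinaray-a -->
<p lang="mdh">...</p>            <!-- Maguindanao -->
<span lang="mrw-Tglg">...</span> <!-- Maranao in Baybayin -->
<span lang="tsg-Tglg">...</span> <!-- Tausug in Baybayin -->
```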

Examples, examples, and more examples…

If your website is mainly about Iriga and you write in your own language, then you should adjust your website’s header accordingly:

<!DOCTYPE html>
<html lang="bik-bto">
<head>
    <meta charset="UTF-8" />
    <meta name="description" content="Ang Website Ko Sa Iriga Bikolano" />
    <meta name="keywords" content="Philippines, Baybayin, Iriga, Bikolano" />
    <title>Ang Website Ko Sa Iriga Bikolano</title>
</head>

If you want to write “Happy Father’s Day” in Baybayin, this is how you do it:

<span lang="fil-Tglg">ᜋᜎᜒᜄᜌᜅ᜔ ᜀᜍᜏ᜔ ᜈᜅ᜔ ᜋᜅ ᜀᜋ</span>

Simple? Cool! Just remember to keep language tags as simple and as short as possible. If you do not need to be very specific, like say lang="bik-bcl", then don’t be! Simply use lang="bik". This is especially true for blogs. So if your blog is in the Filipino language (not Tagalog!), then use:

<html lang="fil">

Only be specific when you need to be, or when your site caters mainly to that particular audience and/or region. Additionally, if you are going to use other languages and scripts (which you probably will), always enclose them in a span or div element as I’ve shown in my Baybayin example above.
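
To tie this together, here is a minimal sketch of a full page: the page language is declared once on the html element, and only the Baybayin phrase is wrapped in its own lang. The style rule shows one thing a correct lang value makes possible, which is picking a font by language with the CSS :lang() selector. The font name "Noto Sans Tagalog" is an assumption on my part; substitute whichever Baybayin font you actually serve.

```html
<!DOCTYPE html>
<html lang="en-PH">
<head>
    <meta charset="UTF-8" />
    <title>My Baybayin Website</title>
    <style>
        /* Matches any element whose language is fil-Tglg.
           "Noto Sans Tagalog" is an assumed font; replace it with your own. */
        :lang(fil-Tglg) {
            font-family: "Noto Sans Tagalog", sans-serif;
        }
    </style>
</head>
<body>
    <p>
        Happy Father's Day in Baybayin:
        <span lang="fil-Tglg">ᜋᜎᜒᜄᜌᜅ᜔ ᜀᜍᜏ᜔ ᜈᜅ᜔ ᜋᜅ ᜀᜋ</span>
    </p>
</body>
</html>
```

Everything outside the span inherits en-PH from the html element, so only the exceptions need their own lang attribute.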

Easy? Yes it is. It takes time to get used to it, and yes, it is confusing at first. But you will get the hang of it eventually. Go on and update your websites now and start practising marking your content with the correct language and script.


CC BY-SA 4.0 The LANG Attribute by ᜌᜓᜃᜒ (Yuki|雪亮) is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Permissions beyond the scope of this license may be available at Legal Notice.
