Support for Sanskrit and Pali, especially ṁ #97

Open
sujato opened this issue Aug 6, 2021 · 5 comments
Labels
AL5 would be addressed by extension to Adobe Latin 5 character set

Comments

@sujato

sujato commented Aug 6, 2021

The ISO standard for Sanskrit/Pali requires ṁ, which is still missing from Source Serif. There are very few quality fonts for ancient Indian languages, and among free fonts, none with Source's qualities.

The blog post for Source Serif 4 indicates that AL-5 support is upcoming, which includes ṁ. Yay! So +1 for this! 👍

Meanwhile, thanks to everyone who has made Source Serif happen. I'm using variable fonts with optical sizing on the web. For free! And it just works! Amazing how far we've come. 🙏

@projectshifter

Even though the font has no precomposed glyph for it, Source Serif already supports ṁ. You can use m followed by U+0307 COMBINING DOT ABOVE, which Unicode defines as canonically equivalent to the precomposed character U+1E41.
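
To make the equivalence concrete, here is a minimal Python sketch using only the standard library's `unicodedata` module. It is an illustration, not something from the Source Serif project; the variable names are my own.

```python
import unicodedata

precomposed = "\u1e41"   # ṁ as one code point: LATIN SMALL LETTER M WITH DOT ABOVE
decomposed = "m\u0307"   # m followed by COMBINING DOT ABOVE

# A plain string comparison sees two different code point sequences.
same_code_points = precomposed == decomposed

# Canonical decomposition (NFD) turns the precomposed form into m + U+0307,
# and canonical composition (NFC) goes the other way.
nfd_matches = unicodedata.normalize("NFD", precomposed) == decomposed
nfc_matches = unicodedata.normalize("NFC", decomposed) == precomposed

print(same_code_points, nfd_matches, nfc_matches)  # False True True
```

Software that normalizes text, or a shaping engine that composes and decomposes as needed, can therefore treat the two forms as the same character even when a font only has glyphs for one of them.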

@projectshifter

image

@sujato
Author

sujato commented Sep 22, 2021

Thanks for the help, I didn't realize this.

What I have found is that the precomposed character ṁ "just works" in HTML, so presumably the browser is clever enough to compose the glyph. In LuaLaTeX, however, that doesn't work; you have to type m + U+0307. That's okay as a workaround, but still, it'd be nicer without this gotcha.
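
One way to apply the m + U+0307 workaround without retyping anything is to preprocess the source text before it reaches LuaLaTeX. The following is only a sketch: `decompose_m_dot` and `DECOMPOSE_M_DOT` are hypothetical names, not part of any package, and the function deliberately touches only ṁ and Ṁ so that other precomposed letters such as ā stay as they are.

```python
# Hypothetical helper, not part of Source Serif or any LaTeX package.
# Maps the precomposed letters to base letter + U+0307 COMBINING DOT ABOVE.
DECOMPOSE_M_DOT = str.maketrans({
    "\u1e41": "m\u0307",  # ṁ
    "\u1e40": "M\u0307",  # Ṁ
})


def decompose_m_dot(text: str) -> str:
    """Replace precomposed ṁ/Ṁ with their decomposed forms; leave the rest alone."""
    return text.translate(DECOMPOSE_M_DOT)


sample = "sa\u1e41s\u0101ra"          # "saṁsāra" with precomposed ṁ and ā
converted = decompose_m_dot(sample)   # ṁ becomes m + U+0307, ā is unchanged
print([f"U+{ord(ch):04X}" for ch in converted])
```

A targeted mapping is used here instead of a blanket NFD normalization, on the assumption that the font already has precomposed glyphs for the other letters and only ṁ needs special handling.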

@frankrolf added the "AL5" label (would be addressed by extension to Adobe Latin 5 character set) on Nov 1, 2021
@dpk

dpk commented Feb 1, 2023

> What I have found is that the precomposed character ṁ "just works" in HTML, so presumably the browser is clever enough to compose the glyph. In LuaLaTeX, however, that doesn't work; you have to type m + U+0307. That's okay as a workaround, but still, it'd be nicer without this gotcha.

You should be able to use the newunicodechar package to work around this:

\usepackage{newunicodechar}
\newunicodechar{ṁ}{m\char"0307}

(Incidentally, when do you use ṁ? I’m familiar with IAST which uses ṃ for the anusvara)

@sujato
Author

sujato commented Feb 17, 2023

> You should be able to use the newunicodechar package to work around this:

Thanks for the tip.

> (Incidentally, when do you use ṁ? I’m familiar with IAST which uses ṃ for the anusvara)

I run SuttaCentral, which uses ISO 15919. It's technically superior on several grounds, not least in maintaining consistency between multiple Indic languages.

Incidentally, I was just doing some background research today, and I discovered to my surprise that ṁ overdot was, in fact, the recommended character for anusvāra at the Geneva Orientalist Congress of 1894, and therefore is the official IAST form. No idea how everyone started using ṃ underdot!

https://discourse.suttacentral.net/t/it-seems-anusvara-was-represented-by-not-at-the-geneva-congress-of-1894/28164
