Ideas on language implementation #4

Open
pyx opened this issue Nov 11, 2016 · 2 comments

Comments

@pyx
Contributor

pyx commented Nov 11, 2016

I have an idea about how to implement this markup language, based on my limited understanding of its purpose and structure so far.

How about, instead of building a dedicated parser, we build on one of the existing lightweight markup languages?

If I understand the purpose of this language correctly, I think YAML would be a good base to build on:

  • There are many mature parser implementations, in many languages.
  • It supports multiple documents in a single source file. I believe that will help, since different sheets can be generated from the same file, e.g., for different instruments.
  • There is existing support for compiling to other file formats. For example, with multi-document support, a PDF with the lyrics can be generated as well, without extra coding.
  • The language has constructs (anchors and aliases) for referring to other parts of a document to avoid repetition; these could be used for, e.g., the chorus of a song or repeated bars.
  • Many editors already support it (syntax highlighting, etc.).
  • It is also very easy to go the other way: given an object structured like the documents, one can generate syntactically correct YAML (TMD) source from it.
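
To make a few of these points concrete, here is a minimal sketch using PyYAML (assumed to be installed). The key names (title, instrument, chorus, sections) are invented for illustration and are not part of any TMD spec. It shows one source file holding two documents, an anchor/alias pair reusing the chorus, and a dump back to YAML text.

```python
import yaml  # PyYAML, assumed to be installed (pip install pyyaml)

# Hypothetical TMD source: two documents separated by '---',
# one per instrument. The key names are made up for this sketch.
SOURCE = """\
title: Demo Song
instrument: guitar
chorus: &chorus [C, G, Am, F]
sections:
  - [Am, F, C, G]
  - *chorus
  - *chorus
---
title: Demo Song
instrument: bass
sections:
  - [A, F, C, G]
"""

# safe_load_all yields one plain Python object per document in the stream.
documents = list(yaml.safe_load_all(SOURCE))

for doc in documents:
    print(doc["instrument"], doc["sections"])

# Going the other way: dump a plain object back to YAML text.
print(yaml.safe_dump(documents[1]))
```

Each alias (*chorus) is expanded by the parser, so the code that draws the sheet never has to know the chorus was written only once.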

YAML parsers usually load YAML source into objects of suitable built-in data types in the host language. In Python's case that means dictionaries, possibly with nested dictionaries and lists. We can leverage this by defining TMD's language constructs as specific named keys and focusing on generating music sheets from that information. That way the language is future-proof and easy to extend (a new construct is just another specified key), and we are spared the intense labour of writing the parser ourselves.
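
A rough sketch of the "constructs as named keys" idea, in plain Python on an already-loaded document. The construct names (title, parts) and the text output are assumptions for illustration only; a real TMD spec would define its own keys and would draw notation instead of printing text.

```python
# Hypothetical handlers, one per named key (construct).
def render_title(value):
    return f"# {value}"

def render_parts(value):
    return "\n".join(
        f"part {name}: {' '.join(notes)}" for name, notes in value.items()
    )

# Adding a new construct means adding one more entry here.
HANDLERS = {"title": render_title, "parts": render_parts}
REQUIRED = {"title", "parts"}

def render(document):
    """Render a loaded TMD document (a plain dict) as text."""
    missing = REQUIRED - set(document)
    if missing:
        raise ValueError(f"missing required keys: {sorted(missing)}")
    unknown = set(document) - set(HANDLERS)
    if unknown:
        raise ValueError(f"unknown keys: {sorted(unknown)}")
    return "\n".join(HANDLERS[key](document[key]) for key in document)

doc = {"title": "Demo Song", "parts": {"guitar": ["C", "G"], "bass": ["C", "C"]}}
print(render(doc))
```

The parser never appears here: whatever produced the dict (YAML, JSON, or a test fixture) is irrelevant to the rendering step.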

In Python, the popular choices are PyYAML and ruamel.yaml. I personally prefer the latter; it is a fork of the former with an upgraded language spec (YAML 1.2 instead of YAML 1.1), but to each their own.

The drawback I can think of is that YAML is far more featureful than TMD needs. In particular, constructing custom objects can allow arbitrary code execution, so that feature should be turned off, e.g., by using .safe_load or something similar in those parsers.
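
As a small illustration of that point, the sketch below (PyYAML assumed installed; load_untrusted is a hypothetical helper name) loads plain data normally and refuses a document that uses a Python-specific tag to request a function call.

```python
import yaml  # PyYAML, assumed to be installed

def load_untrusted(text):
    """Parse YAML text with the safe loader (hypothetical helper)."""
    try:
        return yaml.safe_load(text)
    except yaml.YAMLError as exc:
        raise ValueError(f"rejected TMD source: {exc}") from exc

# Plain mappings, strings, and numbers load as usual.
print(load_untrusted("title: Demo Song\ntempo: 120\n"))

# This document asks the loader to call a Python function.
# The safe loader has no constructor for the tag, so it raises instead.
HOSTILE = "!!python/object/apply:os.getcwd []\n"
try:
    load_untrusted(HOSTILE)
except ValueError:
    print("hostile document rejected")
```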

pyx changed the title from "Ideas in language implementation" to "Ideas on language implementation" Nov 11, 2016
@aguai
Owner

aguai commented Nov 11, 2016

I am not sure if it gets easier...
In the beginning, the idea was to modify (or let's say "extend") Markdown,
but

Part:Executer@[Moment]{
<timebase*>
''' things to do '''
}

with

 ->part1->part2->#

seem to be the only two things that are new compared with other markup languages (rst, md, etc.),
and that is in fact quite close to the JSON format.
YAML looks good, as you say. Maybe we should make *.tmd an intermediate language like *.yaml and then compile it to whatever other format I need (pdf, mid, cue, html, etc.)?
But I guess that won't be any easier... (maybe a little?)
Anyway, let me get the basic notation features done first!!
The bands are waiting...
(Actually, there is a bet on whether I can get it done before the Xmas show rehearsal...)

@pyx
Contributor Author

pyx commented Nov 14, 2016

reStructuredText is not easy to extend: you have to work on the AST (or in this case, DOM might be a more appropriate term) and define custom directives. Markdown is too bare-bones.
And neither works well for dumping structured data as a simple object, which is where JSON excels. But JSON is friendly only for reading, and inconvenient to write by hand (every key and string must be quoted, data types are limited, comments are not allowed, the structure is rigid, etc.).
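
For what it's worth, two of those JSON restrictions are easy to see with Python's standard json module. The sample strings below are made up for illustration.

```python
import json

# A trailing comment and an unquoted key: both natural when writing by hand.
with_comment = '{"title": "Demo Song"}  // arranged for guitar'
unquoted_key = '{title: "Demo Song"}'

for text in (with_comment, unquoted_key):
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        print("rejected:", exc.msg)
```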

That's why I suggested YAML (but that is just my thought; it is by no means necessarily the best choice).

By dumping the structured documents into a simple object, we can avoid the parsing part altogether, or at least the whole document-structure part. Imagine something like (pseudocode):

import yaml  # PyYAML, or any parser with a compatible safe_load

# create_pdf, draw_*, and save_as_pdf are placeholders for rendering
# code that does not exist yet.


def load_tmd(filename):
    # safe_load expects YAML text or an open stream, not a file name
    with open(filename, encoding='utf-8') as f:
        dom = yaml.safe_load(f)
    # do some validation, checking presence of required fields such as magic number,
    # TMD version, must-have fields, etc. if okay, then...
    return dom


def generate_music_sheet(tmd):
    canvas = create_pdf()
    for part in tmd['parts']:
        draw_staff(canvas)
        draw_notes(canvas, part)
    return canvas


def generate_lyrics(tmd):
    canvas = create_pdf()
    for verse in tmd['lyrics']:
        draw_section_background(canvas)
        draw_paragraph(canvas, verse)
    return canvas


def process(*filenames):
    for filename in filenames:
        dom = load_tmd(filename)
        c = generate_music_sheet(dom)
        save_as_pdf(filename + '.sheet.pdf', c)
        c = generate_lyrics(dom)  # the same loaded document feeds both outputs
        save_as_pdf(filename + '.lyrics.pdf', c)
        ...

I believe this is actually easier than writing the lexer/parser/compiler ourselves.

Anyway, good luck with your bet.


2 participants