What makes Java easier to parse than C?
I'm acquainted with the fact that the grammars of C and C++ are context-sensitive, and in particular that you need a "lexer hack" in C. On the other hand, I'm under the impression that you can parse Java with only 2 tokens of look-ahead, despite the considerable similarity between the two languages.
What would you have to change about C to make it more tractable to parse?
I ask because all of the examples I've seen of C's context-sensitivity are technically allowable but awfully weird. For example,
foo (a);
could be calling the void function foo with argument a. Or it could be declaring a to be an object of type foo, but you could just as easily get rid of the parentheses. In part, this weirdness occurs because the "direct declarator" production rule of the C grammar fulfills the dual purpose of declaring both functions and variables.
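As a minimal sketch of the two readings, the same statement compiles both ways in C, depending only on what foo names in the enclosing scope. The function names as_call and as_decl, and the counter total, are made up for illustration.

```c
static int total = 0;

/* At file scope, foo is an ordinary function. */
static void foo(int a) { total += a; }

/* foo names a function here, so `foo (a);` is a function call. */
int as_call(void) {
    int a = 3;
    foo (a);
    return total;
}

/* A block-scope typedef hides the function, so the identical
   statement now declares a variable `a` of type foo (that is, int). */
int as_decl(void) {
    typedef int foo;
    foo (a);
    a = 7;
    return a;
}
```

A parser that sees only the tokens of `foo (a);` cannot tell these two cases apart without knowing which declarations are in scope.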
On the other hand, the Java grammar has separate production rules for variable declaration and function declaration. If you write
foo a;
then you know it's a variable declaration, and foo can unambiguously be parsed as a typename. This might not be valid code if the class foo hasn't been defined somewhere in the current scope, but that's a job for semantic analysis, which can be performed in a later compiler pass.
I've seen it said that C is hard to parse because of typedef, but you can declare your own types in Java too. Which C grammar rules, besides direct_declarator, are at fault?
Parsing C++ is getting hard. Parsing Java is getting hard too.
See this SO answer discussing why C (and C++) is "hard" to parse. The short summary is that the C and C++ grammars are inherently ambiguous; they will give you multiple parses, and you must use context to resolve the ambiguities. People then make the mistake of assuming you have to resolve ambiguities as you parse; not so, see below. If you insist on resolving ambiguities as you parse, your parser gets more complicated and that much harder to build; but that complexity is a self-inflicted wound.
IIRC, Java 1.4's "obvious" LALR(1) grammar was not ambiguous, so it was "easy" to parse. I'm not so sure that modern Java hasn't got at least long-distance local ambiguities; there's always the problem of deciding whether "...>>" closes off two templates or is a "right shift operator". I suspect modern Java does not parse with LALR(1) anymore.
But one can get past the parsing problem by using strong parsers (or weak parsers and context-collection hacks, as C and C++ front ends do now), for both languages. C and C++ have the additional complication of having a preprocessor; these are more complicated in practice than they look. One claim is that C and C++ parsers are so hard that they have to be written by hand. It isn't true; you can build Java and C++ parsers just fine with GLR parser generators.
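To make the "context-collection hack" concrete, here is a rough sketch of the classic lexer hack, assuming a single flat table of typedef names. All names here (register_typedef, classify_identifier, the token kinds) are hypothetical and not taken from any real front end, and block scoping is ignored to keep it short.

```c
#include <string.h>

enum token_kind { TOK_IDENTIFIER, TOK_TYPE_NAME };

#define MAX_TYPEDEFS 64

/* Hypothetical context table: names the parser has seen declared via
   typedef. Only pointers are stored, so the strings must outlive it. */
static const char *typedef_names[MAX_TYPEDEFS];
static int typedef_count = 0;

/* Called by the parser when it finishes a typedef declaration.
   Returns 0 on success, -1 if the table is full. */
int register_typedef(const char *name) {
    if (typedef_count == MAX_TYPEDEFS) return -1;
    typedef_names[typedef_count++] = name;
    return 0;
}

/* Called by the lexer for every identifier. The token kind depends on
   the declarations seen so far, which is the feedback from parser to
   lexer that makes this a "hack". */
enum token_kind classify_identifier(const char *name) {
    for (int i = 0; i < typedef_count; i++) {
        if (strcmp(typedef_names[i], name) == 0) return TOK_TYPE_NAME;
    }
    return TOK_IDENTIFIER;
}
```

With this in place, the grammar can use distinct tokens for type names and ordinary identifiers, so a "weak" parser never sees the ambiguity. The alternative described above is to let a GLR parser keep both parses and discard the wrong one later.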
But parsing isn't really where the problem is.
Once you parse, you will want to do something with the AST/parse tree. In practice, you need to know, for every identifier, what its definition is and where it is used ("name and type resolution", or, sloppily, building symbol tables). This turns out to be a lot more work than getting the parser right, compounded by inheritance, interfaces, overloading and templates, and confounded by the fact that the semantics for all this is written in informal natural language spread across tens to hundreds of pages of the language standard. C++ is really bad here. Java 7 and 8 are getting to be pretty awful from this point of view. (And symbol tables aren't all you need; see my bio for a longer essay on "Life After Parsing".)
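For a sense of what the simplest form of name resolution looks like, here is a minimal sketch of a scoped symbol table, assuming types are stored as plain strings. The struct and function names are invented for illustration. It deliberately ignores inheritance, overloading and templates, which are exactly the parts that make the real thing so much work.

```c
#include <stddef.h>
#include <string.h>

#define MAX_SYMBOLS 32

struct symbol {
    const char *name;
    const char *type;   /* assumption: a string stands in for a real type */
};

struct scope {
    struct scope *parent;               /* enclosing scope, NULL at file scope */
    struct symbol symbols[MAX_SYMBOLS];
    int count;
};

void scope_init(struct scope *s, struct scope *parent) {
    s->parent = parent;
    s->count = 0;
}

/* Record a declaration in this scope.
   Returns 0 on success, -1 if the scope is full. */
int scope_declare(struct scope *s, const char *name, const char *type) {
    if (s->count == MAX_SYMBOLS) return -1;
    s->symbols[s->count].name = name;
    s->symbols[s->count].type = type;
    s->count++;
    return 0;
}

/* Resolve a use of a name: search the innermost scope first (newest
   declaration first), then walk outward. Returns NULL if undeclared. */
const struct symbol *scope_lookup(const struct scope *s, const char *name) {
    for (; s != NULL; s = s->parent) {
        for (int i = s->count - 1; i >= 0; i--) {
            if (strcmp(s->symbols[i].name, name) == 0) return &s->symbols[i];
        }
    }
    return NULL;
}
```

A tool would call scope_declare while walking declarations in the tree and scope_lookup at each use of an identifier. An inner declaration of foo then hides an outer one, as C's scoping rules require.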
Most folks struggle with the pure parsing part (often never finishing; check the many, many questions about how to build working parsers for real languages), so they don't ever see life after parsing. And then we get folk theorems about what is hard to parse, and no signal about what happens after that stage.
Fixing C++ syntax won't get you anywhere.
Regarding changing the C++ syntax: you'll find you need to patch a lot of places to take care of the variety of local and real ambiguities in any C++ grammar. If you insist, the following list might be a good starting place. I contend there is no point in doing this if you are not the C++ standards committee; if you did so, and built a compiler using that, nobody sane would use it. There's too much invested in existing C++ applications to switch for the convenience of the guys building parsers; besides, their pain is over and existing parsers work fine.
You may want to write your own parser. OK, that's fine; just don't expect the rest of the community to let you change the language they must use to make it easier for you. They all want it easier for them, and that means using the language as documented and implemented.