author | Leo Tenenbaum <pommicket@gmail.com> | 2020-01-02 18:35:17 -0500 |
---|---|---|
committer | Leo Tenenbaum <pommicket@gmail.com> | 2020-01-02 18:35:17 -0500 |
commit | 43408ef4df909452c7a10992daff911bab97040d (patch) | |
tree | b61784acef5b0700f9d6917f472ac8fdde32e490 | |
parent | df5ba27d799df6700cd9a8c9d94f88b7f2623fa3 (diff) |
fixed goto problem when compiling as c++
-rw-r--r-- | README.html | 35 | ||||
-rw-r--r-- | README.md | 29 | ||||
-rw-r--r-- | binfile.c | 51 | ||||
-rwxr-xr-x | docs.sh | 3 | ||||
-rw-r--r-- | eval.c | 1 | ||||
-rw-r--r-- | parse.c | 112 | ||||
-rw-r--r-- | test.toc | 2 | ||||
-rw-r--r-- | toc.c | 1 | ||||
-rw-r--r-- | types.c | 27 | ||||
-rw-r--r-- | types.h | 7 |
10 files changed, 121 insertions, 147 deletions
diff --git a/README.html b/README.html
index 2f08d15..b64d533 100644
--- a/README.html
+++ b/README.html
@@ -30,16 +30,15 @@ it is nearly as fast in theory.</p>
 <p>To compile the compiler on a Unix-y system, just run <code>./build.sh release</code>. You can supply a compiler by running <code>CC=tcc ./build.sh release</code>, or build it in debug mode without the <code>release</code>.</p>
-<p>On other systems, you can just compile main.c with a C compiler. <code>toc</code> uses several C99 and a couple of C11 features, so it might not work on all compilers. But it does compile on quite a few, including <code>clang</code>, <code>gcc</code>, and <code>tcc</code>. It can also be compiled as if it were C++, but it does break the standard in a few places*. So, MSVC can also compile it. The <em>outputted</em> code should be C99-compliant.</p>
+<p>On other systems, you can just compile main.c with a C compiler. <code>toc</code> uses several C99 and a couple of C11 features, so it might not work on all compilers. But it does compile on quite a few, including <code>clang</code>, <code>gcc</code>, and <code>tcc</code>. It can also be compiled as if it were C++, so, MSVC and <code>g++</code> can also compile it (it does rely on implicit casting of <code>void *</code> though). The <em>outputted</em> code should be C99-compliant.</p>
 <h4>Why it compiles to C</h4>
-<p><code>toc</code> compiles to C for three reasons:</p>
+<p><code>toc</code> compiles to C. Here are some reasons why:</p>
 <ul>
 <li>Speed. C is one of the most performant programming languages out there. It also has compilers which are very good at optimizing (better than anything I could write).</li>
 <li>Portability. C is probably the most portable language. It has existed for >30 years and can run on practically anything. Furthermore, all major languages nowadays can call functions written in C.</li>
-<li>Laziness. I don’t really want to deal with writing something which outputs machine code, and it would certainly be more buggy than something which outputs C.</li>
 </ul>
@@ -69,8 +68,6 @@ it is nearly as fast in theory.</p>
 </ul>
-<p>The last three of those could all be removed fairly easily (assuming the system actually has 8-, 16-, 32-, and 64-bit signed and unsigned types).</p>
-
 <p>And here are all of its C11 features:</p>
 <ul>
@@ -104,31 +101,3 @@ it is nearly as fast in theory.</p>
 <p>If you find a bug, you can report it through <a href="https://github.com/pommicket/toc/issues">GitHub’s issue tracker</a>, or by emailing pommicket@gmail.com.</p>
 <p>Just send me the <code>toc</code> source code which results in the bug, and I’ll try to fix it.</p>
-
-<hr />
-
-<p>* for those curious, it has to do with <code>goto</code>. In C, this program:</p>
-
-<pre><code>
-int main() {
-	goto label;
-	int x = 5;
-	label:
-	return 0;
-}
-</code></pre>
-
-
-<p>Is completely fine. <code>x</code> will hold an unspecified value after the jump (but it isn’t used so it doesn’t really matter). Apparently, in C++, this is an ill-formed program. This is a bit ridiculous since</p>
-
-<pre><code>
-int main() {
-	goto label;
-	int x; x = 5;
-	label:
-	return 0;
-}
-</code></pre>
-
-
-<p>is fine. So that’s an interesting little “fun fact”: <code>int x = 5;</code> isn’t always the same as <code>int x; x = 5;</code> in C++.</p>
@@ -30,15 +30,14 @@ See `docs` for more information (in progress).
 To compile the compiler on a Unix-y system, just run `./build.sh release`. You can supply a compiler by running `CC=tcc ./build.sh release`, or build it in debug mode without the `release`.
-On other systems, you can just compile main.c with a C compiler. `toc` uses several C99 and a couple of C11 features, so it might not work on all compilers. But it does compile on quite a few, including `clang`, `gcc`, and `tcc`. It can also be compiled as if it were C++, but it does break the standard in a few places\*. So, MSVC can also compile it. The *outputted* code should be C99-compliant.
+On other systems, you can just compile main.c with a C compiler. `toc` uses several C99 and a couple of C11 features, so it might not work on all compilers. But it does compile on quite a few, including `clang`, `gcc`, and `tcc`. It can also be compiled as if it were C++, so, MSVC and `g++` can also compile it (it does rely on implicit casting of `void *` though). The *outputted* code should be C99-compliant.
 #### Why it compiles to C
-`toc` compiles to C for three reasons:
+`toc` compiles to C. Here are some reasons why:
 - Speed. C is one of the most performant programming languages out there. It also has compilers which are very good at optimizing (better than anything I could write).
 - Portability. C is probably the most portable language. It has existed for >30 years and can run on practically anything. Furthermore, all major languages nowadays can call functions written in C.
-- Laziness. I don't really want to deal with writing something which outputs machine code, and it would certainly be more buggy than something which outputs C.
 ---
@@ -60,8 +59,6 @@ Here are all the C99 features which `toc` depends on (I might have forgotten som
 - Non-constant struct literal initializers (e.g. `int x[2] = {y, z};`)
 - Flexible array members
-The last three of those could all be removed fairly easily (assuming the system actually has 8-, 16-, 32-, and 64-bit signed and unsigned types).
-
 And here are all of its C11 features:
 - Anonymous structures/unions
@@ -91,25 +88,3 @@ Here are the major versions of `toc`.
 If you find a bug, you can report it through [GitHub's issue tracker](https://github.com/pommicket/toc/issues), or by emailing pommicket@gmail.com.
 Just send me the `toc` source code which results in the bug, and I'll try to fix it.
-
----
-
-\* for those curious, it has to do with `goto`. In C, this program:
-<pre><code>
-int main() {
-	goto label;
-	int x = 5;
-	label:
-	return 0;
-}
-</code></pre>
-Is completely fine. `x` will hold an unspecified value after the jump (but it isn't used so it doesn't really matter). Apparently, in C++, this is an ill-formed program. This is a bit ridiculous since
-<pre><code>
-int main() {
-	goto label;
-	int x; x = 5;
-	label:
-	return 0;
-}
-</code></pre>
-is fine. So that's an interesting little "fun fact": `int x = 5;` isn't always the same as `int x; x = 5;` in C++.
@@ -1,9 +1,25 @@
-/* #define BINFILE_PORTABLE 1 */
+#define BINFILE_PORTABLE 1
+
+#ifdef TOC_DEBUG
+#define BINFILE_PRINT
+#endif
 static inline void write_u8(FILE *fp, U8 u8) {
 	putc(u8, fp);
+#ifdef BINFILE_PRINT
+	static int col = 0;
+	printf("%02x ", u8);
+	++col;
+	if (col == 8) printf(" ");
+	if (col == 16) {
+		col = 0;
+		printf("\n");
+	}
+#endif
 }
+#undef BINFILE_PRINT /* don't need it anymore */
+
 static inline void write_i8(FILE *fp, I8 i8) {
 	write_u8(fp, (U8)i8);
 }
@@ -14,8 +30,8 @@ static inline void write_i8(FILE *fp, I8 i8) {
 */
 static inline void write_u16(FILE *fp, U16 u16) {
-	putc(u16 & 0xFF, fp);
-	putc(u16 >> 8, fp);
+	write_u8(fp, (U8)(u16 & 0xFF));
+	write_u8(fp, (U8)(u16 >> 8));
 }
 static inline void write_i16(FILE *fp, I16 i16) {
@@ -76,14 +92,14 @@ static void write_f32(FILE *fp, F32 f32) {
 		f32 *= (F32)2;
 		fraction_bit >>= 1;
 	}
-	putc(fraction & 0xFF, fp);
-	putc((fraction & 0xFF00) >> 8, fp);
+	write_u8(fp, fraction & 0xFF);
+	write_u8(fp, (fraction & 0xFF00) >> 8);
 	unsigned byte3 = (fraction & 0x7F0000) >> 16;
 	byte3 |= (exponent & 1) << 7;
-	putc((int)byte3, fp);
+	write_u8(fp, (U8)byte3);
 	unsigned byte4 = exponent >> 1;
 	byte4 |= (sign << 7);
-	putc((int)byte4, fp);
+	write_u8(fp, (U8)byte4);
 #else
 	fwrite(&f32, sizeof f32, 1, fp);
 #endif
@@ -115,28 +131,27 @@ static void write_f64(FILE *fp, F64 f64) {
 		f64 *= (F64)2;
 		fraction_bit >>= 1;
 	}
-	printf("%lu\n",fraction);
-	putc(fraction & 0xFF, fp);
-	putc((fraction & 0xFF00) >> 8, fp);
-	putc((fraction & 0xFF0000) >> 16, fp);
-	putc((fraction & 0xFF000000) >> 24, fp);
-	putc((fraction & 0xFF00000000) >> 32, fp);
-	putc((fraction & 0xFF0000000000) >> 40, fp);
+	write_u8(fp, fraction & 0xFF);
+	write_u8(fp, (fraction & 0xFF00) >> 8);
+	write_u8(fp, (fraction & 0xFF0000) >> 16);
+	write_u8(fp, (fraction & 0xFF000000) >> 24);
+	write_u8(fp, (fraction & 0xFF00000000) >> 32);
+	write_u8(fp, (fraction & 0xFF0000000000) >> 40);
 	unsigned byte7 = (fraction & 0xF000000000000) >> 48;
 	byte7 |= (exponent & 0xF) << 4;
-	putc((int)byte7, fp);
+	write_u8(fp, (U8)byte7);
 	unsigned byte8 = (exponent & 0x7F0) >> 4;
 	byte8 |= (sign << 7);
-	putc((int)byte8, fp);
+	write_u8(fp, (U8)byte8);
 #else
 	fwrite(&f64, sizeof f64, 1, fp);
 #endif
 }
 static void write_bool(FILE *fp, bool b) {
-	putc(b, fp);
+	write_u8(fp, b);
 }
 static void write_char(FILE *fp, char c) {
-	putc(c, fp);
+	write_u8(fp, (U8)c);
 }
@@ -1,6 +1,7 @@
 #!/bin/sh
 markdown README.md > README.html
+echo README.md
 for x in docs/*.md; do
-	echo $x
 	markdown $x > $(dirname $x)/$(basename $x .md).html
+	echo $x
 done
@@ -23,7 +23,6 @@ static void evalr_create(Evaluator *ev, Typer *tr, Allocator *allocr) {
 static void evalr_free(Evaluator *ev) {
 	typedef void *VoidPtr;
 	arr_foreach(ev->to_free, VoidPtr, f) {
-		printf("Freeing %p\n",*f);
 		free(*f);
 	}
 	arr_clear(&ev->to_free);
@@ -1789,69 +1789,73 @@ static bool parse_decl(Parser *p, Declaration *d, DeclEndKind ends_with, U16 fla
 		goto ret_false;
 	}
-	bool annotates_type = !token_is_kw(t->token, KW_EQ) && !token_is_kw(t->token, KW_COMMA);
-	if (annotates_type) {
-		d->flags |= DECL_ANNOTATES_TYPE;
-		Type type;
-		if (!parse_type(p, &type)) {
-			goto ret_false;
-		}
-		d->type = type;
-		if (type.kind == TYPE_TUPLE && arr_len(d->type.tuple) != arr_len(d->idents)) {
-			err_print(d->where, "Expected to have %lu things declared in declaration, but got %lu.", (unsigned long)arr_len(d->type.tuple), (unsigned long)arr_len(d->idents));
-			goto ret_false;
-		}
-	}
-	const char *end_str = NULL;
-	switch (ends_with) {
-	case DECL_END_SEMICOLON: end_str = "';'"; break;
-	case DECL_END_RPAREN_COMMA: end_str = "')' or ','"; break;
-	case DECL_END_LBRACE_COMMA: end_str = "'{' or ','"; break;
-	}
-
-	if (token_is_kw(t->token, KW_EQ)) {
-		++t->token;
-		if ((flags & PARSE_DECL_ALLOW_INFER) && ends_decl(t->token, ends_with)) {
-			/* inferred expression */
-			d->flags |= DECL_INFER;
-			if (arr_len(d->idents) > 1) {
-				err_print(d->where, "Inferred declarations can only have one identifier. Please separate this declaration.");
+	{
+		bool annotates_type = !token_is_kw(t->token, KW_EQ) && !token_is_kw(t->token, KW_COMMA);
+		if (annotates_type) {
+			d->flags |= DECL_ANNOTATES_TYPE;
+			Type type;
+			if (!parse_type(p, &type)) {
 				goto ret_false;
 			}
-			if (!(d->flags & DECL_IS_CONST)) {
-				tokr_err(t, "Inferred parameters must be constant.");
+			d->type = type;
+			if (type.kind == TYPE_TUPLE && arr_len(d->type.tuple) != arr_len(d->idents)) {
+				err_print(d->where, "Expected to have %lu things declared in declaration, but got %lu.", (unsigned long)arr_len(d->type.tuple), (unsigned long)arr_len(d->idents));
 				goto ret_false;
 			}
+		}
+	}
+	{
+		const char *end_str = NULL;
+		switch (ends_with) {
+		case DECL_END_SEMICOLON: end_str = "';'"; break;
+		case DECL_END_RPAREN_COMMA: end_str = "')' or ','"; break;
+		case DECL_END_LBRACE_COMMA: end_str = "'{' or ','"; break;
+		}
+
+		if (token_is_kw(t->token, KW_EQ)) {
 			++t->token;
-		} else {
-			d->flags |= DECL_HAS_EXPR;
-			uint16_t expr_flags = 0;
-			if (ends_with == DECL_END_RPAREN_COMMA)
-				expr_flags |= EXPR_CAN_END_WITH_COMMA;
-			if (ends_with == DECL_END_LBRACE_COMMA)
-				expr_flags |= EXPR_CAN_END_WITH_LBRACE;
-			Token *end = expr_find_end(p, expr_flags);
-			if (!end || !ends_decl(end, ends_with)) {
-				t->token = end;
-				tokr_err(t, "Expected %s at end of declaration.", end_str);
-				goto ret_false;
-			}
-			if (!parse_expr(p, &d->expr, end)) {
-				t->token = end; /* move to ; */
-				goto ret_false;
-			}
-			if (ends_decl(t->token, ends_with)) {
+			if ((flags & PARSE_DECL_ALLOW_INFER) && ends_decl(t->token, ends_with)) {
+				/* inferred expression */
+				d->flags |= DECL_INFER;
+				if (arr_len(d->idents) > 1) {
+					err_print(d->where, "Inferred declarations can only have one identifier. Please separate this declaration.");
+					goto ret_false;
+				}
+				if (!(d->flags & DECL_IS_CONST)) {
+					tokr_err(t, "Inferred parameters must be constant.");
+					goto ret_false;
+				}
 				++t->token;
 			} else {
-				tokr_err(t, "Expected %s at end of declaration.", end_str);
-				goto ret_false;
+				d->flags |= DECL_HAS_EXPR;
+				uint16_t expr_flags = 0;
+				if (ends_with == DECL_END_RPAREN_COMMA)
+					expr_flags |= EXPR_CAN_END_WITH_COMMA;
+				if (ends_with == DECL_END_LBRACE_COMMA)
+					expr_flags |= EXPR_CAN_END_WITH_LBRACE;
+				Token *end = expr_find_end(p, expr_flags);
+				if (!end || !ends_decl(end, ends_with)) {
+					t->token = end;
+					tokr_err(t, "Expected %s at end of declaration.", end_str);
+					goto ret_false;
+				}
+				if (!parse_expr(p, &d->expr, end)) {
+					t->token = end; /* move to ; */
+					goto ret_false;
+				}
+				if (ends_decl(t->token, ends_with)) {
+					++t->token;
+				} else {
+					tokr_err(t, "Expected %s at end of declaration.", end_str);
+					goto ret_false;
+				}
 			}
+		} else if (ends_decl(t->token, ends_with)) {
+			++t->token;
+		} else {
+			tokr_err(t, "Expected %s or '=' at end of delaration.", end_str);
+			goto ret_false;
 		}
-	} else if (ends_decl(t->token, ends_with)) {
-		++t->token;
-	} else {
-		tokr_err(t, "Expected %s or '=' at end of delaration.", end_str);
-		goto ret_false;
 	}
 	if ((d->flags & DECL_IS_CONST) && !(d->flags & DECL_HAS_EXPR) && !(flags & PARSE_DECL_ALLOW_CONST_WITH_NO_EXPR)) {
@@ -1,4 +1,4 @@
 #export foo :: f64 = 0.07321;
 // asdf, dsajkhf, sadjkfh ::= 5;
 // #export asdf ::= "asdfasdfasdf";
-#export zla := foo;
\ No newline at end of file
+#export zla, asdf := foo, 9;
\ No newline at end of file
@@ -16,7 +16,6 @@
 #include <string.h>
 #include <limits.h>
 #include <float.h>
-#include <stdbool.h>
 #include <inttypes.h>
 #include "types.h"
@@ -664,6 +664,9 @@ static bool types_fn(Typer *tr, FnExpr *f, Type *t, Location where,
 	FnExpr *prev_fn = tr->fn;
 	bool success = true;
 	bool entered_fn = false;
+	Expression *ret_expr;
+	Type *ret_type;
+	bool has_named_ret_vals;
 	assert(t->kind == TYPE_FN);
 	if (instance) {
 		f = &instance->fn;
@@ -682,9 +685,9 @@ static bool types_fn(Typer *tr, FnExpr *f, Type *t, Location where,
 		success = false;
 		goto ret;
 	}
-	Expression *ret_expr = f->body.ret_expr;
-	Type *ret_type = t->fn.types;
-	bool has_named_ret_vals = f->ret_decls != NULL;
+	ret_expr = f->body.ret_expr;
+	ret_type = t->fn.types;
+	has_named_ret_vals = f->ret_decls != NULL;
 	if (ret_expr) {
 		if (!type_eq(ret_type, &ret_expr->type)) {
 			char *got = type_to_str(&ret_expr->type);
@@ -1724,7 +1727,7 @@ static bool types_expr(Typer *tr, Expression *e) {
 			lhs_type = lhs_type->ptr;
 			if (lhs_type->kind != TYPE_STRUCT) break;
 			/* fallthrough */
-		case TYPE_STRUCT:
+		case TYPE_STRUCT: {
 			/* allow accessing struct members with a string */
 			if (rhs_type->kind != TYPE_SLICE || !type_is_builtin(rhs_type->slice, BUILTIN_CHAR)) {
@@ -1754,7 +1757,7 @@ static bool types_expr(Typer *tr, Expression *e) {
 				free(fstr); free(typestr);
 				return false;
 			}
-			break;
+		} break;
 		default: {
 			char *s = type_to_str(lhs_type);
 			err_print(e->where, "Trying to take index of non-array type %s.", s);
@@ -2015,12 +2018,14 @@ static bool types_decl(Typer *tr, Declaration *d) {
 		}
 	}
-	size_t n_idents = arr_len(d->idents);
-	if (d->type.kind == TYPE_TUPLE) {
-		if (n_idents != arr_len(d->type.tuple)) {
-			err_print(d->where, "Expected to have %lu things declared in declaration, but got %lu.", (unsigned long)arr_len(d->type.tuple), (unsigned long)n_idents);
-			success = false;
-			goto ret;
+	{
+		size_t n_idents = arr_len(d->idents);
+		if (d->type.kind == TYPE_TUPLE) {
+			if (n_idents != arr_len(d->type.tuple)) {
+				err_print(d->where, "Expected to have %lu things declared in declaration, but got %lu.", (unsigned long)arr_len(d->type.tuple), (unsigned long)n_idents);
+				success = false;
+				goto ret;
+			}
 		}
 	}
 ret:
@@ -31,6 +31,13 @@ typedef uint32_t U32;
 typedef uint64_t U64;
 #define U64_MAX UINT64_MAX
+#if __STDC_VERSION__ >= 199901
+#include <stdbool.h>
+#else
+typedef U8 bool;
+#endif
+
+
 typedef int8_t I8;
 #define I8_MAX INT8_MAX
 typedef int16_t I16;
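The parse.c and types.c hunks in this commit appear to use two patterns to keep a `goto` from jumping past an initialized declaration, which C allows but C++ rejects: wrapping the declaration in its own block, and declaring the variable early and assigning it later. Below is a minimal sketch of both patterns. It is not code from `toc`; the function names `double_if_positive` and `triple_if_positive` are hypothetical and exist only for illustration.

```c
/* Pattern 1 (as in the parse_decl and types_decl hunks): put the
   initialized declaration in its own block, so that its scope has
   already ended by the time the label is reached. */
static int double_if_positive(int value) {
	int result = -1; /* declared before any goto, so no jump crosses it */
	if (value <= 0)
		goto ret; /* skips the whole block below */
	{
		int doubled = value * 2; /* scope ends at the closing brace */
		result = doubled;
	}
ret:
	return result;
}

/* Pattern 2 (as in the types_fn hunks): declare the variable at the
   top of the function and assign to it after the goto. */
static int triple_if_positive(int value) {
	int result = -1;
	int tripled; /* declaration comes before the goto */
	if (value <= 0)
		goto ret;
	tripled = value * 3; /* plain assignment, not an initialization */
	result = tripled;
ret:
	return result;
}
```

Both functions should compile unchanged with a C compiler and with a C++ compiler such as `g++`, because in neither case does the jump to `ret` bypass an initialized variable that is still in scope at the label.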